From: Luwei Kang <luwei.kang@intel.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
jolsa@redhat.com, namhyung@kernel.org, tglx@linutronix.de,
bp@alien8.de, hpa@zytor.com, pbonzini@redhat.com,
sean.j.christopherson@intel.com, vkuznets@redhat.com,
wanpengli@tencent.com, jmattson@google.com, joro@8bytes.org,
pawan.kumar.gupta@linux.intel.com, ak@linux.intel.com,
thomas.lendacky@amd.com, fenghua.yu@intel.com,
kan.liang@linux.intel.com, like.xu@linux.intel.com
Subject: [PATCH v1 05/11] KVM: x86/pmu: Add support to reprogram PEBS event for guest counters
Date: Fri, 6 Mar 2020 01:56:59 +0800
Message-ID: <1583431025-19802-6-git-send-email-luwei.kang@intel.com>
In-Reply-To: <1583431025-19802-1-git-send-email-luwei.kang@intel.com>
From: Like Xu <like.xu@linux.intel.com>
When the event precise level is non-zero, the performance counter is
reprogrammed as a PEBS event, and the PEBS PMI bit in global_status is
set when the PEBS event overflows. KVM never knows the precise level
set in the guest because it is a software parameter, so we force all
PEBS events to precise level 1, which gives enough accuracy with a
dedicated counter.
Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
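A minimal userspace sketch of the sample-period arithmetic used in
pmc_reprogram_counter(), included only as an illustration. The helper
names (counter_mask, period_until_overflow, pebs_sample_period) are
hypothetical and do not exist in the kernel; the width parameter stands
in for whatever pmc_bitmask() covers for a given counter.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering a counter that is `width` bits wide (1..64). */
static uint64_t counter_mask(unsigned int width)
{
	assert(width >= 1 && width <= 64);
	return width == 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
}

/*
 * Number of events left before the counter wraps. This mirrors the
 * expression (-pmc->counter) & pmc_bitmask(pmc) from the patch.
 */
static uint64_t period_until_overflow(uint64_t counter, unsigned int width)
{
	return (0 - counter) & counter_mask(width);
}

/*
 * PEBS case: a zero period would make perf treat the event as
 * non-sampling, so fall back to the maximum period instead.
 */
static uint64_t pebs_sample_period(uint64_t counter, unsigned int width)
{
	uint64_t period = period_until_overflow(counter, width);

	return period ? period : counter_mask(width);
}
```

For a 48-bit counter holding 2^48 - 100, the period is 100 events; for a
counter value of 0 the plain expression yields 0, and the PEBS fallback
substitutes the all-ones mask.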
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/pmu.c | 69 ++++++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/pmu_intel.c | 1 +
3 files changed, 70 insertions(+), 1 deletion(-)
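The overflow signalling and the new check in pmc_resume_counter() can be
modelled roughly as follows. This is a simplified sketch under stated
assumptions, not the kernel code: struct model_pmu, model_overflow() and
model_can_resume() are hypothetical names, the reprogram_pmi
test_and_set_bit() guard is omitted, and only the check added by this
patch is modelled for the resume path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PEBS_OVF_BIT 62	/* bit the patch sets in global_status */

/* Hypothetical stand-in for the two kvm_pmu fields involved. */
struct model_pmu {
	uint64_t global_status;
	uint64_t pebs_enable;
};

/* On overflow, flag the counter; PEBS events also flag bit 62. */
static void model_overflow(struct model_pmu *pmu, unsigned int idx,
			   unsigned int precise_ip)
{
	assert(idx < 64);
	pmu->global_status |= UINT64_C(1) << idx;
	if (precise_ip)
		pmu->global_status |= UINT64_C(1) << PEBS_OVF_BIT;
}

/*
 * An existing perf event can only be reused if its PEBS-ness matches
 * the guest's current pebs_enable bit for that counter.
 */
static bool model_can_resume(const struct model_pmu *pmu, unsigned int idx,
			     unsigned int precise_ip)
{
	bool pebs;

	assert(idx < 64);
	pebs = (pmu->pebs_enable >> idx) & 1;
	return pebs == (precise_ip != 0);
}
```

When the guest toggles a counter's bit in pebs_enable, the mismatch makes
the resume path fail, so the counter is reprogrammed with a fresh event
of the right kind.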
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 98959e8..83abb49 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -478,6 +478,7 @@ struct kvm_pmu {
u64 global_ctrl_mask;
u64 global_ovf_ctrl_mask;
u64 reserved_bits;
+ u64 pebs_enable;
u8 version;
struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index b4f9e97..b2bdacb 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -77,6 +77,11 @@ static void kvm_perf_overflow_intr(struct perf_event *perf_event,
if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi)) {
__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+
+ /* Indicate PEBS overflow to guest. */
+ if (perf_event->attr.precise_ip)
+ __set_bit(62, (unsigned long *)&pmu->global_status);
+
kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
/*
@@ -99,6 +104,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
bool exclude_kernel, bool intr,
bool in_tx, bool in_tx_cp)
{
+ struct kvm_pmu *pmu = vcpu_to_pmu(pmc->vcpu);
struct perf_event *event;
struct perf_event_attr attr = {
.type = type,
@@ -111,6 +117,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
.config = config,
.disabled = 1,
};
+ bool pebs = test_bit(pmc->idx, (unsigned long *)&pmu->pebs_enable);
attr.sample_period = (-pmc->counter) & pmc_bitmask(pmc);
@@ -126,8 +133,50 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
attr.config |= HSW_IN_TX_CHECKPOINTED;
}
+ if (pebs) {
+ /*
+ * The host never knows the precision level set by the guest.
+ * Force the host's PEBS event to precision level 1, which will
+ * not impact the accuracy of the results for guest PEBS events,
+ * because:
+ * - In most cases, there is no difference among precision
+ * levels 1 to 3 for PEBS events.
+ * - The functions below check the precision level in the host,
+ * but their results in the host are replaced by the guest's
+ * when sampling the guest.
+ * The accuracy for guest PEBS events will not be impacted.
+ * -- event_constraints() impacts the index of the counter.
+ * The index for the host event is exactly the same as the
+ * guest's. It's decided by the guest.
+ * -- pebs_update_adaptive_cfg() impacts the value of
+ * MSR_PEBS_DATA_CFG. When the guest is switched in,
+ * the MSR value will be replaced by the value from the guest.
+ * -- setup_sample() impacts the output of a PEBS record.
+ * The guest handles the PEBS records.
+ */
+ attr.precise_ip = 1;
+ /*
+ * When the host's PMI handler completes, it's going to
+ * enter the guest and trigger the guest's PMI handler.
+ *
+ * At this moment, this function may be called by
+ * kvm_pmu_handle_event(). However, the next sample_period
+ * hasn't been determined by the guest yet, and the remaining
+ * period, which is probably 0, is used as the current sample_period.
+ *
+ * In this case, perf will mistakenly treat it as a
+ * non-sampling event, and the PEBS event will error out.
+ *
+ * Fill it with the maximum period to prevent the error.
+ * The guest PMI handler will soon reprogram the counter.
+ */
+ if (!attr.sample_period)
+ attr.sample_period = (-1ULL) & pmc_bitmask(pmc);
+ }
+
event = perf_event_create_kernel_counter(&attr, -1, current,
- intr ? kvm_perf_overflow_intr :
+ (intr || pebs) ?
+ kvm_perf_overflow_intr :
kvm_perf_overflow, pmc);
if (IS_ERR(event)) {
pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
@@ -135,6 +184,20 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
return;
}
+ if (pebs) {
+ event->guest_dedicated_idx = pmc->idx;
+ /*
+ * For guest PEBS events, the guest is responsible for draining
+ * the PEBS buffers and loading proper values to reset counters.
+ *
+ * The host unconditionally sets the auto-reload flag for PEBS
+ * events with a fixed period, which is not necessary here. The host
+ * should do nothing in drain_pebs() but inject the PMI into the guest.
+ *
+ * Unset the auto-reload flag for guest PEBS events.
+ */
+ perf_x86_pmu_unset_auto_reload(event);
+ }
pmc->perf_event = event;
pmc_to_pmu(pmc)->event_count++;
perf_event_enable(event);
@@ -158,6 +221,10 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
if (!pmc->perf_event)
return false;
+ if (test_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->pebs_enable)
+ != (!!pmc->perf_event->attr.precise_ip))
+ return false;
+
/* recalibrate sample period and check if it's accepted by perf core */
if (perf_event_period(pmc->perf_event,
(-pmc->counter) & pmc_bitmask(pmc)))
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index fd21cdb..ebadc33 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -293,6 +293,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->counter_bitmask[KVM_PMC_GP] = 0;
pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
pmu->version = 0;
+ pmu->pebs_enable = 0;
pmu->reserved_bits = 0xffffffff00200000ull;
entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
--
1.8.3.1