* RE: [PATCH v7 00/13] Intel Processor Trace virtualization enabling
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
@ 2018-05-03 10:23 ` Kang, Luwei
2018-05-03 10:37 ` Peter Zijlstra
2018-05-03 12:13 ` [PATCH v7 01/13] perf/x86/intel/pt: Move Intel-PT MSRs bit definitions to a public header Luwei Kang
` (12 subsequent siblings)
13 siblings, 1 reply; 19+ messages in thread
From: Kang, Luwei @ 2018-05-03 10:23 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng
> Hi All,
>
> Here is a patch-series which adding Processor Trace enabling in KVM guest. You can get It's software developer manuals from:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> In Chapter 4 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.
>
> Introduction:
> Intel Processor Trace (Intel PT) is an extension of Intel Architecture that captures information about software execution using
> dedicated hardware facilities that cause only minimal performance perturbation to the software being traced. Details on the Intel PT
> infrastructure and trace capabilities can be found in the Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C.
>
> The suite of architecture changes serve to simplify the process of virtualizing Intel PT for use by a guest software. There are two
> primary elements to this new architecture support for VMX support improvements made for Intel PT.
> 1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
> — This serves to speed and simplify the process of disabling trace on VM exit, and restoring it on VM entry.
> 2. Enabling use of EPT to redirect PT output.
> — This enables the VMM to elect to virtualize the PT output buffer using EPT. In this mode, the CPU will treat PT output addresses
> as Guest Physical Addresses (GPAs) and translate them using EPT. This means that Intel PT output reads (of the ToPA table) and
> writes (of trace output) can cause EPT violations, and other output events.
>
> Processor Trace virtualization can be work in one of 3 possible modes by set new option "pt_mode". Default value is system mode.
> a. system-wide: trace both host/guest and output to host buffer; b. host-only: only trace host and output to host buffer; c. host-
> guest: trace host/guest simultaneous and output to their respective buffer.
>
> >From V6:
> - split pathes 1~2 to four separate patches (these patches do 2 things) and add more descriptions.
>
> >From V5:
> - rename the function from pt_cap_get_ex() to __pt_cap_get();
> - replace the most of function from vmx_pt_supported() to "pt_mode == PT_MODE_HOST_GUEST"(or !=).
>
> >From V4:
> - add data check when setting the value of MSR_IA32_RTIT_CTL;
> - Invoke new interface to set the intercept of MSRs read/write after "MSR bitmap per-vcpu" patches.
>
> >From V3:
> - change default mode to SYSTEM mode;
> - add a new patch to move PT out of scattered features;
> - add a new fucntion kvm_get_pt_addr_cnt() to get the number of address ranges;
> - add a new function vmx_set_rtit_ctl() to set the value of guest RTIT_CTL, GUEST_IA32_RTIT_CTL and MSRs intercept.
>
> >From v2:
> - replace *_PT_SUPPRESS_PIP to *_PT_CONCEAL_PIP;
> - clean SECONDARY_EXEC_PT_USE_GPA, VM_EXIT_CLEAR_IA32_RTIT_CTL and VM_ENTRY_LOAD_IA32_RTIT_CTL in SYSTEM mode.
> These bits must be all set or all clean;
> - move processor tracing out of scattered features;
> - add a new function to enable/disable intercept MSRs read/write;
> - add all Intel PT MSRs read/write and disable intercept when PT is enabled in guest;
> - disable Intel PT and enable intercept MSRs when L1 guest VMXON;
> - performance optimization.
> In Host only mode. we just need to save host RTIT_CTL before vm-entry and restore host RTIT_CTL after vm-exit;
> In HOST_GUEST mode. we need to save and restore all MSRs only when PT has enabled in guest.
> - use XSAVES/XRESTORES implement context switch.
> Haven't implementation in this version and still in debuging. will make a separate patch work on this.
>
> >From v1:
> - remove guest-only mode because guest-only mode can be covered by host-guest mode;
> - always set "use GPA for processor tracing" in secondary execution control if it can be;
> - trap RTIT_CTL read/write. Forbid write this msr when VMXON in L1 hypervisor.
>
> Chao Peng (8):
> perf/x86/intel/pt: Move Intel-PT MSRs bit definitions to a public
> header
> perf/x86/intel/pt: Change pt_cap_get() to a public function
> KVM: x86: Add Intel Processor Trace virtualization mode
> KVM: x86: Add Intel Processor Trace cpuid emulation
> KVM: x86: Add Intel processor trace context for each vcpu
> KVM: x86: Implement Intel Processor Trace context switch
> KVM: x86: Implement Intel Processor Trace MSRs read/write
> KVM: x86: Set intercept for Intel PT MSRs read/write
>
> Luwei Kang (5):
> perf/x86/intel/pt: Add new bit definitions for Intel PT MSRs
> perf/x86/intel/pt: add new capability for Intel PT
> perf/x86/intel/pt: Introduce a new function to get capability of Intel
> PT
> KVM: x86: Introduce a function to initialize the PT configuration
> KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest
>
> arch/x86/events/intel/pt.c | 12 +-
> arch/x86/events/intel/pt.h | 58 ------
> arch/x86/include/asm/intel_pt.h | 40 ++++
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/include/asm/msr-index.h | 38 ++++
> arch/x86/include/asm/vmx.h | 8 +
> arch/x86/kvm/cpuid.c | 22 ++-
> arch/x86/kvm/svm.c | 6 +
> arch/x86/kvm/vmx.c | 412 ++++++++++++++++++++++++++++++++++++++-
> arch/x86/kvm/x86.c | 33 +++-
> 10 files changed, 563 insertions(+), 67 deletions(-)
>
Send this patch set twice because path 2 send fail in the first time.
Thanks,
Luwei Kang
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v7 00/13] Intel Processor Trace virtualization enabling
2018-05-03 10:23 ` Kang, Luwei
@ 2018-05-03 10:37 ` Peter Zijlstra
2018-05-03 11:19 ` Kang, Luwei
0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2018-05-03 10:37 UTC (permalink / raw)
To: Kang, Luwei
Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel,
joro, chao.p.peng
On Thu, May 03, 2018 at 10:23:12AM +0000, Kang, Luwei wrote:
> Send this patch set twice because path 2 send fail in the first time.
Then just resend patch 2; also I have two copies of patch 2, so clearly
it _did_ send.
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v7 00/13] Intel Processor Trace virtualization enabling
2018-05-03 10:37 ` Peter Zijlstra
@ 2018-05-03 11:19 ` Kang, Luwei
0 siblings, 0 replies; 19+ messages in thread
From: Kang, Luwei @ 2018-05-03 11:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel,
joro, chao.p.peng, thomas.lendacky, bp, brijesh.singh, Liang,
Kan, Janakarajan.Natarajan, dwmw, alexander.shishkin,
mathieu.poirier, kstewart, gregkh
> On Thu, May 03, 2018 at 10:23:12AM +0000, Kang, Luwei wrote:
> > Send this patch set twice because path 2 send fail in the first time.
>
> Then just resend patch 2; also I have two copies of patch 2, so clearly it _did_ send.
Cc all the maintainers and contributors.
Thanks,
Luwei Kang.
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v7 00/13] Intel Processor Trace virtualization enabling
@ 2018-05-03 11:19 ` Kang, Luwei
0 siblings, 0 replies; 19+ messages in thread
From: Kang, Luwei @ 2018-05-03 11:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel,
joro, chao.p.peng, thomas.lendacky, bp, brijesh.singh, Liang,
Kan, Janakarajan.Natarajan, dwmw,
alexander.shishkin@linux.intel.com
> On Thu, May 03, 2018 at 10:23:12AM +0000, Kang, Luwei wrote:
> > Send this patch set twice because path 2 send fail in the first time.
>
> Then just resend patch 2; also I have two copies of patch 2, so clearly it _did_ send.
Cc all the maintainers and contributors.
Thanks,
Luwei Kang.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v7 01/13] perf/x86/intel/pt: Move Intel-PT MSRs bit definitions to a public header
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
2018-05-03 10:23 ` Kang, Luwei
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 02/13] perf/x86/intel/pt: Change pt_cap_get() to a public function Luwei Kang
` (11 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Intel Processor Trace virtualization enabling in KVM guest
need to access these MSRs bit definitions, so move them to
public header file msr-index.h.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/events/intel/pt.h | 37 -------------------------------------
arch/x86/include/asm/msr-index.h | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+), 37 deletions(-)
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index 0eb41d0..0050ca1 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -20,43 +20,6 @@
#define __INTEL_PT_H__
/*
- * PT MSR bit definitions
- */
-#define RTIT_CTL_TRACEEN BIT(0)
-#define RTIT_CTL_CYCLEACC BIT(1)
-#define RTIT_CTL_OS BIT(2)
-#define RTIT_CTL_USR BIT(3)
-#define RTIT_CTL_PWR_EVT_EN BIT(4)
-#define RTIT_CTL_FUP_ON_PTW BIT(5)
-#define RTIT_CTL_CR3EN BIT(7)
-#define RTIT_CTL_TOPA BIT(8)
-#define RTIT_CTL_MTC_EN BIT(9)
-#define RTIT_CTL_TSC_EN BIT(10)
-#define RTIT_CTL_DISRETC BIT(11)
-#define RTIT_CTL_PTW_EN BIT(12)
-#define RTIT_CTL_BRANCH_EN BIT(13)
-#define RTIT_CTL_MTC_RANGE_OFFSET 14
-#define RTIT_CTL_MTC_RANGE (0x0full << RTIT_CTL_MTC_RANGE_OFFSET)
-#define RTIT_CTL_CYC_THRESH_OFFSET 19
-#define RTIT_CTL_CYC_THRESH (0x0full << RTIT_CTL_CYC_THRESH_OFFSET)
-#define RTIT_CTL_PSB_FREQ_OFFSET 24
-#define RTIT_CTL_PSB_FREQ (0x0full << RTIT_CTL_PSB_FREQ_OFFSET)
-#define RTIT_CTL_ADDR0_OFFSET 32
-#define RTIT_CTL_ADDR0 (0x0full << RTIT_CTL_ADDR0_OFFSET)
-#define RTIT_CTL_ADDR1_OFFSET 36
-#define RTIT_CTL_ADDR1 (0x0full << RTIT_CTL_ADDR1_OFFSET)
-#define RTIT_CTL_ADDR2_OFFSET 40
-#define RTIT_CTL_ADDR2 (0x0full << RTIT_CTL_ADDR2_OFFSET)
-#define RTIT_CTL_ADDR3_OFFSET 44
-#define RTIT_CTL_ADDR3 (0x0full << RTIT_CTL_ADDR3_OFFSET)
-#define RTIT_STATUS_FILTEREN BIT(0)
-#define RTIT_STATUS_CONTEXTEN BIT(1)
-#define RTIT_STATUS_TRIGGEREN BIT(2)
-#define RTIT_STATUS_BUFFOVF BIT(3)
-#define RTIT_STATUS_ERROR BIT(4)
-#define RTIT_STATUS_STOPPED BIT(5)
-
-/*
* Single-entry ToPA: when this close to region boundary, switch
* buffers to avoid losing data.
*/
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 53d5b1b..c168e26 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -106,7 +106,40 @@
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6
#define MSR_IA32_RTIT_CTL 0x00000570
+#define RTIT_CTL_TRACEEN BIT(0)
+#define RTIT_CTL_CYCLEACC BIT(1)
+#define RTIT_CTL_OS BIT(2)
+#define RTIT_CTL_USR BIT(3)
+#define RTIT_CTL_PWR_EVT_EN BIT(4)
+#define RTIT_CTL_FUP_ON_PTW BIT(5)
+#define RTIT_CTL_CR3EN BIT(7)
+#define RTIT_CTL_TOPA BIT(8)
+#define RTIT_CTL_MTC_EN BIT(9)
+#define RTIT_CTL_TSC_EN BIT(10)
+#define RTIT_CTL_DISRETC BIT(11)
+#define RTIT_CTL_PTW_EN BIT(12)
+#define RTIT_CTL_BRANCH_EN BIT(13)
+#define RTIT_CTL_MTC_RANGE_OFFSET 14
+#define RTIT_CTL_MTC_RANGE (0x0full << RTIT_CTL_MTC_RANGE_OFFSET)
+#define RTIT_CTL_CYC_THRESH_OFFSET 19
+#define RTIT_CTL_CYC_THRESH (0x0full << RTIT_CTL_CYC_THRESH_OFFSET)
+#define RTIT_CTL_PSB_FREQ_OFFSET 24
+#define RTIT_CTL_PSB_FREQ (0x0full << RTIT_CTL_PSB_FREQ_OFFSET)
+#define RTIT_CTL_ADDR0_OFFSET 32
+#define RTIT_CTL_ADDR0 (0x0full << RTIT_CTL_ADDR0_OFFSET)
+#define RTIT_CTL_ADDR1_OFFSET 36
+#define RTIT_CTL_ADDR1 (0x0full << RTIT_CTL_ADDR1_OFFSET)
+#define RTIT_CTL_ADDR2_OFFSET 40
+#define RTIT_CTL_ADDR2 (0x0full << RTIT_CTL_ADDR2_OFFSET)
+#define RTIT_CTL_ADDR3_OFFSET 44
+#define RTIT_CTL_ADDR3 (0x0full << RTIT_CTL_ADDR3_OFFSET)
#define MSR_IA32_RTIT_STATUS 0x00000571
+#define RTIT_STATUS_FILTEREN BIT(0)
+#define RTIT_STATUS_CONTEXTEN BIT(1)
+#define RTIT_STATUS_TRIGGEREN BIT(2)
+#define RTIT_STATUS_BUFFOVF BIT(3)
+#define RTIT_STATUS_ERROR BIT(4)
+#define RTIT_STATUS_STOPPED BIT(5)
#define MSR_IA32_RTIT_ADDR0_A 0x00000580
#define MSR_IA32_RTIT_ADDR0_B 0x00000581
#define MSR_IA32_RTIT_ADDR1_A 0x00000582
@@ -115,6 +148,7 @@
#define MSR_IA32_RTIT_ADDR2_B 0x00000585
#define MSR_IA32_RTIT_ADDR3_A 0x00000586
#define MSR_IA32_RTIT_ADDR3_B 0x00000587
+#define MSR_IA32_RTIT_ADDR_COUNT 8
#define MSR_IA32_RTIT_CR3_MATCH 0x00000572
#define MSR_IA32_RTIT_OUTPUT_BASE 0x00000560
#define MSR_IA32_RTIT_OUTPUT_MASK 0x00000561
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 02/13] perf/x86/intel/pt: Change pt_cap_get() to a public function
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
2018-05-03 10:23 ` Kang, Luwei
2018-05-03 12:13 ` [PATCH v7 01/13] perf/x86/intel/pt: Move Intel-PT MSRs bit definitions to a public header Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 03/13] perf/x86/intel/pt: Add new bit definitions for Intel PT MSRs Luwei Kang
` (10 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Change pt_cap_get() to a public function that KVM
can access this function to check if specific
feature is supported on hardware.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/events/intel/pt.c | 3 ++-
arch/x86/events/intel/pt.h | 21 ---------------------
arch/x86/include/asm/intel_pt.h | 23 +++++++++++++++++++++++
3 files changed, 25 insertions(+), 22 deletions(-)
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 81fd41d..a5a7e44 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -75,7 +75,7 @@
PT_CAP(psb_periods, 1, CPUID_EBX, 0xffff0000),
};
-static u32 pt_cap_get(enum pt_capabilities cap)
+u32 pt_cap_get(enum pt_capabilities cap)
{
struct pt_cap_desc *cd = &pt_caps[cap];
u32 c = pt_pmu.caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
@@ -83,6 +83,7 @@ static u32 pt_cap_get(enum pt_capabilities cap)
return (c & cd->mask) >> shift;
}
+EXPORT_SYMBOL_GPL(pt_cap_get);
static ssize_t pt_cap_show(struct device *cdev,
struct device_attribute *attr,
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index 0050ca1..269e15a 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -45,30 +45,9 @@ struct topa_entry {
u64 rsvd4 : 16;
};
-#define PT_CPUID_LEAVES 2
-#define PT_CPUID_REGS_NUM 4 /* number of regsters (eax, ebx, ecx, edx) */
-
/* TSC to Core Crystal Clock Ratio */
#define CPUID_TSC_LEAF 0x15
-enum pt_capabilities {
- PT_CAP_max_subleaf = 0,
- PT_CAP_cr3_filtering,
- PT_CAP_psb_cyc,
- PT_CAP_ip_filtering,
- PT_CAP_mtc,
- PT_CAP_ptwrite,
- PT_CAP_power_event_trace,
- PT_CAP_topa_output,
- PT_CAP_topa_multiple_entries,
- PT_CAP_single_range_output,
- PT_CAP_payloads_lip,
- PT_CAP_num_address_ranges,
- PT_CAP_mtc_periods,
- PT_CAP_cycle_thresholds,
- PT_CAP_psb_periods,
-};
-
struct pt_pmu {
struct pmu pmu;
u32 caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES];
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index b523f51..4270421 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -2,10 +2,33 @@
#ifndef _ASM_X86_INTEL_PT_H
#define _ASM_X86_INTEL_PT_H
+#define PT_CPUID_LEAVES 2
+#define PT_CPUID_REGS_NUM 4 /* number of regsters (eax, ebx, ecx, edx) */
+
+enum pt_capabilities {
+ PT_CAP_max_subleaf = 0,
+ PT_CAP_cr3_filtering,
+ PT_CAP_psb_cyc,
+ PT_CAP_ip_filtering,
+ PT_CAP_mtc,
+ PT_CAP_ptwrite,
+ PT_CAP_power_event_trace,
+ PT_CAP_topa_output,
+ PT_CAP_topa_multiple_entries,
+ PT_CAP_single_range_output,
+ PT_CAP_payloads_lip,
+ PT_CAP_num_address_ranges,
+ PT_CAP_mtc_periods,
+ PT_CAP_cycle_thresholds,
+ PT_CAP_psb_periods,
+};
+
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
void cpu_emergency_stop_pt(void);
+extern u32 pt_cap_get(enum pt_capabilities cap);
#else
static inline void cpu_emergency_stop_pt(void) {}
+static inline u32 pt_cap_get(enum pt_capabilities cap) { return 0; }
#endif
#endif /* _ASM_X86_INTEL_PT_H */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 03/13] perf/x86/intel/pt: Add new bit definitions for Intel PT MSRs
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (2 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 02/13] perf/x86/intel/pt: Change pt_cap_get() to a public function Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 04/13] perf/x86/intel/pt: add new capability for Intel PT Luwei Kang
` (9 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
These new bit definitions are use for emulate MSRs read/write
for KVM. For example, IA32_RTIT_CTL.FabricEn[bit 6] is available
only when CPUID.(EAX=14H, ECX=0):ECX[bit 3] = 1. If KVM guest
try to set this bit with CPUID.(EAX=14H, ECX=0):ECX[bit3] = 0
a #GP would be injected to KVM guest.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/include/asm/msr-index.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index c168e26..cc9e681 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -112,6 +112,7 @@
#define RTIT_CTL_USR BIT(3)
#define RTIT_CTL_PWR_EVT_EN BIT(4)
#define RTIT_CTL_FUP_ON_PTW BIT(5)
+#define RTIT_CTL_FABRIC_EN BIT(6)
#define RTIT_CTL_CR3EN BIT(7)
#define RTIT_CTL_TOPA BIT(8)
#define RTIT_CTL_MTC_EN BIT(9)
@@ -140,6 +141,8 @@
#define RTIT_STATUS_BUFFOVF BIT(3)
#define RTIT_STATUS_ERROR BIT(4)
#define RTIT_STATUS_STOPPED BIT(5)
+#define RTIT_STATUS_BYTECNT_OFFSET 32
+#define RTIT_STATUS_BYTECNT (0x1ffffull << RTIT_STATUS_BYTECNT_OFFSET)
#define MSR_IA32_RTIT_ADDR0_A 0x00000580
#define MSR_IA32_RTIT_ADDR0_B 0x00000581
#define MSR_IA32_RTIT_ADDR1_A 0x00000582
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 04/13] perf/x86/intel/pt: add new capability for Intel PT
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (3 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 03/13] perf/x86/intel/pt: Add new bit definitions for Intel PT MSRs Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 05/13] perf/x86/intel/pt: Introduce a new function to get capability of " Luwei Kang
` (8 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
CPUID(EAX=14H,ECX=0):EBX[bit 3] = 1 indicates support of
output to Trace Transport subsystem.
MSR IA32_RTIT_CTL.FabricEn[bit 6] is reserved if
CPUID.(EAX=14H, ECX=0):ECX[bit 3] = 0.
This is use for emulate IA32_RTIT_CTL MSR read/write
in KVM. KVM guest write IA32_RTIT_CTL will trap to
root mode and a #GP would be injected to guest if set
IA32_RTIT_CTL.FabricEn with
CPUID.(EAX=14H, ECX=0):ECX[bit 3] = 0.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/events/intel/pt.c | 1 +
arch/x86/include/asm/intel_pt.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index a5a7e44..d5819a2 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -68,6 +68,7 @@
PT_CAP(topa_output, 0, CPUID_ECX, BIT(0)),
PT_CAP(topa_multiple_entries, 0, CPUID_ECX, BIT(1)),
PT_CAP(single_range_output, 0, CPUID_ECX, BIT(2)),
+ PT_CAP(output_subsys, 0, CPUID_ECX, BIT(3)),
PT_CAP(payloads_lip, 0, CPUID_ECX, BIT(31)),
PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x3),
PT_CAP(mtc_periods, 1, CPUID_EAX, 0xffff0000),
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 4270421..2de4db0 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -16,6 +16,7 @@ enum pt_capabilities {
PT_CAP_topa_output,
PT_CAP_topa_multiple_entries,
PT_CAP_single_range_output,
+ PT_CAP_output_subsys,
PT_CAP_payloads_lip,
PT_CAP_num_address_ranges,
PT_CAP_mtc_periods,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 05/13] perf/x86/intel/pt: Introduce a new function to get capability of Intel PT
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (4 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 04/13] perf/x86/intel/pt: add new capability for Intel PT Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 06/13] KVM: x86: Add Intel Processor Trace virtualization mode Luwei Kang
` (7 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
New function __pt_cap_get() will be invoked in KVM to check
if a specific capability is availiable in KVM guest.
Another function pt_cap_get() can only check the hardware
capabilities but this may different with KVM guest because
some features may not be exposed to guest.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/events/intel/pt.c | 10 ++++++++--
arch/x86/include/asm/intel_pt.h | 2 ++
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index d5819a2..a1fe6ff 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -76,14 +76,20 @@
PT_CAP(psb_periods, 1, CPUID_EBX, 0xffff0000),
};
-u32 pt_cap_get(enum pt_capabilities cap)
+u32 __pt_cap_get(u32 *caps, enum pt_capabilities cap)
{
struct pt_cap_desc *cd = &pt_caps[cap];
- u32 c = pt_pmu.caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
+ u32 c = caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
unsigned int shift = __ffs(cd->mask);
return (c & cd->mask) >> shift;
}
+EXPORT_SYMBOL_GPL(__pt_cap_get);
+
+u32 pt_cap_get(enum pt_capabilities cap)
+{
+ return __pt_cap_get(pt_pmu.caps, cap);
+}
EXPORT_SYMBOL_GPL(pt_cap_get);
static ssize_t pt_cap_show(struct device *cdev,
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 2de4db0..3a4f524 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -27,9 +27,11 @@ enum pt_capabilities {
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
void cpu_emergency_stop_pt(void);
extern u32 pt_cap_get(enum pt_capabilities cap);
+extern u32 __pt_cap_get(u32 *caps, enum pt_capabilities cap);
#else
static inline void cpu_emergency_stop_pt(void) {}
static inline u32 pt_cap_get(enum pt_capabilities cap) { return 0; }
+static u32 __pt_cap_get(u32 *caps, enum pt_capabilities cap) { return 0; }
#endif
#endif /* _ASM_X86_INTEL_PT_H */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 06/13] KVM: x86: Add Intel Processor Trace virtualization mode
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (5 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 05/13] perf/x86/intel/pt: Introduce a new function to get capability of " Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 07/13] KVM: x86: Add Intel Processor Trace cpuid emulation Luwei Kang
` (6 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Intel PT virtualization can be work in one of 3 possible modes:
a. system-wide: trace both host/guest and output to host buffer;
b. host-only: only trace host and output to host buffer;
c. host-guest: trace host/guest simultaneous and output to their
respective buffer.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/include/asm/intel_pt.h | 6 ++++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/vmx.h | 8 +++++
arch/x86/kvm/vmx.c | 68 +++++++++++++++++++++++++++++++++++++---
4 files changed, 79 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 3a4f524..43ad260 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -5,6 +5,12 @@
#define PT_CPUID_LEAVES 2
#define PT_CPUID_REGS_NUM 4 /* number of regsters (eax, ebx, ecx, edx) */
+enum pt_mode {
+ PT_MODE_SYSTEM = 0,
+ PT_MODE_HOST,
+ PT_MODE_HOST_GUEST,
+};
+
enum pt_capabilities {
PT_CAP_max_subleaf = 0,
PT_CAP_cr3_filtering,
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index cc9e681..c813507 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -790,6 +790,7 @@
#define VMX_BASIC_INOUT 0x0040000000000000LLU
/* MSR_IA32_VMX_MISC bits */
+#define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
#define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
/* AMD-V MSRs */
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 8b67807..9e828d4 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -76,7 +76,9 @@
#define SECONDARY_EXEC_SHADOW_VMCS 0x00004000
#define SECONDARY_EXEC_RDSEED_EXITING 0x00010000
#define SECONDARY_EXEC_ENABLE_PML 0x00020000
+#define SECONDARY_EXEC_PT_CONCEAL_VMX 0x00080000
#define SECONDARY_EXEC_XSAVES 0x00100000
+#define SECONDARY_EXEC_PT_USE_GPA 0x01000000
#define SECONDARY_EXEC_TSC_SCALING 0x02000000
#define PIN_BASED_EXT_INTR_MASK 0x00000001
@@ -97,6 +99,8 @@
#define VM_EXIT_LOAD_IA32_EFER 0x00200000
#define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER 0x00400000
#define VM_EXIT_CLEAR_BNDCFGS 0x00800000
+#define VM_EXIT_PT_CONCEAL_PIP 0x01000000
+#define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000
#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff
@@ -108,6 +112,8 @@
#define VM_ENTRY_LOAD_IA32_PAT 0x00004000
#define VM_ENTRY_LOAD_IA32_EFER 0x00008000
#define VM_ENTRY_LOAD_BNDCFGS 0x00010000
+#define VM_ENTRY_PT_CONCEAL_PIP 0x00020000
+#define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000
#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff
@@ -234,6 +240,8 @@ enum vmcs_field {
GUEST_PDPTR3_HIGH = 0x00002811,
GUEST_BNDCFGS = 0x00002812,
GUEST_BNDCFGS_HIGH = 0x00002813,
+ GUEST_IA32_RTIT_CTL = 0x00002814,
+ GUEST_IA32_RTIT_CTL_HIGH = 0x00002815,
HOST_IA32_PAT = 0x00002c00,
HOST_IA32_PAT_HIGH = 0x00002c01,
HOST_IA32_EFER = 0x00002c02,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5b49ad4..8680cd5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -54,6 +54,7 @@
#include <asm/microcode.h>
#include <asm/nospec-branch.h>
#include <asm/mshyperv.h>
+#include <asm/intel_pt.h>
#include "trace.h"
#include "pmu.h"
@@ -187,6 +188,10 @@
static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX;
module_param(ple_window_max, uint, 0444);
+/* Default is SYSTEM mode. */
+static int __read_mostly pt_mode = PT_MODE_SYSTEM;
+module_param(pt_mode, int, S_IRUGO);
+
extern const ulong vmx_return;
struct kvm_vmx {
@@ -1488,6 +1493,19 @@ static inline bool cpu_has_vmx_vmfunc(void)
SECONDARY_EXEC_ENABLE_VMFUNC;
}
+static inline bool cpu_has_vmx_intel_pt(void)
+{
+ u64 vmx_msr;
+
+ rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
+ return vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT;
+}
+
+static inline bool cpu_has_vmx_pt_use_gpa(void)
+{
+ return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA;
+}
+
static inline bool report_flexpriority(void)
{
return flexpriority_enabled;
@@ -4002,6 +4020,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
SECONDARY_EXEC_RDRAND_EXITING |
SECONDARY_EXEC_ENABLE_PML |
SECONDARY_EXEC_TSC_SCALING |
+ SECONDARY_EXEC_PT_USE_GPA |
+ SECONDARY_EXEC_PT_CONCEAL_VMX |
SECONDARY_EXEC_ENABLE_VMFUNC;
if (adjust_vmx_controls(min2, opt2,
MSR_IA32_VMX_PROCBASED_CTLS2,
@@ -4046,7 +4066,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
#endif
opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
- VM_EXIT_CLEAR_BNDCFGS;
+ VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_PT_CONCEAL_PIP |
+ VM_EXIT_CLEAR_IA32_RTIT_CTL;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
&_vmexit_control) < 0)
return -EIO;
@@ -4065,11 +4086,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
- opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
+ opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
+ VM_ENTRY_PT_CONCEAL_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
&_vmentry_control) < 0)
return -EIO;
+ if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_PT_USE_GPA) ||
+ !(_vmexit_control & VM_EXIT_CLEAR_IA32_RTIT_CTL) ||
+ !(_vmentry_control & VM_ENTRY_LOAD_IA32_RTIT_CTL)) {
+ _cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_PT_USE_GPA;
+ _vmexit_control &= ~VM_EXIT_CLEAR_IA32_RTIT_CTL;
+ _vmentry_control &= ~VM_ENTRY_LOAD_IA32_RTIT_CTL;
+ }
+
rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -5780,6 +5810,28 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
return exec_control;
}
+static u32 vmx_vmexit_control(struct vcpu_vmx *vmx)
+{
+ u32 vmexit_control = vmcs_config.vmexit_ctrl;
+
+ if (pt_mode == PT_MODE_SYSTEM)
+ vmexit_control &= ~(VM_EXIT_CLEAR_IA32_RTIT_CTL |
+ VM_EXIT_PT_CONCEAL_PIP);
+
+ return vmexit_control;
+}
+
+static u32 vmx_vmentry_control(struct vcpu_vmx *vmx)
+{
+ u32 vmentry_control = vmcs_config.vmentry_ctrl;
+
+ if (pt_mode == PT_MODE_SYSTEM)
+ vmentry_control &= ~(VM_ENTRY_PT_CONCEAL_PIP |
+ VM_ENTRY_LOAD_IA32_RTIT_CTL);
+
+ return vmentry_control;
+}
+
static bool vmx_rdrand_supported(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -5916,6 +5968,10 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
}
}
+ if (pt_mode == PT_MODE_SYSTEM)
+ exec_control &= ~(SECONDARY_EXEC_PT_USE_GPA |
+ SECONDARY_EXEC_PT_CONCEAL_VMX);
+
vmx->secondary_exec_control = exec_control;
}
@@ -6026,10 +6082,10 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities);
- vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
+ vm_exit_controls_init(vmx, vmx_vmexit_control(vmx));
/* 22.2.1, 20.8.1 */
- vm_entry_controls_init(vmx, vmcs_config.vmentry_ctrl);
+ vm_entry_controls_init(vmx, vmx_vmentry_control(vmx));
vmx->vcpu.arch.cr0_guest_owned_bits = X86_CR0_TS;
vmcs_writel(CR0_GUEST_HOST_MASK, ~X86_CR0_TS);
@@ -7350,6 +7406,10 @@ static __init int hardware_setup(void)
kvm_mce_cap_supported |= MCG_LMCE_P;
+ if (!enable_ept || !pt_cap_get(PT_CAP_topa_output) ||
+ !cpu_has_vmx_intel_pt() || !cpu_has_vmx_pt_use_gpa())
+ pt_mode = PT_MODE_SYSTEM;
+
return alloc_kvm_area();
out:
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 07/13] KVM: x86: Add Intel Processor Trace cpuid emulation
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (6 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 06/13] KVM: x86: Add Intel Processor Trace virtualization mode Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 08/13] KVM: x86: Add Intel processor trace context for each vcpu Luwei Kang
` (5 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Expose Intel Processor Trace to guest only when PT work in
HOST_GUEST mode.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 22 ++++++++++++++++++++--
arch/x86/kvm/svm.c | 6 ++++++
arch/x86/kvm/vmx.c | 6 ++++++
4 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c25775f..03875aa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1025,6 +1025,7 @@ struct kvm_x86_ops {
bool (*mpx_supported)(void);
bool (*xsaves_supported)(void);
bool (*umip_emulated)(void);
+ bool (*pt_supported)(void);
int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 82055b9..e04bf67 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -336,6 +336,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
unsigned f_mpx = kvm_mpx_supported() ? F(MPX) : 0;
unsigned f_xsaves = kvm_x86_ops->xsaves_supported() ? F(XSAVES) : 0;
unsigned f_umip = kvm_x86_ops->umip_emulated() ? F(UMIP) : 0;
+ unsigned f_intel_pt = kvm_x86_ops->pt_supported() ? F(INTEL_PT) : 0;
/* cpuid 1.edx */
const u32 kvm_cpuid_1_edx_x86_features =
@@ -393,7 +394,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
F(ADX) | F(SMAP) | F(AVX512IFMA) | F(AVX512F) | F(AVX512PF) |
F(AVX512ER) | F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | F(AVX512DQ) |
- F(SHA_NI) | F(AVX512BW) | F(AVX512VL);
+ F(SHA_NI) | F(AVX512BW) | F(AVX512VL) | f_intel_pt;
/* cpuid 0xD.1.eax */
const u32 kvm_cpuid_D_1_eax_x86_features =
@@ -423,7 +424,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
switch (function) {
case 0:
- entry->eax = min(entry->eax, (u32)0xd);
+ entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 : 0xd));
break;
case 1:
entry->edx &= kvm_cpuid_1_edx_x86_features;
@@ -595,6 +596,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
}
break;
}
+ /* Intel PT */
+ case 0x14: {
+ int t, times = entry->eax;
+
+ if (!f_intel_pt)
+ break;
+
+ entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+ for (t = 1; t <= times; ++t) {
+ if (*nent >= maxnent)
+ goto out;
+ do_cpuid_1_ent(&entry[t], function, t);
+ entry[t].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+ ++*nent;
+ }
+ break;
+ }
case KVM_CPUID_SIGNATURE: {
static const char signature[12] = "KVMKVMKVM\0\0";
const u32 *sigptr = (const u32 *)signature;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 80ea262..23a1ef9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5877,6 +5877,11 @@ static bool svm_umip_emulated(void)
return false;
}
+static bool svm_pt_supported(void)
+{
+ return false;
+}
+
static bool svm_has_wbinvd_exit(void)
{
return true;
@@ -7102,6 +7107,7 @@ static int svm_unregister_enc_region(struct kvm *kvm,
.mpx_supported = svm_mpx_supported,
.xsaves_supported = svm_xsaves_supported,
.umip_emulated = svm_umip_emulated,
+ .pt_supported = svm_pt_supported,
.set_supported_cpuid = svm_set_supported_cpuid,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8680cd5..27cf417 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9563,6 +9563,11 @@ static bool vmx_umip_emulated(void)
SECONDARY_EXEC_DESC;
}
+static bool vmx_pt_supported(void)
+{
+ return (pt_mode == PT_MODE_HOST_GUEST);
+}
+
static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
{
u32 exit_intr_info;
@@ -12790,6 +12795,7 @@ static int enable_smi_window(struct kvm_vcpu *vcpu)
.mpx_supported = vmx_mpx_supported,
.xsaves_supported = vmx_xsaves_supported,
.umip_emulated = vmx_umip_emulated,
+ .pt_supported = vmx_pt_supported,
.check_nested_events = vmx_check_nested_events,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 08/13] KVM: x86: Add Intel processor trace context for each vcpu
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (7 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 07/13] KVM: x86: Add Intel Processor Trace cpuid emulation Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 09/13] KVM: x86: Implement Intel Processor Trace context switch Luwei Kang
` (4 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Add a data structure to save Intel Processor Trace context.
It mainly include the value of Intel PT host/guest MSRs
and guest CPUID information.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/kvm/vmx.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 27cf417..9a5c26d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -597,6 +597,23 @@ static inline int pi_test_sn(struct pi_desc *pi_desc)
(unsigned long *)&pi_desc->control);
}
+struct pt_ctx {
+ u64 ctl;
+ u64 status;
+ u64 output_base;
+ u64 output_mask;
+ u64 cr3_match;
+ u64 addrs[MSR_IA32_RTIT_ADDR_COUNT];
+};
+
+struct pt_desc {
+ u64 ctl_bitmask;
+ u32 range_cnt;
+ u32 caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES];
+ struct pt_ctx host;
+ struct pt_ctx guest;
+};
+
struct vcpu_vmx {
struct kvm_vcpu vcpu;
unsigned long host_rsp;
@@ -693,6 +710,8 @@ struct vcpu_vmx {
*/
u64 msr_ia32_feature_control;
u64 msr_ia32_feature_control_valid_bits;
+
+ struct pt_desc pt_desc;
};
enum segment_cache_field {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 09/13] KVM: x86: Implement Intel Processor Trace context switch
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (8 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 08/13] KVM: x86: Add Intel processor trace context for each vcpu Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 10/13] KVM: x86: Introduce a function to initialize the PT configuration Luwei Kang
` (3 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Load/Store Intel processor trace register in context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
In HOST mode, we just need to restore the status of IA32_RTIT_CTL.
In HOST_GUEST mode, we need load/resore PT MSRs only when PT is
enabled in guest.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/kvm/vmx.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9a5c26d..a08c61b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2385,6 +2385,55 @@ static unsigned long segment_base(u16 selector)
}
#endif
+static inline void pt_load_msr(struct pt_ctx *ctx, u32 range_cnt)
+{
+ u32 i;
+
+ wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+ wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+ wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+ wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+ for (i = 0; i < range_cnt; i++)
+ wrmsrl(MSR_IA32_RTIT_ADDR0_A + i, ctx->addrs[i]);
+}
+
+static inline void pt_save_msr(struct pt_ctx *ctx, u32 range_cnt)
+{
+ u32 i;
+
+ rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+ rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+ rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+ rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+ for (i = 0; i < range_cnt; i++)
+ rdmsrl(MSR_IA32_RTIT_ADDR0_A + i, ctx->addrs[i]);
+}
+
+static void pt_guest_enter(struct vcpu_vmx *vmx)
+{
+ if (pt_mode == PT_MODE_HOST || pt_mode == PT_MODE_HOST_GUEST)
+ rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+
+ if (pt_mode == PT_MODE_HOST_GUEST &&
+ vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+ wrmsrl(MSR_IA32_RTIT_CTL, 0);
+ pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.range_cnt);
+ pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.range_cnt);
+ }
+}
+
+static void pt_guest_exit(struct vcpu_vmx *vmx)
+{
+ if (pt_mode == PT_MODE_HOST_GUEST &&
+ vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+ pt_save_msr(&vmx->pt_desc.guest, vmx->pt_desc.range_cnt);
+ pt_load_msr(&vmx->pt_desc.host, vmx->pt_desc.range_cnt);
+ }
+
+ if (pt_mode == PT_MODE_HOST || pt_mode == PT_MODE_HOST_GUEST)
+ wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+}
+
static void vmx_save_host_state(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -6119,6 +6168,13 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
}
+
+ if (pt_mode == PT_MODE_HOST_GUEST) {
+ memset(&vmx->pt_desc, 0, sizeof(vmx->pt_desc));
+ /* Bit[6~0] are forced to 1, writes are ignored. */
+ vmx->pt_desc.guest.output_mask = 0x7F;
+ vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
+ }
}
static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
@@ -9794,6 +9850,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
vcpu->arch.pkru != vmx->host_pkru)
__write_pkru(vcpu->arch.pkru);
+ pt_guest_enter(vmx);
+
atomic_switch_perf_msrs(vmx);
vmx_arm_hv_timer(vcpu);
@@ -9988,6 +10046,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
| (1 << VCPU_EXREG_CR3));
vcpu->arch.regs_dirty = 0;
+ pt_guest_exit(vmx);
+
/*
* eager fpu is enabled if PKEY is supported and CR4 is switched
* back on host, so it is safe to read guest PKRU from current
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 10/13] KVM: x86: Introduce a function to initialize the PT configuration
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (9 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 09/13] KVM: x86: Implement Intel Processor Trace context switch Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 11/13] KVM: x86: Implement Intel Processor Trace MSRs read/write Luwei Kang
` (2 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
Initialize the Intel PT configuration when cpuid update.
Include cpuid inforamtion, rtit_ctl bit mask and the number of
address ranges.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/kvm/vmx.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a08c61b..3ed02a8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10371,6 +10371,71 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
#undef cr4_fixed1_update
}
+static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ struct kvm_cpuid_entry2 *best = NULL;
+ int i;
+
+ for (i = 0; i < PT_CPUID_LEAVES; i++) {
+ best = kvm_find_cpuid_entry(vcpu, 0x14, i);
+ if (!best)
+ return;
+ vmx->pt_desc.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM] = best->eax;
+ vmx->pt_desc.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM] = best->ebx;
+ vmx->pt_desc.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM] = best->ecx;
+ vmx->pt_desc.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM] = best->edx;
+ }
+
+ /* Get the number of configurable Address Ranges for filtering */
+ vmx->pt_desc.range_cnt = __pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_num_address_ranges);
+
+ /* Clear the no dependency bits */
+ vmx->pt_desc.ctl_bitmask = ~(RTIT_CTL_TRACEEN | RTIT_CTL_OS |
+ RTIT_CTL_USR | RTIT_CTL_TSC_EN | RTIT_CTL_DISRETC);
+
+ /* If CPUID.(EAX=14H,ECX=0):EBX[0]=1 CR3Filter can be set */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
+ vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_CR3EN;
+
+ /*
+ * If CPUID.(EAX=14H,ECX=0):EBX[1]=1 CYCEn, CycThresh and
+ * PSBFreq can be set
+ */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_cyc))
+ vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_CYCLEACC |
+ RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ);
+
+ /*
+ * If CPUID.(EAX=14H,ECX=0):EBX[3]=1 MTCEn BranchEn and
+ * MTCFreq can be set
+ */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_mtc))
+ vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_MTC_EN |
+ RTIT_CTL_BRANCH_EN | RTIT_CTL_MTC_RANGE);
+
+ /* If CPUID.(EAX=14H,ECX=0):EBX[4]=1 FUPonPTW and PTWEn can be set */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_ptwrite))
+ vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_FUP_ON_PTW |
+ RTIT_CTL_PTW_EN);
+
+ /* If CPUID.(EAX=14H,ECX=0):EBX[5]=1 PwrEvEn can be set */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_power_event_trace))
+ vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_PWR_EVT_EN;
+
+ /* If CPUID.(EAX=14H,ECX=0):ECX[0]=1 ToPA can be set */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output))
+ vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_TOPA;
+ /* If CPUID.(EAX=14H,ECX=0):ECX[3]=1 FabircEn can be set */
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_output_subsys))
+ vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_FABRIC_EN;
+
+ /* unmask address range configure area */
+ for (i = 0; i < vmx->pt_desc.range_cnt; i++)
+ vmx->pt_desc.ctl_bitmask &= ~(0xf << (32 + i*4));
+}
+
static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -10389,6 +10454,10 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
if (nested_vmx_allowed(vcpu))
nested_vmx_cr_fixed1_bits_update(vcpu);
+
+ if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
+ update_intel_pt_cfg(vcpu);
}
static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 11/13] KVM: x86: Implement Intel Processor Trace MSRs read/write
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (10 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 10/13] KVM: x86: Introduce a function to initialize the PT configuration Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 12/13] KVM: x86: Set intercept for Intel PT " Luwei Kang
2018-05-03 12:13 ` [PATCH v7 13/13] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Implement Intel Processor Trace MSRs read/write.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/include/asm/intel_pt.h | 8 ++
arch/x86/kvm/vmx.c | 163 ++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 33 +++++++-
3 files changed, 203 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 43ad260..dc0f3f0 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -5,6 +5,14 @@
#define PT_CPUID_LEAVES 2
#define PT_CPUID_REGS_NUM 4 /* number of regsters (eax, ebx, ecx, edx) */
+#define MSR_IA32_RTIT_STATUS_MASK (~(RTIT_STATUS_FILTEREN | \
+ RTIT_STATUS_CONTEXTEN | RTIT_STATUS_TRIGGEREN | \
+ RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED | \
+ RTIT_STATUS_BYTECNT))
+
+#define MSR_IA32_RTIT_OUTPUT_BASE_MASK \
+ (~((1UL << cpuid_query_maxphyaddr(vcpu)) - 1) | 0x7f)
+
enum pt_mode {
PT_MODE_SYSTEM = 0,
PT_MODE_HOST,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3ed02a8..2a29ab9 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2769,6 +2769,77 @@ static void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, interruptibility);
}
+static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
+{
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ unsigned long value;
+
+ /*
+ * Any MSR write that attempts to change bits marked reserved will
+ * case a #GP fault.
+ */
+ if (data & vmx->pt_desc.ctl_bitmask)
+ return 1;
+
+ /*
+ * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
+ * result in a #GP unless the same write also clears TraceEn.
+ */
+ if ((vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) &&
+ ((vmx->pt_desc.guest.ctl ^ data) & ~RTIT_CTL_TRACEEN))
+ return 1;
+
+ /*
+ * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
+ * and FabricEn would cause #GP, if
+ * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
+ */
+ if ((data & RTIT_CTL_TRACEEN) && !(data & RTIT_CTL_TOPA) &&
+ !(data & RTIT_CTL_FABRIC_EN) &&
+ !__pt_cap_get(vmx->pt_desc.caps, PT_CAP_single_range_output))
+ return 1;
+
+ /*
+ * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
+ * utilize encodings marked reserved will casue a #GP fault.
+ */
+ value = __pt_cap_get(vmx->pt_desc.caps, PT_CAP_mtc_periods);
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_mtc) &&
+ !test_bit((data & RTIT_CTL_MTC_RANGE) >>
+ RTIT_CTL_MTC_RANGE_OFFSET, &value))
+ return 1;
+ value = __pt_cap_get(vmx->pt_desc.caps, PT_CAP_cycle_thresholds);
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_cyc) &&
+ !test_bit((data & RTIT_CTL_CYC_THRESH) >>
+ RTIT_CTL_CYC_THRESH_OFFSET, &value))
+ return 1;
+ value = __pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_periods);
+ if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_cyc) &&
+ !test_bit((data & RTIT_CTL_PSB_FREQ) >>
+ RTIT_CTL_PSB_FREQ_OFFSET, &value))
+ return 1;
+
+ /*
+ * If ADDRx_CFG is reserved or the encodings is >2 will
+ * cause a #GP fault.
+ */
+ value = (data & RTIT_CTL_ADDR0) >> RTIT_CTL_ADDR0_OFFSET;
+ if ((value && (vmx->pt_desc.range_cnt < 1)) || (value > 2))
+ return 1;
+ value = (data & RTIT_CTL_ADDR1) >> RTIT_CTL_ADDR1_OFFSET;
+ if ((value && (vmx->pt_desc.range_cnt < 2)) || (value > 2))
+ return 1;
+ value = (data & RTIT_CTL_ADDR2) >> RTIT_CTL_ADDR2_OFFSET;
+ if ((value && (vmx->pt_desc.range_cnt < 3)) || (value > 2))
+ return 1;
+ value = (data & RTIT_CTL_ADDR3) >> RTIT_CTL_ADDR3_OFFSET;
+ if ((value && (vmx->pt_desc.range_cnt < 4)) || (value > 2))
+ return 1;
+
+ return 0;
+}
+
+
static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
{
unsigned long rip;
@@ -3651,6 +3722,48 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
msr_info->data = vcpu->arch.ia32_xss;
break;
+ case MSR_IA32_RTIT_CTL:
+ if (pt_mode != PT_MODE_HOST_GUEST)
+ return 1;
+ msr_info->data = vmx->pt_desc.guest.ctl;
+ break;
+ case MSR_IA32_RTIT_STATUS:
+ if (pt_mode != PT_MODE_HOST_GUEST)
+ return 1;
+ msr_info->data = vmx->pt_desc.guest.status;
+ break;
+ case MSR_IA32_RTIT_CR3_MATCH:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ !__pt_cap_get(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
+ return 1;
+ msr_info->data = vmx->pt_desc.guest.cr3_match;
+ break;
+ case MSR_IA32_RTIT_OUTPUT_BASE:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+ !__pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_single_range_output)))
+ return 1;
+ msr_info->data = vmx->pt_desc.guest.output_base;
+ break;
+ case MSR_IA32_RTIT_OUTPUT_MASK:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+ !__pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_single_range_output)))
+ return 1;
+ msr_info->data = vmx->pt_desc.guest.output_mask;
+ break;
+ case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (msr_info->index - MSR_IA32_RTIT_ADDR0_A >=
+ 2 * __pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_num_address_ranges)))
+ return 1;
+ msr_info->data =
+ vmx->pt_desc.guest.addrs[msr_info->index -
+ MSR_IA32_RTIT_ADDR0_A];
+ break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -3836,6 +3949,56 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
break;
+ case MSR_IA32_RTIT_CTL:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ vmx_rtit_ctl_check(vcpu, data))
+ return 1;
+ vmcs_write64(GUEST_IA32_RTIT_CTL, data);
+ vmx->pt_desc.guest.ctl = data;
+ break;
+ case MSR_IA32_RTIT_STATUS:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+ (data & MSR_IA32_RTIT_STATUS_MASK))
+ return 1;
+ vmx->pt_desc.guest.status = data;
+ break;
+ case MSR_IA32_RTIT_CR3_MATCH:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+ !__pt_cap_get(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
+ return 1;
+ vmx->pt_desc.guest.cr3_match = data;
+ break;
+ case MSR_IA32_RTIT_OUTPUT_BASE:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+ (!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+ !__pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_single_range_output)) ||
+ (data & MSR_IA32_RTIT_OUTPUT_BASE_MASK))
+ return 1;
+ vmx->pt_desc.guest.output_base = data;
+ break;
+ case MSR_IA32_RTIT_OUTPUT_MASK:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+ (!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+ !__pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_single_range_output)))
+ return 1;
+ vmx->pt_desc.guest.output_mask = data;
+ break;
+ case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
+ if ((pt_mode != PT_MODE_HOST_GUEST) ||
+ (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+ (msr_info->index - MSR_IA32_RTIT_ADDR0_A >=
+ 2 * __pt_cap_get(vmx->pt_desc.caps,
+ PT_CAP_num_address_ranges)))
+ return 1;
+ vmx->pt_desc.guest.addrs[msr_info->index -
+ MSR_IA32_RTIT_ADDR0_A] = data;
+ break;
case MSR_TSC_AUX:
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 51ecd38..cea0ee2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -69,6 +69,7 @@
#include <asm/irq_remapping.h>
#include <asm/mshyperv.h>
#include <asm/hypervisor.h>
+#include <asm/intel_pt.h>
#define CREATE_TRACE_POINTS
#include "trace.h"
@@ -1023,7 +1024,13 @@ bool kvm_rdpmc(struct kvm_vcpu *vcpu)
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
- MSR_IA32_SPEC_CTRL, MSR_IA32_ARCH_CAPABILITIES
+ MSR_IA32_SPEC_CTRL, MSR_IA32_ARCH_CAPABILITIES,
+ MSR_IA32_RTIT_CTL, MSR_IA32_RTIT_STATUS, MSR_IA32_RTIT_CR3_MATCH,
+ MSR_IA32_RTIT_OUTPUT_BASE, MSR_IA32_RTIT_OUTPUT_MASK,
+ MSR_IA32_RTIT_ADDR0_A, MSR_IA32_RTIT_ADDR0_B,
+ MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
+ MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
+ MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
};
static unsigned num_msrs_to_save;
@@ -4592,6 +4599,30 @@ static void kvm_init_msr_list(void)
if (!kvm_x86_ops->rdtscp_supported())
continue;
break;
+ case MSR_IA32_RTIT_CTL:
+ case MSR_IA32_RTIT_STATUS:
+ if (!kvm_x86_ops->pt_supported())
+ continue;
+ break;
+ case MSR_IA32_RTIT_CR3_MATCH:
+ if (!kvm_x86_ops->pt_supported() ||
+ !pt_cap_get(PT_CAP_cr3_filtering))
+ continue;
+ break;
+ case MSR_IA32_RTIT_OUTPUT_BASE:
+ case MSR_IA32_RTIT_OUTPUT_MASK:
+ if (!kvm_x86_ops->pt_supported() ||
+ (!pt_cap_get(PT_CAP_topa_output) &&
+ !pt_cap_get(PT_CAP_single_range_output)))
+ continue;
+ break;
+ case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B: {
+ if (!kvm_x86_ops->pt_supported() ||
+ msrs_to_save[i] - MSR_IA32_RTIT_ADDR0_A >=
+ 2 * pt_cap_get(PT_CAP_num_address_ranges))
+ continue;
+ break;
+ }
default:
break;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 12/13] KVM: x86: Set intercept for Intel PT MSRs read/write
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (11 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 11/13] KVM: x86: Implement Intel Processor Trace MSRs read/write Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
2018-05-03 12:13 ` [PATCH v7 13/13] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
From: Chao Peng <chao.p.peng@linux.intel.com>
Disable intercept Intel PT MSRs only when Intel PT is
enabled in guest. But MSR_IA32_RTIT_CTL will alway be
intercept.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/kvm/vmx.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2a29ab9..c125fb1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -947,6 +947,7 @@ static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
u32 msr, int type);
+static void pt_set_intercept_for_msr(struct vcpu_vmx *vmx, bool flag);
static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -3954,6 +3955,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vmx_rtit_ctl_check(vcpu, data))
return 1;
vmcs_write64(GUEST_IA32_RTIT_CTL, data);
+ pt_set_intercept_for_msr(vmx, !(data & RTIT_CTL_TRACEEN));
vmx->pt_desc.guest.ctl = data;
break;
case MSR_IA32_RTIT_STATUS:
@@ -5761,6 +5763,24 @@ static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
vmx->msr_bitmap_mode = mode;
}
+static void pt_set_intercept_for_msr(struct vcpu_vmx *vmx, bool flag)
+{
+ unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+ u32 i;
+
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_STATUS,
+ MSR_TYPE_RW, flag);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE,
+ MSR_TYPE_RW, flag);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK,
+ MSR_TYPE_RW, flag);
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_CR3_MATCH,
+ MSR_TYPE_RW, flag);
+ for (i = 0; i < vmx->pt_desc.range_cnt; i++)
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_ADDR0_A + i,
+ MSR_TYPE_RW, flag);
+}
+
static bool vmx_get_enable_apicv(struct kvm_vcpu *vcpu)
{
return enable_apicv;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v7 13/13] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest
2018-05-03 12:13 [PATCH v7 00/13] Intel Processor Trace virtualization enabling Luwei Kang
` (12 preceding siblings ...)
2018-05-03 12:13 ` [PATCH v7 12/13] KVM: x86: Set intercept for Intel PT " Luwei Kang
@ 2018-05-03 12:13 ` Luwei Kang
13 siblings, 0 replies; 19+ messages in thread
From: Luwei Kang @ 2018-05-03 12:13 UTC (permalink / raw)
To: kvm
Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
peterz, chao.p.peng, Luwei Kang
Currently, Intel Processor Trace do not support tracing in L1 guest
VMX operation(IA32_VMX_MISC[bit 14] is 0). As mentioned in SDM,
on these type of processors, execution of the VMXON instruction will
clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL
causes a general-protection exception (#GP).
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
arch/x86/kvm/vmx.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c125fb1..1e800d0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3952,7 +3952,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_RTIT_CTL:
if ((pt_mode != PT_MODE_HOST_GUEST) ||
- vmx_rtit_ctl_check(vcpu, data))
+ vmx_rtit_ctl_check(vcpu, data) ||
+ vmx->nested.vmxon)
return 1;
vmcs_write64(GUEST_IA32_RTIT_CTL, data);
pt_set_intercept_for_msr(vmx, !(data & RTIT_CTL_TRACEEN));
@@ -8029,6 +8030,12 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
if (ret)
return ret;
+ if (pt_mode == PT_MODE_HOST_GUEST) {
+ vmx->pt_desc.guest.ctl = 0;
+ vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
+ pt_set_intercept_for_msr(vmx, 1);
+ }
+
nested_vmx_succeed(vcpu);
return kvm_skip_emulated_instruction(vcpu);
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 19+ messages in thread