* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
From: Like Xu @ 2021-01-25  2:41 UTC
To: Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
Cc: linux-kernel, Xiexiangyou, Fangyi (Eric), kvm, Wei Wang

+ kvm@vger.kernel.org

Hi Liuxiangdong,

On 2021/1/22 18:02, Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.) wrote:
> Hi Like,
>
> Some questions about
> https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/

Thanks for trying the PEBS feature in the guest,
and I assume you have correctly applied the QEMU patches for guest PEBS.

> 1)Test in IceLake

In the [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM,
DS, DTES64, we only support Ice Lake with the following x86_model(s):

#define INTEL_FAM6_ICELAKE_X 0x6A
#define INTEL_FAM6_ICELAKE_D 0x6C

you can check the eax output of "cpuid -l 1 -1 -r",
for example "0x000606a4" meets this requirement.

> HOST:
> CPU family: 6
> Model: 106
> Model name: Intel(R) Xeon(R) Platinum 8378A CPU $@ $@
> microcode: sig=0x606a6, pf=0x1, revision=0xd000122

As long as you get the latest BIOS from the provider,
you may check 'cat /proc/cpuinfo | grep code | uniq' with the latest one.

> Guest: linux kernel 5.11.0-rc2

I assume it's the "upstream tag v5.11-rc2" which is fine.

> We can find pebs/intel_pt flag in guest cpuinfo, but there still exists
> error when we use perf

Just a note, intel_pt and pebs are two features and we can write
pebs records to intel_pt buffer with extra hardware support.
(by default, pebs records are written to the pebs buffer)

You may check the output of "dmesg | grep PEBS" in the guest
to see if the guest PEBS cpuinfo is exposed and use "perf record
-e cycles:pp" to see if PEBS feature actually works in the guest.

> # perf record -e cycles:pp
> Error:
> cycles:pp: PMU Hardware doesn’t support sampling/overflow-interrupts. Try
> ‘perf stat’
>
> Could you give some advice?

If you have more specific comments or any concerns, just let me know.

> 2)Test in Skylake
> HOST:
> CPU family: 6
> Model: 85
> Model name: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
> microcode : 0x2000064
> Guest: linux 4.18
>
> we cannot find intel_pt flag in guest cpuinfo because
> cpu_has_vmx_intel_pt() return false.

You may check vmx_pebs_supported().

> SECONDARY_EXEC_PT_USE_GPA/VM_EXIT_CLEAR_IA32_RTIT_CTL/VM_ENTRY_LOAD_IA32_RTIT_CTL
> are both disable.
>
> Is it because microcode is not supported?
>
> And, is there a new microcode which can support these bits? How can we get this?

Currently, this patch set doesn't support guest PEBS on the Skylake
platforms, and if we choose to support it, we will let you know.

---
thx,likexu

> Thanks,
> Liuxiangdong
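
A side note on the model check discussed in the message above: the eax value of CPUID leaf 1 packs family, model and stepping, and the display model for family 6 combines the extended-model and base-model fields. The following is only a minimal sketch of that decoding, not code from the patch set; the struct and function names (cpu_sig, decode_cpuid_eax, is_supported_icelake) are made up for illustration, and only the two model numbers come from the thread.

```c
#include <stdint.h>

/* Model numbers quoted in the thread (from the kernel's intel-family.h). */
#define INTEL_FAM6_ICELAKE_X 0x6A
#define INTEL_FAM6_ICELAKE_D 0x6C

/* Hypothetical helper type: decoded CPUID.01H:EAX signature. */
struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode family/model/stepping from the raw eax of CPUID leaf 1. */
static struct cpu_sig decode_cpuid_eax(uint32_t eax)
{
    struct cpu_sig s;
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_model   = (eax >> 16) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;

    s.stepping = eax & 0xf;

    /* The extended family is only added when the base family is 0xF. */
    s.family = base_family;
    if (base_family == 0xf)
        s.family += ext_family;

    /* The extended model is only used for base family 0x6 or 0xF. */
    s.model = base_model;
    if (base_family == 0x6 || base_family == 0xf)
        s.model |= ext_model << 4;

    return s;
}

/* Hypothetical check mirroring the "Ice Lake only" restriction above. */
static int is_supported_icelake(uint32_t eax)
{
    struct cpu_sig s = decode_cpuid_eax(eax);

    return s.family == 6 &&
           (s.model == INTEL_FAM6_ICELAKE_X ||
            s.model == INTEL_FAM6_ICELAKE_D);
}
```

With this decoding, both 0x000606a4 and 0x000606a6 come out as family 6, model 0x6A (decimal 106, matching "Model: 106" in /proc/cpuinfo), differing only in stepping.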
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
From: Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.) @ 2021-01-25 14:47 UTC
To: Like Xu, kvm, Wei Wang
Cc: linux-kernel, Xiexiangyou

Thanks for replying,

On 2021/1/25 10:41, Like Xu wrote:
> + kvm@vger.kernel.org
>
> Hi Liuxiangdong,
>
> On 2021/1/22 18:02, Liuxiangdong (Aven, Cloud Infrastructure Service
> Product Dept.) wrote:
>> Hi Like,
>>
>> Some questions about
>> https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
>
> Thanks for trying the PEBS feature in the guest,
> and I assume you have correctly applied the QEMU patches for guest PEBS.

Is there any other patch that needs to be applied? I use qemu 5.2.0.
(downloaded from github on January 14th)

>> 1)Test in IceLake
>
> In the [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM,
> DS, DTES64, we only support Ice Lake with the following x86_model(s):
>
> #define INTEL_FAM6_ICELAKE_X 0x6A
> #define INTEL_FAM6_ICELAKE_D 0x6C
>
> you can check the eax output of "cpuid -l 1 -1 -r",
> for example "0x000606a4" meets this requirement.

It's INTEL_FAM6_ICELAKE_X

cpuid -l 1 -1 -r

CPU:
   0x00000001 0x00: eax=0x000606a6 ebx=0xb4800800 ecx=0x7ffefbf7 edx=0xbfebfbff

>> HOST:
>> CPU family: 6
>> Model: 106
>> Model name: Intel(R) Xeon(R) Platinum 8378A CPU $@ $@
>> microcode: sig=0x606a6, pf=0x1, revision=0xd000122
>
> As long as you get the latest BIOS from the provider,
> you may check 'cat /proc/cpuinfo | grep code | uniq' with the latest one.

OK. I'll do it later.

>> Guest: linux kernel 5.11.0-rc2
>
> I assume it's the "upstream tag v5.11-rc2" which is fine.

Yes.

>> We can find pebs/intel_pt flag in guest cpuinfo, but there still
>> exists error when we use perf
>
> Just a note, intel_pt and pebs are two features and we can write
> pebs records to intel_pt buffer with extra hardware support.
> (by default, pebs records are written to the pebs buffer)
>
> You may check the output of "dmesg | grep PEBS" in the guest
> to see if the guest PEBS cpuinfo is exposed and use "perf record
> -e cycles:pp" to see if PEBS feature actually works in the guest.

I applied only the pebs patch set to linux kernel 5.11.0-rc2, tested perf in
the guest and dumped the stack where it returns -EOPNOTSUPP

(1)
# perf record -e instructions:pp
Error:
instructions:pp: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

[ 117.793266] Call Trace:
[ 117.793270] dump_stack+0x57/0x6a
[ 117.793275] intel_pmu_setup_lbr_filter+0x137/0x190
[ 117.793280] intel_pmu_hw_config+0x18b/0x320
[ 117.793288] hsw_hw_config+0xe/0xa0
[ 117.793290] x86_pmu_event_init+0x8e/0x210
[ 117.793293] perf_try_init_event+0x40/0x130
[ 117.793297] perf_event_alloc.part.22+0x611/0xde0
[ 117.793299] ? alloc_fd+0xba/0x180
[ 117.793302] __do_sys_perf_event_open+0x1bd/0xd90
[ 117.793305] do_syscall_64+0x33/0x40
[ 117.793308] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Do we need lbr when we use pebs?

I tried to apply the lbr patch set
(https://lore.kernel.org/kvm/911adb63-ba05-ea93-c038-1c09cff15eda@intel.com/)
to kernel and qemu, but there is still another problem.
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event
...

(2)
# perf record -e instructions:ppp
Error:
instructions:ppp: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

[ 115.188498] Call Trace:
[ 115.188503] dump_stack+0x57/0x6a
[ 115.188509] x86_pmu_hw_config+0x1eb/0x220
[ 115.188515] intel_pmu_hw_config+0x13/0x320
[ 115.188519] hsw_hw_config+0xe/0xa0
[ 115.188521] x86_pmu_event_init+0x8e/0x210
[ 115.188524] perf_try_init_event+0x40/0x130
[ 115.188528] perf_event_alloc.part.22+0x611/0xde0
[ 115.188530] ? alloc_fd+0xba/0x180
[ 115.188534] __do_sys_perf_event_open+0x1bd/0xd90
[ 115.188538] do_syscall_64+0x33/0x40
[ 115.188541] entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is because x86_pmu.intel_cap.pebs_format is always 0 in
x86_pmu_max_precise().

We rdmsr MSR_IA32_PERF_CAPABILITIES(0x00000345) from HOST, it's f4c5.
From guest, it's 2000

>> # perf record -e cycles:pp
>> Error:
>> cycles:pp: PMU Hardware doesn’t support sampling/overflow-interrupts.
>> Try ‘perf stat’
>>
>> Could you give some advice?
>
> If you have more specific comments or any concerns, just let me know.
>
>> 2)Test in Skylake
>> HOST:
>> CPU family: 6
>> Model: 85
>> Model name: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
>> microcode : 0x2000064
>> Guest: linux 4.18
>>
>> we cannot find intel_pt flag in guest cpuinfo because
>> cpu_has_vmx_intel_pt() return false.
>
> You may check vmx_pebs_supported().

It's true.

>> SECONDARY_EXEC_PT_USE_GPA/VM_EXIT_CLEAR_IA32_RTIT_CTL/VM_ENTRY_LOAD_IA32_RTIT_CTL
>> are both disable.
>>
>> Is it because microcode is not supported?
>>
>> And, is there a new microcode which can support these bits? How can we
>> get this?
>
> Currently, this patch set doesn't support guest PEBS on the Skylake
> platforms, and if we choose to support it, we will let you know.

And now, we want to use pebs in skylake. If we develop based on the pebs
patch set, do you have any suggestions?

I think microcode requirements need to be satisfied. Can we use
https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files ?

> ---
> thx,likexu
>
>> Thanks,
>> Liuxiangdong

Thanks.
Liuxiangdong
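
To make the two MSR readings reported above easier to compare, here is a small sketch that splits an IA32_PERF_CAPABILITIES value into the PEBS-related fields. It is an illustration only, not code from the kernel or QEMU: the struct and function names are hypothetical, and the bit positions are the ones listed in the QEMU patch description later in this thread (PEBSTrap bit 6, PEBSArchRegs bit 7, PEBS_FMT bits 11:8, full-width-write bit 13, PEBS_BASELINE bit 14).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the PEBS-related IA32_PERF_CAPABILITIES fields. */
struct perf_cap_fields {
    bool     pebs_trap;        /* bit 6 */
    bool     pebs_arch_reg;    /* bit 7 */
    unsigned pebs_fmt;         /* bits 11:8 */
    bool     full_width_write; /* bit 13 */
    bool     pebs_baseline;    /* bit 14 */
};

/* Split a raw MSR value (as printed by rdmsr 0x345) into fields. */
static struct perf_cap_fields decode_perf_cap(uint64_t val)
{
    struct perf_cap_fields f;

    f.pebs_trap        = (val >> 6) & 1;
    f.pebs_arch_reg    = (val >> 7) & 1;
    f.pebs_fmt         = (unsigned)((val >> 8) & 0xf);
    f.full_width_write = (val >> 13) & 1;
    f.pebs_baseline    = (val >> 14) & 1;

    return f;
}
```

Under these assumptions the host value 0xf4c5 decodes to PEBS format 4 with the trap, arch-reg and baseline bits set, while the guest value 0x2000 has only full-width-write set and a PEBS format of 0, which is consistent with the observation that pebs_format is 0 in the guest.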
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
From: Xu, Like @ 2021-01-26  7:08 UTC
To: Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
Cc: linux-kernel, Xiexiangyou, Wei Wang, kvm, Like Xu, Fangyi (Eric)

[-- Attachment #1: Type: text/plain, Size: 6844 bytes --]

On 2021/1/25 22:47, Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.) wrote:
> Thanks for replying,
>
> On 2021/1/25 10:41, Like Xu wrote:
>> + kvm@vger.kernel.org
>>
>> Hi Liuxiangdong,
>>
>> On 2021/1/22 18:02, Liuxiangdong (Aven, Cloud Infrastructure Service
>> Product Dept.) wrote:
>>> Hi Like,
>>>
>>> Some questions about
>>> https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
>>
>> Thanks for trying the PEBS feature in the guest,
>> and I assume you have correctly applied the QEMU patches for guest PEBS.
>
> Is there any other patch that needs to be applied? I use qemu 5.2.0.
> (downloaded from github on January 14th)

Two qemu patches are attached against qemu tree
(commit 31ee895047bdcf7387e3570cbd2a473c6f744b08)
and then run the guest with "-cpu,pebs=true".

Note, these two patches are just for test and not finalized for qemu upstream.

>>> 1)Test in IceLake
>>
>> In the [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM,
>> DS, DTES64, we only support Ice Lake with the following x86_model(s):
>>
>> #define INTEL_FAM6_ICELAKE_X 0x6A
>> #define INTEL_FAM6_ICELAKE_D 0x6C
>>
>> you can check the eax output of "cpuid -l 1 -1 -r",
>> for example "0x000606a4" meets this requirement.
> It's INTEL_FAM6_ICELAKE_X

Yes, it's the target hardware.

> cpuid -l 1 -1 -r
>
> CPU:
>    0x00000001 0x00: eax=0x000606a6 ebx=0xb4800800 ecx=0x7ffefbf7
> edx=0xbfebfbff
>
>>> HOST:
>>> CPU family: 6
>>> Model: 106
>>> Model name: Intel(R) Xeon(R) Platinum 8378A CPU $@ $@
>>> microcode: sig=0x606a6, pf=0x1, revision=0xd000122
>>
>> As long as you get the latest BIOS from the provider,
>> you may check 'cat /proc/cpuinfo | grep code | uniq' with the latest one.
> OK. I'll do it later.
>>
>>> Guest: linux kernel 5.11.0-rc2
>>
>> I assume it's the "upstream tag v5.11-rc2" which is fine.
> Yes.
>>
>>> We can find pebs/intel_pt flag in guest cpuinfo, but there still exists
>>> error when we use perf
>>
>> Just a note, intel_pt and pebs are two features and we can write
>> pebs records to intel_pt buffer with extra hardware support.
>> (by default, pebs records are written to the pebs buffer)
>>
>> You may check the output of "dmesg | grep PEBS" in the guest
>> to see if the guest PEBS cpuinfo is exposed and use "perf record
>> -e cycles:pp" to see if PEBS feature actually works in the guest.
>
> I applied only the pebs patch set to linux kernel 5.11.0-rc2, tested perf in
> the guest and dumped the stack where it returns -EOPNOTSUPP

Yes, you may apply the qemu patches and try it again.

> (1)
> # perf record -e instructions:pp
> Error:
> instructions:pp: PMU Hardware doesn't support
> sampling/overflow-interrupts. Try 'perf stat'
>
> [ 117.793266] Call Trace:
> [ 117.793270] dump_stack+0x57/0x6a
> [ 117.793275] intel_pmu_setup_lbr_filter+0x137/0x190
> [ 117.793280] intel_pmu_hw_config+0x18b/0x320
> [ 117.793288] hsw_hw_config+0xe/0xa0
> [ 117.793290] x86_pmu_event_init+0x8e/0x210
> [ 117.793293] perf_try_init_event+0x40/0x130
> [ 117.793297] perf_event_alloc.part.22+0x611/0xde0
> [ 117.793299] ? alloc_fd+0xba/0x180
> [ 117.793302] __do_sys_perf_event_open+0x1bd/0xd90
> [ 117.793305] do_syscall_64+0x33/0x40
> [ 117.793308] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Do we need lbr when we use pebs?

No, lbr and pebs are two features and we enable them separately.

> I tried to apply the lbr patch set
> (https://lore.kernel.org/kvm/911adb63-ba05-ea93-c038-1c09cff15eda@intel.com/)
> to kernel and qemu, but there is still another problem.
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
> event
> ...

We don't need that patch for PEBS feature.

> (2)
> # perf record -e instructions:ppp
> Error:
> instructions:ppp: PMU Hardware doesn't support
> sampling/overflow-interrupts. Try 'perf stat'
>
> [ 115.188498] Call Trace:
> [ 115.188503] dump_stack+0x57/0x6a
> [ 115.188509] x86_pmu_hw_config+0x1eb/0x220
> [ 115.188515] intel_pmu_hw_config+0x13/0x320
> [ 115.188519] hsw_hw_config+0xe/0xa0
> [ 115.188521] x86_pmu_event_init+0x8e/0x210
> [ 115.188524] perf_try_init_event+0x40/0x130
> [ 115.188528] perf_event_alloc.part.22+0x611/0xde0
> [ 115.188530] ? alloc_fd+0xba/0x180
> [ 115.188534] __do_sys_perf_event_open+0x1bd/0xd90
> [ 115.188538] do_syscall_64+0x33/0x40
> [ 115.188541] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> This is because x86_pmu.intel_cap.pebs_format is always 0 in
> x86_pmu_max_precise().
>
> We rdmsr MSR_IA32_PERF_CAPABILITIES(0x00000345) from HOST, it's f4c5.
> From guest, it's 2000
>
>>> # perf record -e cycles:pp
>>> Error:
>>> cycles:pp: PMU Hardware doesn’t support sampling/overflow-interrupts.
>>> Try ‘perf stat’
>>>
>>> Could you give some advice?
>>
>> If you have more specific comments or any concerns, just let me know.
>>
>>> 2)Test in Skylake
>>> HOST:
>>> CPU family: 6
>>> Model: 85
>>> Model name: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
>>> microcode : 0x2000064
>>> Guest: linux 4.18
>>>
>>> we cannot find intel_pt flag in guest cpuinfo because
>>> cpu_has_vmx_intel_pt() return false.
>>
>> You may check vmx_pebs_supported().
> It's true.
>>
>>> SECONDARY_EXEC_PT_USE_GPA/VM_EXIT_CLEAR_IA32_RTIT_CTL/VM_ENTRY_LOAD_IA32_RTIT_CTL
>>> are both disable.
>>>
>>> Is it because microcode is not supported?
>>>
>>> And, is there a new microcode which can support these bits? How can we
>>> get this?
>>
>> Currently, this patch set doesn't support guest PEBS on the Skylake
>> platforms, and if we choose to support it, we will let you know.
>>
> And now, we want to use pebs in skylake. If we develop based on the pebs
> patch set, do you have any suggestions?

- At least you need to pin guest memory such as "-overcommit mem-lock=true"
  for qemu
- You may rewrite the patches 13 - 17 for Skylake specifically because the
  record format is different from Ice Lake.

> I think microcode requirements need to be satisfied. Can we use
> https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files ?

You may try it at your own risk and again,
this patch set doesn't support guest PEBS on the Skylake platforms currently.

>> ---
>> thx,likexu
>>
>>> Thanks,
>>> Liuxiangdong
>
> Thanks. Liuxiangdong

[-- Attachment #2: 0001-target-i386-Expose-PEBS-capabilities-in-the-FEAT_PER.patch --]
[-- Type: text/plain, Size: 1742 bytes --]

From 24a04b800d24e3b493e5094f88649402923147a2 Mon Sep 17 00:00:00 2001
From: Like Xu <like.xu@linux.intel.com>
Date: Fri, 4 Sep 2020 10:19:27 +0800
Subject: [PATCH 1/2] target/i386: Expose PEBS capabilities in the FEAT_PERF_CAPABILITIES

The IA32_PERF_CAPABILITIES MSR provides enumeration of a variety of
PEBS feature interfaces:

- PEBSTrap[6]: Trap/Fault-like indicator of PEBS recording assist;
- PEBSArchRegs[7]: Indicator of PEBS assist save architectural registers;
- PEBS_FMT[bits 11:8]: Specifies the encoding of the layout of PEBS records;
- PEBS_BASELINE [bit 14]: If set, the following is true:
  (1) Extended PEBS is supported. All counters support the PEBS facility,
      and all events can generate PEBS records when PEBS is enabled.
  (2) Adaptive PEBS is supported. The PEBS_DATA_CFG MSR and adaptive
      record enable bits are supported.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 target/i386/cpu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 72a79e6019..14262c7bf7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1136,9 +1136,9 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         .type = MSR_FEATURE_WORD,
         .feat_names = {
             NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
-            NULL, "full-width-write", NULL, NULL,
+            NULL, NULL, "pebs-trap", "pebs-arch-reg",
+            "pebs-fmt-0", "pebs-fmt-1", "pebs-fmt-2", "pebs-fmt-3",
+            NULL, "full-width-write", "pebs-baseline", NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
-- 
2.29.2

[-- Attachment #3: 0002-target-i386-add-cpu-pebs-true-support-to-enable-gues.patch --]
[-- Type: text/plain, Size: 5489 bytes --]

From be5246694aaf2132396ee0b907e679f5c9ccd089 Mon Sep 17 00:00:00 2001
From: Like Xu <like.xu@linux.intel.com>
Date: Fri, 4 Sep 2020 10:42:28 +0800
Subject: [PATCH 2/2] target/i386: add -cpu,pebs=true support to enable guest PEBS

The PEBS feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported and,
- the supported returned value for PEBS from this msr is not zero.

The PEBS feature would be disabled on the guest if:
- the msr-based-feature IA32_PERF_CAPABILITIES is unsupported OR,
- qemu set the IA32_PERF_CAPABILITIES msr feature without pebs_fmt values OR,
- the requested guest vcpu model doesn't support PDCM.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 hw/i386/pc.c          |  1 +
 target/i386/cpu.c     | 20 ++++++++++++++++++++
 target/i386/cpu.h     |  7 +++++++
 target/i386/kvm/kvm.c | 10 ++++++++++
 4 files changed, 38 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5458f61d10..8e9c1b7545 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -330,6 +330,7 @@ GlobalProperty pc_compat_1_5[] = {
     { "Nehalem-" TYPE_X86_CPU, "min-level", "2" },
     { "virtio-net-pci", "any_layout", "off" },
     { TYPE_X86_CPU, "pmu", "on" },
+    { TYPE_X86_CPU, "pebs", "on" },
     { "i440FX-pcihost", "short_root_bus", "0" },
     { "q35-pcihost", "short_root_bus", "0" },
 };
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 14262c7bf7..9dffc85542 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4228,6 +4228,12 @@ static bool lmce_supported(void)
     return !!(mce_cap & MCG_LMCE_P);
 }
 
+static inline bool lbr_supported(void)
+{
+    return kvm_enabled() && (kvm_arch_get_supported_msr_feature(kvm_state,
+        MSR_IA32_PERF_CAPABILITIES) & PERF_CAP_PEBS_FORMAT);
+}
+
 #define CPUID_MODEL_ID_SZ 48
 
 /**
@@ -4332,6 +4338,9 @@ static void max_x86_cpu_initfn(Object *obj)
     }
 
     object_property_set_bool(OBJECT(cpu), "pmu", true, &error_abort);
+    if (lbr_supported()) {
+        object_property_set_bool(OBJECT(cpu), "pebs", true, &error_abort);
+    }
 }
 
 static const TypeInfo max_x86_cpu_type_info = {
@@ -5545,6 +5554,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         if (!cpu->enable_pmu) {
             *ecx &= ~CPUID_EXT_PDCM;
+            if (cpu->enable_pebs) {
+                warn_report("PEBS is unsupported since guest PMU is disabled.");
+                exit(1);
+            }
         }
         break;
     case 2:
@@ -6610,6 +6623,12 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         }
     }
 
+    if (!cpu->max_features && cpu->enable_pebs &&
+        !(env->features[FEAT_1_ECX] & CPUID_EXT_PDCM)) {
+        warn_report("requested vcpu model doesn't support PDCM for PEBS.");
+        exit(1);
+    }
+
     if (cpu->ucode_rev == 0) {
         /* The default is the same as KVM's.  */
         if (IS_AMD_CPU(env)) {
@@ -7192,6 +7211,7 @@ static Property x86_cpu_properties[] = {
 #endif
     DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
     DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+    DEFINE_PROP_BOOL("pebs", X86CPU, enable_pebs, false),
 
     DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
                        HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d23a5b340a..eac8d8c68e 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,12 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES 0x345
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+    PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
 
 #define MSR_IA32_TSX_CTRL 0x122
 #define MSR_IA32_TSCDEADLINE 0x6e0
@@ -1708,6 +1714,7 @@ struct X86CPU {
      * capabilities) directly to the guest.
      */
     bool enable_pmu;
+    bool enable_pebs;
 
     /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
      * disabled by default to avoid breaking migration between QEMU with
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6dc1ee052d..8fe1d2feea 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2705,6 +2705,13 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
                                            MSR_IA32_PERF_CAPABILITIES);
 
     if (kvm_perf_cap) {
+        if (!cpu->enable_pebs) {
+            kvm_perf_cap &= ~PERF_CAP_PEBS_MASK;
+        }
+        if (!(kvm_perf_cap & PERF_CAP_PEBS_MASK) && cpu->enable_pebs) {
+            warn_report("MSR_IA32_PERF_CAPABILITIES reported by KVM does not support PEBS.");
+            exit(1);
+        }
         kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
                           kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
     }
@@ -2744,6 +2751,9 @@ static void kvm_init_msrs(X86CPU *cpu)
 
     if (has_msr_perf_capabs && cpu->enable_pmu) {
         kvm_msr_entry_add_perf(cpu, env->features);
+    } else if (!has_msr_perf_capabs && cpu->enable_pebs) {
+        warn_report("KVM doesn't support MSR_IA32_PERF_CAPABILITIES for PEBS.");
+        exit(1);
     }
 
     if (has_msr_ucode_rev) {
-- 
2.29.2
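
The gating done in the kvm_msr_entry_add_perf() hunk of the second patch can be tried out in isolation. The sketch below is a standalone restatement under stated assumptions, not the QEMU code itself: guest_perf_cap() is a hypothetical helper, it returns -1 where the patch calls warn_report() and exit(1), and the mask constants are re-declared locally with the same bit positions as the PERF_CAP_* definitions in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as the PERF_CAP_* macros added to target/i386/cpu.h. */
#define PERF_CAP_PEBS_TRAP     (1ULL << 6)
#define PERF_CAP_ARCH_REG      (1ULL << 7)
#define PERF_CAP_PEBS_FORMAT   0xf00ULL
#define PERF_CAP_PEBS_BASELINE (1ULL << 14)
#define PERF_CAP_PEBS_MASK     (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
                                PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)

/*
 * Hypothetical standalone version of the gating logic.
 * kvm_cap:     capability value reported as supported by KVM
 * feat_mask:   bits enabled in the FEAT_PERF_CAPABILITIES feature word
 * enable_pebs: value of the "pebs" CPU property
 * Returns 0 and stores the guest-visible value in *out, or -1 where the
 * patch would report an error and exit.
 */
static int guest_perf_cap(uint64_t kvm_cap, uint64_t feat_mask,
                          bool enable_pebs, uint64_t *out)
{
    /* Without pebs=on, all PEBS-related bits are hidden from the guest. */
    if (!enable_pebs)
        kvm_cap &= ~PERF_CAP_PEBS_MASK;

    /* pebs=on was requested but KVM exposes no PEBS capability bits. */
    if (enable_pebs && !(kvm_cap & PERF_CAP_PEBS_MASK))
        return -1;

    *out = kvm_cap & feat_mask;
    return 0;
}
```

For example, taking the host value 0xf4c5 from earlier in the thread as the KVM-supported value purely for illustration, pebs=off would leave 0xb005 visible to the guest, while pebs=on with a KVM value of 0x2000 hits the error path.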
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
From: Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.) @ 2021-01-29  2:52 UTC
To: Xu, Like
Cc: linux-kernel, Xiexiangyou, Wei Wang, kvm, Like Xu, Fangyi (Eric), liuxiangdong5

On 2021/1/26 15:08, Xu, Like wrote:
> On 2021/1/25 22:47, Liuxiangdong (Aven, Cloud Infrastructure Service
> Product Dept.) wrote:
>> Thanks for replying,
>>
>> On 2021/1/25 10:41, Like Xu wrote:
>>> + kvm@vger.kernel.org
>>>
>>> Hi Liuxiangdong,
>>>
>>> On 2021/1/22 18:02, Liuxiangdong (Aven, Cloud Infrastructure Service
>>> Product Dept.) wrote:
>>>> Hi Like,
>>>>
>>>> Some questions about
>>>> https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
>>>>
>>> Thanks for trying the PEBS feature in the guest,
>>> and I assume you have correctly applied the QEMU patches for guest PEBS.
>>>
>> Is there any other patch that needs to be applied? I use qemu 5.2.0.
>> (downloaded from github on January 14th)
> Two qemu patches are attached against qemu tree
> (commit 31ee895047bdcf7387e3570cbd2a473c6f744b08)
> and then run the guest with "-cpu,pebs=true".
>
> Note, these two patches are just for test and not finalized for qemu upstream.

Yes, we can use pebs in IceLake when the qemu patches are applied.
Thanks very much!

>>>> 1)Test in IceLake
>>> In the [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM,
>>> DS, DTES64, we only support Ice Lake with the following x86_model(s):
>>>
>>> #define INTEL_FAM6_ICELAKE_X 0x6A
>>> #define INTEL_FAM6_ICELAKE_D 0x6C
>>>
>>> you can check the eax output of "cpuid -l 1 -1 -r",
>>> for example "0x000606a4" meets this requirement.
>> It's INTEL_FAM6_ICELAKE_X
> Yes, it's the target hardware.
>
>> cpuid -l 1 -1 -r
>>
>> CPU:
>>    0x00000001 0x00: eax=0x000606a6 ebx=0xb4800800 ecx=0x7ffefbf7
>> edx=0xbfebfbff
>>
>>>> HOST:
>>>> CPU family: 6
>>>> Model: 106
>>>> Model name: Intel(R) Xeon(R) Platinum 8378A CPU $@ $@
>>>> microcode: sig=0x606a6, pf=0x1, revision=0xd000122
>>> As long as you get the latest BIOS from the provider,
>>> you may check 'cat /proc/cpuinfo | grep code | uniq' with the latest one.
>> OK. I'll do it later.
>>>> Guest: linux kernel 5.11.0-rc2
>>> I assume it's the "upstream tag v5.11-rc2" which is fine.
>> Yes.
>>>> We can find pebs/intel_pt flag in guest cpuinfo, but there still exists
>>>> error when we use perf
>>> Just a note, intel_pt and pebs are two features and we can write
>>> pebs records to intel_pt buffer with extra hardware support.
>>> (by default, pebs records are written to the pebs buffer)
>>>
>>> You may check the output of "dmesg | grep PEBS" in the guest
>>> to see if the guest PEBS cpuinfo is exposed and use "perf record
>>> -e cycles:pp" to see if PEBS feature actually works in the guest.
>> I applied only the pebs patch set to linux kernel 5.11.0-rc2, tested perf in
>> the guest and dumped the stack where it returns -EOPNOTSUPP
> Yes, you may apply the qemu patches and try it again.
>
>> (1)
>> # perf record -e instructions:pp
>> Error:
>> instructions:pp: PMU Hardware doesn't support
>> sampling/overflow-interrupts. Try 'perf stat'
>>
>> [ 117.793266] Call Trace:
>> [ 117.793270] dump_stack+0x57/0x6a
>> [ 117.793275] intel_pmu_setup_lbr_filter+0x137/0x190
>> [ 117.793280] intel_pmu_hw_config+0x18b/0x320
>> [ 117.793288] hsw_hw_config+0xe/0xa0
>> [ 117.793290] x86_pmu_event_init+0x8e/0x210
>> [ 117.793293] perf_try_init_event+0x40/0x130
>> [ 117.793297] perf_event_alloc.part.22+0x611/0xde0
>> [ 117.793299] ? alloc_fd+0xba/0x180
>> [ 117.793302] __do_sys_perf_event_open+0x1bd/0xd90
>> [ 117.793305] do_syscall_64+0x33/0x40
>> [ 117.793308] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Do we need lbr when we use pebs?
> No, lbr and pebs are two features and we enable them separately.
>
>> I tried to apply the lbr patch set
>> (https://lore.kernel.org/kvm/911adb63-ba05-ea93-c038-1c09cff15eda@intel.com/)
>> to kernel and qemu, but there is still another problem.
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
>> event
>> ...
> We don't need that patch for PEBS feature.
>
>> (2)
>> # perf record -e instructions:ppp
>> Error:
>> instructions:ppp: PMU Hardware doesn't support
>> sampling/overflow-interrupts. Try 'perf stat'
>>
>> [ 115.188498] Call Trace:
>> [ 115.188503] dump_stack+0x57/0x6a
>> [ 115.188509] x86_pmu_hw_config+0x1eb/0x220
>> [ 115.188515] intel_pmu_hw_config+0x13/0x320
>> [ 115.188519] hsw_hw_config+0xe/0xa0
>> [ 115.188521] x86_pmu_event_init+0x8e/0x210
>> [ 115.188524] perf_try_init_event+0x40/0x130
>> [ 115.188528] perf_event_alloc.part.22+0x611/0xde0
>> [ 115.188530] ? alloc_fd+0xba/0x180
>> [ 115.188534] __do_sys_perf_event_open+0x1bd/0xd90
>> [ 115.188538] do_syscall_64+0x33/0x40
>> [ 115.188541] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> This is because x86_pmu.intel_cap.pebs_format is always 0 in
>> x86_pmu_max_precise().
>>
>> We rdmsr MSR_IA32_PERF_CAPABILITIES(0x00000345) from HOST, it's f4c5.
>> From guest, it's 2000
>>
>>>> # perf record -e cycles:pp
>>>> Error:
>>>> cycles:pp: PMU Hardware doesn’t support sampling/overflow-interrupts.
>>>> Try ‘perf stat’
>>>>
>>>> Could you give some advice?
>>> If you have more specific comments or any concerns, just let me know.
>>>
>>>> 2)Test in Skylake
>>>> HOST:
>>>> CPU family: 6
>>>> Model: 85
>>>> Model name: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
>>>> microcode : 0x2000064
>>>> Guest: linux 4.18
>>>>
>>>> we cannot find intel_pt flag in guest cpuinfo because
>>>> cpu_has_vmx_intel_pt() return false.
>>> You may check vmx_pebs_supported().
>> It's true.
>>>> SECONDARY_EXEC_PT_USE_GPA/VM_EXIT_CLEAR_IA32_RTIT_CTL/VM_ENTRY_LOAD_IA32_RTIT_CTL
>>>> are both disable.
>>>>
>>>> Is it because microcode is not supported?
>>>>
>>>> And, is there a new microcode which can support these bits? How can we
>>>> get this?
>>> Currently, this patch set doesn't support guest PEBS on the Skylake
>>> platforms, and if we choose to support it, we will let you know.
>>>
>> And now, we want to use pebs in skylake. If we develop based on the pebs
>> patch set, do you have any suggestions?
> - At least you need to pin guest memory such as "-overcommit mem-lock=true"
>   for qemu
> - You may rewrite the patches 13 - 17 for Skylake specifically because the
>   record format is different from Ice Lake.

OK. So, is there anything else we need to pay attention to besides the
record format when this is used for Skylake?

>> I think microcode requirements need to be satisfied. Can we use
>> https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files ?
> You may try it at your own risk and again,
> this patch set doesn't support guest PEBS on the Skylake platforms currently.
>
>>> ---
>>> thx,likexu
>>>
>>>> Thanks,
>>>> Liuxiangdong
>>>>
>> Thanks. Liuxiangdong
>>
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
From: Xu, Like @ 2021-02-01  8:43 UTC
To: Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.)
Cc: linux-kernel, Xiexiangyou, Wei Wang, kvm, Like Xu, Fangyi (Eric)

On 2021/1/29 10:52, Liuxiangdong (Aven, Cloud Infrastructure Service Product Dept.) wrote:
>
> On 2021/1/26 15:08, Xu, Like wrote:
>> On 2021/1/25 22:47, Liuxiangdong (Aven, Cloud Infrastructure Service
>> Product Dept.) wrote:
>>> Thanks for replying,
>>>
>>> On 2021/1/25 10:41, Like Xu wrote:
>>>> + kvm@vger.kernel.org
>>>>
>>>> Hi Liuxiangdong,
>>>>
>>>> On 2021/1/22 18:02, Liuxiangdong (Aven, Cloud Infrastructure Service
>>>> Product Dept.) wrote:
>>>>> Hi Like,
>>>>>
>>>>> Some questions about
>>>>> https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
>>>>>
>>>> Thanks for trying the PEBS feature in the guest,
>>>> and I assume you have correctly applied the QEMU patches for guest PEBS.
>>>>
>>> Is there any other patch that needs to be applied? I use qemu 5.2.0.
>>> (downloaded from github on January 14th)
>> Two qemu patches are attached against qemu tree
>> (commit 31ee895047bdcf7387e3570cbd2a473c6f744b08)
>> and then run the guest with "-cpu,pebs=true".
>>
>> Note, these two patches are just for test and not finalized for qemu upstream.
> Yes, we can use pebs in IceLake when the qemu patches are applied.
> Thanks very much!

Thanks for your verification on this earlier version.

>>>>> 1)Test in IceLake
>>>> In the [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM,
>>>> DS, DTES64, we only support Ice Lake with the following x86_model(s):
>>>>
>>>> #define INTEL_FAM6_ICELAKE_X 0x6A
>>>> #define INTEL_FAM6_ICELAKE_D 0x6C
>>>>
>>>> you can check the eax output of "cpuid -l 1 -1 -r",
>>>> for example "0x000606a4" meets this requirement.
>>> It's INTEL_FAM6_ICELAKE_X
>> Yes, it's the target hardware.
>>
>>> cpuid -l 1 -1 -r
>>>
>>> CPU:
>>>    0x00000001 0x00: eax=0x000606a6 ebx=0xb4800800 ecx=0x7ffefbf7
>>> edx=0xbfebfbff
>>>
>>>>> HOST:
>>>>> CPU family: 6
>>>>> Model: 106
>>>>> Model name: Intel(R) Xeon(R) Platinum 8378A CPU $@ $@
>>>>> microcode: sig=0x606a6, pf=0x1, revision=0xd000122
>>>> As long as you get the latest BIOS from the provider,
>>>> you may check 'cat /proc/cpuinfo | grep code | uniq' with the latest one.
>>> OK. I'll do it later.
>>>>> Guest: linux kernel 5.11.0-rc2
>>>> I assume it's the "upstream tag v5.11-rc2" which is fine.
>>> Yes.
>>>>> We can find pebs/intel_pt flag in guest cpuinfo, but there still exists
>>>>> error when we use perf
>>>> Just a note, intel_pt and pebs are two features and we can write
>>>> pebs records to intel_pt buffer with extra hardware support.
>>>> (by default, pebs records are written to the pebs buffer)
>>>>
>>>> You may check the output of "dmesg | grep PEBS" in the guest
>>>> to see if the guest PEBS cpuinfo is exposed and use "perf record
>>>> -e cycles:pp" to see if PEBS feature actually works in the guest.
>>> I applied only the pebs patch set to linux kernel 5.11.0-rc2, tested perf in
>>> the guest and dumped the stack where it returns -EOPNOTSUPP
>> Yes, you may apply the qemu patches and try it again.
>>
>>> (1)
>>> # perf record -e instructions:pp
>>> Error:
>>> instructions:pp: PMU Hardware doesn't support
>>> sampling/overflow-interrupts. Try 'perf stat'
>>>
>>> [ 117.793266] Call Trace:
>>> [ 117.793270] dump_stack+0x57/0x6a
>>> [ 117.793275] intel_pmu_setup_lbr_filter+0x137/0x190
>>> [ 117.793280] intel_pmu_hw_config+0x18b/0x320
>>> [ 117.793288] hsw_hw_config+0xe/0xa0
>>> [ 117.793290] x86_pmu_event_init+0x8e/0x210
>>> [ 117.793293] perf_try_init_event+0x40/0x130
>>> [ 117.793297] perf_event_alloc.part.22+0x611/0xde0
>>> [ 117.793299] ? alloc_fd+0xba/0x180
>>> [ 117.793302] __do_sys_perf_event_open+0x1bd/0xd90
>>> [ 117.793305] do_syscall_64+0x33/0x40
>>> [ 117.793308] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> Do we need lbr when we use pebs?
>> No, lbr and pebs are two features and we enable them separately.
>>
>>> I tried to apply the lbr patch set
>>> (https://lore.kernel.org/kvm/911adb63-ba05-ea93-c038-1c09cff15eda@intel.com/)
>>> to kernel and qemu, but there is still another problem.
>>> Error:
>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
>>> event
>>> ...
>> We don't need that patch for PEBS feature.
>>
>>> (2)
>>> # perf record -e instructions:ppp
>>> Error:
>>> instructions:ppp: PMU Hardware doesn't support
>>> sampling/overflow-interrupts. Try 'perf stat'
>>>
>>> [ 115.188498] Call Trace:
>>> [ 115.188503] dump_stack+0x57/0x6a
>>> [ 115.188509] x86_pmu_hw_config+0x1eb/0x220
>>> [ 115.188515] intel_pmu_hw_config+0x13/0x320
>>> [ 115.188519] hsw_hw_config+0xe/0xa0
>>> [ 115.188521] x86_pmu_event_init+0x8e/0x210
>>> [ 115.188524] perf_try_init_event+0x40/0x130
>>> [ 115.188528] perf_event_alloc.part.22+0x611/0xde0
>>> [ 115.188530] ? alloc_fd+0xba/0x180
>>> [ 115.188534] __do_sys_perf_event_open+0x1bd/0xd90
>>> [ 115.188538] do_syscall_64+0x33/0x40
>>> [ 115.188541] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> This is because x86_pmu.intel_cap.pebs_format is always 0 in
>>> x86_pmu_max_precise().
>>>
>>> We rdmsr MSR_IA32_PERF_CAPABILITIES(0x00000345) from HOST, it's f4c5.
>>> From guest, it's 2000
>>>
>>>>> # perf record -e cycles:pp
>>>>> Error:
>>>>> cycles:pp: PMU Hardware doesn’t support sampling/overflow-interrupts.
>>>>> Try ‘perf stat’
>>>>>
>>>>> Could you give some advice?
>>>> If you have more specific comments or any concerns, just let me know.
>>>>
>>>>> 2)Test in Skylake
>>>>> HOST:
>>>>> CPU family: 6
>>>>> Model: 85
>>>>> Model name: Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
>>>>> microcode : 0x2000064
>>>>> Guest: linux 4.18
>>>>>
>>>>> we cannot find intel_pt flag in guest cpuinfo because
>>>>> cpu_has_vmx_intel_pt() return false.
>>>> You may check vmx_pebs_supported().
>>> It's true.
>>>>> SECONDARY_EXEC_PT_USE_GPA/VM_EXIT_CLEAR_IA32_RTIT_CTL/VM_ENTRY_LOAD_IA32_RTIT_CTL
>>>>> are both disable.
>>>>>
>>>>> Is it because microcode is not supported?
>>>>>
>>>>> And, is there a new microcode which can support these bits? How can we
>>>>> get this?
>>>> Currently, this patch set doesn't support guest PEBS on the Skylake
>>>> platforms, and if we choose to support it, we will let you know.
>>>>
>>> And now, we want to use pebs in skylake. If we develop based on the pebs
>>> patch set, do you have any suggestions?
>> - At least you need to pin guest memory such as "-overcommit mem-lock=true"
>>   for qemu
>> - You may rewrite the patches 13 - 17 for Skylake specifically because the
>>   record format is different from Ice Lake.
> OK. So, is there anything else we need to pay attention to besides the
> record format when this is used for Skylake?

You may need to:
- remove the x86_match_cpu check in vmx_pebs_supported()
- add intel_pmu_handle_guest_pebs() to intel_pmu_drain_pebs_nhm()

I suggest that you pick up the one-one mapping patch from the v1
so that you can avoid patches 13 - 17.

>>> I think microcode requirements need to be satisfied. Can we use
>>> https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files ?
>> You may try it at your risk and again, >> this patch set doesn't support guest PEBS on the Skylake platforms >> currently. >> >>>> --- >>>> thx,likexu >>>> >>>>> Thanks, >>>>> >>>>> Liuxiangdong >>>>> >>> Thanks. Liuxiangdong >>> > ^ permalink raw reply [flat|nested] 18+ messages in thread
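Note on the checks discussed in the message above: the model test and the
pebs_format observation both come down to plain bit-field decoding. The
following is a minimal sketch, not code from the patch set. The function
names are hypothetical; the bit positions are assumed from the CPUID leaf 1
EAX layout and from the IA32_PERF_CAPABILITIES layout as mirrored by Linux's
union perf_capabilities. The input values are the ones quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from the patch set. */

/* Display family from CPUID.01H:EAX. The extended family field is added
 * only when the base family is 0xF. */
unsigned int cpuid_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Display model. The extended model bits become the high nibble when the
 * base family is 6 or 0xF. */
unsigned int cpuid_model(uint32_t eax)
{
	unsigned int base_family = (eax >> 8) & 0xf;
	unsigned int model = (eax >> 4) & 0xf;

	if (base_family == 0x6 || base_family == 0xf)
		model |= ((eax >> 16) & 0xf) << 4;
	return model;
}

/* PEBS record format: bits 11:8 of IA32_PERF_CAPABILITIES (MSR 0x345). */
unsigned int perf_cap_pebs_format(uint64_t caps)
{
	return (caps >> 8) & 0xf;
}

/* PEBS baseline (adaptive PEBS) flag: bit 14 of the same MSR. */
int perf_cap_pebs_baseline(uint64_t caps)
{
	return (caps >> 14) & 0x1;
}

/* Values reported in this thread, run through the helpers above. */
void check_reported_values(void)
{
	assert(cpuid_family(0x000606a6) == 6);
	assert(cpuid_model(0x000606a6) == 0x6A);	/* INTEL_FAM6_ICELAKE_X */
	assert(perf_cap_pebs_format(0xf4c5) == 4);	/* host */
	assert(perf_cap_pebs_format(0x2000) == 0);	/* guest without QEMU patches */
}
```

Read this way, the guest value 0x2000 carries no PEBS format at all, which
matches the report that x86_pmu.intel_cap.pebs_format is 0 in the guest
until the QEMU side exposes the capability.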
* [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
@ 2021-01-04 13:15 Like Xu
  2021-01-14 19:10 ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: Like Xu @ 2021-01-04 13:15 UTC
To: Peter Zijlstra, Paolo Bonzini, eranian, kvm
Cc: Ingo Molnar, Sean Christopherson, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andi Kleen, Kan Liang, wei.w.wang, luwei.kang, linux-kernel

The Precise Event Based Sampling (PEBS) facility on Intel Ice Lake Server
platforms can provide an architectural state of the instruction executed
after the guest instruction that caused the event. This patch set enables
the PEBS via DS feature for KVM guests on the Ice Lake Server.

We can use the PEBS feature on a Linux guest just as on native:

  # perf record -e instructions:ppp ./br_instr a
  # perf record -c 100000 -e instructions:pp ./br_instr a

Guest PEBS is disabled on purpose when the host is using PEBS.
By default, KVM disables the co-existence of guest PEBS and host PEBS.

The patch set can be divided into three parts. The first two parts
enable the basic PEBS via DS feature; they could be considered for
merging, and no regression in host perf is expected.

Compared to the first version, an important change here is the removal
of the forced 1-1 mapping of the virtual to physical PMC. We handle the
cross-mapping issue carefully in part 3, which may address the
artificial competition concern from PeterZ.

In general, there are 2 code paths to emulate the guest PEBS facility.

1) Fast path (part 2, patch 0004-0012)

This is when the host assigned physical PMC has an identical index as
the virtual PMC (e.g. using physical PMC0 to emulate virtual PMC0).
It works as the 1-1 mapping that we did in the first version.

2) Slow path (part 3, patch 0012-0017)

This is when the host assigned physical PMC has a different index
from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0).
In this case, KVM needs to rewrite the PEBS records to change the
applicable counter indexes to the virtual PMC indexes, which would
otherwise contain the physical counter index written by the PEBS facility,
and switch the counter reset values to the offset corresponding to
the physical counter indexes in the DS data structure.

Large PEBS needs to be disabled by KVM rewriting the
pebs_interrupt_threshold field in DS to only one record in
the slow path. This is because a guest may implicitly drain the PEBS
buffer, e.g., on context switch, and KVM doesn't get a chance to update
the PEBS buffer. The physical PMC index will confuse the guest. The
difficulty comes when multiple events get rescheduled inside the guest.
Hence disabling large PEBS in this case might be an easy and safe way to
keep it correct as an initial step here.

We don't expect this change to break any guest code, which can
generally tolerate earlier PMIs. In the fast path with 1:1 mapping
this is not needed.

The rewriting work is performed before delivering a vPMI to the guest to
notify the guest to read the record (before entering the guest, where
interrupts have been disabled so no counter reschedule would happen
at that point on the host).

For the DS area virtualization, the PEBS hardware is registered with the
guest virtual address (gva) of the guest DS memory. In the past, the
difficulty was that the host needed to pin the guest DS memory, as the
page fault caused by the PEBS hardware can't be fixed. This isn't needed
from ICX on, thanks to the hardware support.

KVM rewriting the guest DS area needs to walk the guest page tables to
translate gva to host virtual address (hva). To reduce the translation
overhead, we cache the translation the first time the DS memory is
rewritten. The cached translation is valid for KVM to use until the guest
disables PEBS (VMExits to KVM), which means the guest may re-allocate
the PEBS buffer next time and KVM needs to re-walk the guest
page tables to update the cached translation.

In summary, this patch set enables the guest PEBS to retrieve the correct
information from its own PEBS records on the Ice Lake server platforms
when the host is not using the PEBS facility at the same time. And we
expect it to work when migrating to another Ice Lake.

Here are the results of the pebs test from guest/host for the same workload:

perf report on guest:
# Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1473377250
# Overhead  Command   Shared Object      Symbol
   57.74%   br_instr  br_instr           [.] lfsr_cond
   41.40%   br_instr  br_instr           [.] cmp_end
    0.21%   br_instr  [kernel.kallsyms]  [k] __lock_acquire

perf report on host:
# Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1462721386
# Overhead  Command   Shared Object      Symbol
   57.90%   br_instr  br_instr           [.] lfsr_cond
   41.95%   br_instr  br_instr           [.] cmp_end
    0.05%   br_instr  [kernel.vmlinux]   [k] lock_acquire

Conclusion: the profiling results on the guest are similar to those on the host.

Please check more details in each commit and feel free to comment.

v2->v3 Changelog:
- drop the counter_freezing check and disable guest PEBS when host uses PEBS;
- use kvm_read/write_guest_[offset]_cached() to reduce memory rewrite overhead;
- use GLOBAL_STATUS_BUFFER_OVF_BIT instead of 62;
- make intel_pmu_handle_event() static;
- rebased to kvm-queue d45f89f7437d;

Previous:
https://lore.kernel.org/kvm/20201109021254.79755-1-like.xu@linux.intel.com/

Like Xu (17):
  KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
  KVM: vmx/pmu: Use IA32_PERF_CAPABILITIES to adjust features visibility
  KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
  perf: x86/ds: Handle guest PEBS overflow PMI and inject it to guest
  KVM: x86/pmu: Reprogram guest PEBS event to emulate guest PEBS counter
  KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
  KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer
  KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
  KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
  KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
  KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  KVM: x86/pmu: Disable guest PEBS when counters are cross-mapped
  KVM: x86/pmu: Add hook to emulate pebs for cross-mapped counters
  KVM: vmx/pmu: Limit pebs_interrupt_threshold in the guest DS area
  KVM: vmx/pmu: Rewrite applicable_counters field in guest PEBS records
  KVM: x86/pmu: Save guest pebs reset values when pebs is configured
  KVM: x86/pmu: Adjust guest pebs reset values for crpss-mapped counters

 arch/x86/events/intel/core.c     |  45 +++++
 arch/x86/events/intel/ds.c       |  62 +++++++
 arch/x86/include/asm/kvm_host.h  |  18 ++
 arch/x86/include/asm/msr-index.h |   6 +
 arch/x86/kvm/pmu.c               |  92 +++++++--
 arch/x86/kvm/pmu.h               |  20 ++
 arch/x86/kvm/vmx/capabilities.h  |  17 +-
 arch/x86/kvm/vmx/pmu_intel.c     | 310 ++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c           |  29 +++
 arch/x86/kvm/x86.c               |  12 +-
 10 files changed, 592 insertions(+), 19 deletions(-)

--
2.29.2
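Note on the fast/slow path split described in the cover letter: the record
rewrite in the slow path is, at its core, a translation of a counter bitmask
from physical to virtual indexes. The following is a minimal sketch under
stated assumptions, not the code from patches 13-17. The struct, the table
size and both function names are hypothetical, only general-purpose counter
bits are handled, and the real series works on guest memory through
kvm_read/write_guest_[offset]_cached() rather than on a plain integer.

```c
#include <assert.h>
#include <stdint.h>

#define NR_GP_COUNTERS 8	/* illustrative size, not a hardware constant */

/* Hypothetical mapping table: virt_to_phys[v] is the index of the host
 * counter backing guest counter v, or -1 when v is unused. */
struct pmc_map {
	int virt_to_phys[NR_GP_COUNTERS];
};

/* Fast path test: every in-use guest counter is backed by the host
 * counter with the same index. */
int pmc_map_is_identity(const struct pmc_map *map)
{
	for (int v = 0; v < NR_GP_COUNTERS; v++) {
		int p = map->virt_to_phys[v];

		if (p >= 0 && p != v)
			return 0;
	}
	return 1;
}

/* Slow path: translate a bitmask of physical counter indexes, as the
 * hardware would write it, into the indexes the guest expects. Bits
 * that belong to no guest counter are dropped. */
uint64_t remap_applicable_counters(const struct pmc_map *map,
				   uint64_t phys_mask)
{
	uint64_t virt_mask = 0;

	for (int v = 0; v < NR_GP_COUNTERS; v++) {
		int p = map->virt_to_phys[v];

		assert(p < NR_GP_COUNTERS);
		if (p >= 0 && (phys_mask & (1ULL << p)))
			virt_mask |= 1ULL << v;
	}
	return virt_mask;
}
```

With the cover letter's example (physical PMC1 emulating virtual PMC0), a
record tagged with mask 0x2 by the hardware would be rewritten to 0x1 before
the guest is notified. When pmc_map_is_identity() holds, the mask is already
correct and no rewrite is needed.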
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-04 13:15 Like Xu
@ 2021-01-14 19:10 ` Sean Christopherson
  2021-01-15  2:02 ` Xu, Like
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2021-01-14 19:10 UTC
To: Like Xu
Cc: Peter Zijlstra, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Andi Kleen, Kan Liang, wei.w.wang, luwei.kang, linux-kernel

On Mon, Jan 04, 2021, Like Xu wrote:
> 2) Slow path (part 3, patch 0012-0017)
>
> This is when the host assigned physical PMC has a different index
> from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0)
> In this case, KVM needs to rewrite the PEBS records to change the
> applicable counter indexes to the virtual PMC indexes, which would
> otherwise contain the physical counter index written by PEBS facility,
> and switch the counter reset values to the offset corresponding to
> the physical counter indexes in the DS data structure.
>
> Large PEBS needs to be disabled by KVM rewriting the
> pebs_interrupt_threshold field in DS to only one record in
> the slow path. This is because a guest may implicitly drain PEBS buffer,
> e.g., context switch. KVM doesn't get a chance to update the PEBS buffer.

Are the PEBS record write, PEBS index update, and subsequent PMI atomic with
respect to instruction execution? If not, doesn't this approach still leave a
window where the guest could see the wrong counter?

The virtualization hole is also visible if the guest is reading the PEBS records
from a different vCPU, though I assume no sane kernel does that?

> The physical PMC index will confuse the guest. The difficulty comes
> when multiple events get rescheduled inside the guest. Hence disabling
> large PEBS in this case might be an easy and safe way to keep it correct
> as an initial step here.
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-14 19:10 ` Sean Christopherson
@ 2021-01-15  2:02 ` Xu, Like
  2021-01-15 17:57 ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: Xu, Like @ 2021-01-15 2:02 UTC
To: Sean Christopherson, Andi Kleen, Kan Liang, Peter Zijlstra
Cc: Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

Hi Sean,

Thanks for your comments!

On 2021/1/15 3:10, Sean Christopherson wrote:
> On Mon, Jan 04, 2021, Like Xu wrote:
>> 2) Slow path (part 3, patch 0012-0017)
>>
>> This is when the host assigned physical PMC has a different index
>> from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0)
>> In this case, KVM needs to rewrite the PEBS records to change the
>> applicable counter indexes to the virtual PMC indexes, which would
>> otherwise contain the physical counter index written by PEBS facility,
>> and switch the counter reset values to the offset corresponding to
>> the physical counter indexes in the DS data structure.
>>
>> Large PEBS needs to be disabled by KVM rewriting the
>> pebs_interrupt_threshold field in DS to only one record in
>> the slow path. This is because a guest may implicitly drain PEBS buffer,
>> e.g., context switch. KVM doesn't get a chance to update the PEBS buffer.
> Are the PEBS record write, PEBS index update, and subsequent PMI atomic with
> respect to instruction execution? If not, doesn't this approach still leave a
> window where the guest could see the wrong counter?

First, KVM would limit/rewrite the guest DS pebs_interrupt_threshold to one
record before vm-entry (see [PATCH v3 14/17] KVM: vmx/pmu: Limit
pebs_interrupt_threshold in the guest DS area), which means that once a PEBS
record is written into the guest pebs buffer, a PEBS PMI will be generated
immediately, causing a vm-exit.

Second, KVM would complete the PEBS record rewriting and the PEBS index
update, and inject the vPMI, before the next vm-entry (we deal with these
separately in patches 15-17 for easy review).

After the updated PEBS record(s) are (atomically?) prepared, guests will be
notified via PMI, and there is no window for the vCPU to check for a PEBS
record because of the vm-exit.

> The virtualization hole is also visible if the guest is reading the PEBS records
> from a different vCPU, though I assume no sane kernel does that?

I have checked the guest PEBS driver behavior for Linux and Windows, and
they're sane.

Theoretically, the hole exists for busy-polling PEBS buffer readers on other
vCPUs. Fixing it by making all vCPUs vm-exit is onerous for a large guest,
and I don't think you would accept that. Or do we have a better idea?

In fact, we don't think it's a hole or vulnerability, because the motivation
for correcting the counter index(es) is to help guest PEBS readers understand
their PEBS records correctly and to provide the same sampling accuracy as the
non-cross-mapped case, not to provide a new attack interface from guest to
host.

PeterZ commented on the V1 version and insisted that host perf be allowed to
assign the guest counter a cross-mapped back-end counter. In this case, the
slow path patches (13-17) are introduced to ensure that, from the guest
counter perspective, the PEBS records are also correct. We do not want these
records to be invalid and ignored, which would undermine the accuracy of PEBS.

In practice, the slow path rarely occurs. As a first step, we propose to
upstream the fast path patches (patch 01-12) with your help. Guest PEBS would
be temporarily disabled when guest PEBS counters are cross-mapped, until we
are on the same page about a satisfactory cross-mapping solution.

---
thx,likexu

>
>> The physical PMC index will confuse the guest. The difficulty comes
>> when multiple events get rescheduled inside the guest. Hence disabling
>> large PEBS in this case might be an easy and safe way to keep it correct
>> as an initial step here.
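Note on the threshold limiting described in the reply above: to make the
first PEBS record raise a PMI, the interrupt threshold is moved to one
record past the buffer base. The following is a minimal sketch, not the code
from patch 14. The struct is a simplified, hypothetical stand-in for the
PEBS fields of the DS area (the real layout has more fields), the function
name is an assumption, and the sketch edits a local copy rather than guest
memory.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical view of the PEBS-related DS fields. */
struct ds_pebs_fields {
	uint64_t pebs_buffer_base;
	uint64_t pebs_index;
	uint64_t pebs_absolute_maximum;
	uint64_t pebs_interrupt_threshold;
};

/*
 * Clamp the threshold so that writing a single record already reaches it.
 * Returns 1 when the field was changed, 0 when the guest's own threshold
 * was already at or below one record.
 */
int limit_pebs_threshold_to_one_record(struct ds_pebs_fields *ds,
				       uint64_t record_size)
{
	uint64_t one_record;

	assert(record_size > 0);
	one_record = ds->pebs_buffer_base + record_size;

	if (ds->pebs_interrupt_threshold <= one_record)
		return 0;

	ds->pebs_interrupt_threshold = one_record;
	return 1;
}
```

This is the same shape as the single-record (non-large) PEBS setup on a
native kernel, which is why the cover letter expects guests to tolerate the
earlier PMIs.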
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-15  2:02 ` Xu, Like
@ 2021-01-15 17:57 ` Sean Christopherson
  2021-01-15 18:27 ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Sean Christopherson @ 2021-01-15 17:57 UTC
To: Xu, Like
Cc: Andi Kleen, Kan Liang, Peter Zijlstra, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

On Fri, Jan 15, 2021, Xu, Like wrote:
> Hi Sean,
>
> Thanks for your comments!
>
> On 2021/1/15 3:10, Sean Christopherson wrote:
> > On Mon, Jan 04, 2021, Like Xu wrote:
> > > 2) Slow path (part 3, patch 0012-0017)
> > >
> > > This is when the host assigned physical PMC has a different index
> > > from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0)
> > > In this case, KVM needs to rewrite the PEBS records to change the
> > > applicable counter indexes to the virtual PMC indexes, which would
> > > otherwise contain the physical counter index written by PEBS facility,
> > > and switch the counter reset values to the offset corresponding to
> > > the physical counter indexes in the DS data structure.
> > >
> > > Large PEBS needs to be disabled by KVM rewriting the
> > > pebs_interrupt_threshold field in DS to only one record in
> > > the slow path. This is because a guest may implicitly drain PEBS buffer,
> > > e.g., context switch. KVM doesn't get a chance to update the PEBS buffer.
> > Are the PEBS record write, PEBS index update, and subsequent PMI atomic with
> > respect to instruction execution? If not, doesn't this approach still leave a
> > window where the guest could see the wrong counter?
>
> First, KVM would limit/rewrite guest DS pebs_interrupt_threshold to one
> record before vm-entry,
> (see patch [PATCH v3 14/17] KVM: vmx/pmu: Limit pebs_interrupt_threshold in
> the guest DS area)
> which means once a PEBS record is written into the guest pebs buffer,
> a PEBS PMI will be generated immediately and thus vm-exit.

I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
guaranteed to be atomic?

In practice, under what scenarios will guest counters get cross-mapped? And,
how does this support affect guest accuracy? I.e. how bad do things get for the
guest if we simply disable guest counters if they can't have a 1:1 association
with their physical counter?
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-15 17:57 ` Sean Christopherson
@ 2021-01-15 18:27 ` Andi Kleen
  2021-01-15 18:51 ` Sean Christopherson
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2021-01-15 18:27 UTC
To: Sean Christopherson
Cc: Xu, Like, Andi Kleen, Kan Liang, Peter Zijlstra, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

> I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
> guaranteed to be atomic?

Of course not.

>
> In practice, under what scenarios will guest counters get cross-mapped? And,
> how does this support affect guest accuracy? I.e. how bad do things get for the
> guest if we simply disable guest counters if they can't have a 1:1 association
> with their physical counter?

This would completely break perfmon for the guest, likely with no way to
recover.

-Andi
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-15 18:27 ` Andi Kleen
@ 2021-01-15 18:51 ` Sean Christopherson
  2021-01-15 19:11 ` Andi Kleen
  2021-01-22  9:56 ` Peter Zijlstra
  0 siblings, 2 replies; 18+ messages in thread
From: Sean Christopherson @ 2021-01-15 18:51 UTC
To: Andi Kleen
Cc: Xu, Like, Kan Liang, Peter Zijlstra, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

On Fri, Jan 15, 2021, Andi Kleen wrote:
> > I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
> > guaranteed to be atomic?
>
> Of course not.

So there's still a window where the guest could observe the bad counter index,
correct?
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-15 18:51 ` Sean Christopherson
@ 2021-01-15 19:11 ` Andi Kleen
  2021-01-22  9:56 ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2021-01-15 19:11 UTC
To: Sean Christopherson
Cc: Andi Kleen, Xu, Like, Kan Liang, Peter Zijlstra, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote:
> On Fri, Jan 15, 2021, Andi Kleen wrote:
> > > I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
> > > guaranteed to be atomic?
> >
> > Of course not.
>
> So there's still a window where the guest could observe the bad counter index,
> correct?

Yes.

But with single record PEBS it doesn't really matter with normal perfmon
drivers.

-Andi
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-15 18:51 ` Sean Christopherson
  2021-01-15 19:11 ` Andi Kleen
@ 2021-01-22  9:56 ` Peter Zijlstra
  2021-01-25  8:08 ` Like Xu
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2021-01-22 9:56 UTC
To: Sean Christopherson
Cc: Andi Kleen, Xu, Like, Kan Liang, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote:
> On Fri, Jan 15, 2021, Andi Kleen wrote:
> > > I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
> > > guaranteed to be atomic?
> >
> > Of course not.
>
> So there's still a window where the guest could observe the bad counter index,
> correct?

Guest could do a hypercall to fix up the DS area before it tries to read
it I suppose. Or the HV could expose the index mapping and have the
guest fix it up.

Adding a little virt crud on top shouldn't be too hard.
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-22  9:56 ` Peter Zijlstra
@ 2021-01-25  8:08 ` Like Xu
  2021-01-25 11:13 ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: Like Xu @ 2021-01-25 8:08 UTC
To: Peter Zijlstra, Sean Christopherson, Andi Kleen
Cc: Xu, Like, Kan Liang, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

Hi Peter,

On 2021/1/22 17:56, Peter Zijlstra wrote:
> On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote:
>> On Fri, Jan 15, 2021, Andi Kleen wrote:
>>>> I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
>>>> guaranteed to be atomic?
>>>
>>> Of course not.
>>
>> So there's still a window where the guest could observe the bad counter index,
>> correct?
>
> Guest could do a hypercall to fix up the DS area before it tries to read
> it I suppose. Or the HV could expose the index mapping and have the
> guest fix it up.

A weird (malicious) guest could read unmodified PEBS records in the
guest PEBS buffer from other vCPUs without any need for a hypercall or
an index mapping from the HV.

Do you see any security issues with this host index leak window?

>
> Adding a little virt crud on top shouldn't be too hard.
>

Patches 13-17 in this version already modify the guest PEBS buffer
to correct the index mapping information in the guest PEBS records.

---
thx,likexu
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-25  8:08 ` Like Xu
@ 2021-01-25 11:13 ` Peter Zijlstra
  2021-01-25 12:07 ` Xu, Like
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2021-01-25 11:13 UTC
To: Like Xu
Cc: Sean Christopherson, Andi Kleen, Xu, Like, Kan Liang, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel

On Mon, Jan 25, 2021 at 04:08:22PM +0800, Like Xu wrote:
> Hi Peter,
>
> On 2021/1/22 17:56, Peter Zijlstra wrote:
> > On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote:
> > > On Fri, Jan 15, 2021, Andi Kleen wrote:
> > > > > I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
> > > > > guaranteed to be atomic?
> > > >
> > > > Of course not.
> > >
> > > So there's still a window where the guest could observe the bad counter index,
> > > correct?
> >
> > Guest could do a hypercall to fix up the DS area before it tries to read
> > it I suppose. Or the HV could expose the index mapping and have the
> > guest fix it up.
>
> A weird (malicious) guest would read unmodified PEBS records in the
> guest PEBS buffer from other vCPUs without the need for hypercall or
> index mapping from HV.
>
> Do you see any security issues on this host index leak window?
>
> >
> > Adding a little virt crud on top shouldn't be too hard.
> >
>
> The patches 13-17 in this version has modified the guest PEBS buffer
> to correct the index mapping information in the guest PEBS records.

Right, but given there is no atomicity between writing the DS area and
triggering the PMI (as already established earlier in this thread), a
malicious guest can already access this information, no?
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-25 11:13 ` Peter Zijlstra
@ 2021-01-25 12:07 ` Xu, Like
  2021-01-25 12:18 ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: Xu, Like @ 2021-01-25 12:07 UTC
To: Peter Zijlstra
Cc: Sean Christopherson, Andi Kleen, Kan Liang, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel, Like Xu

On 2021/1/25 19:13, Peter Zijlstra wrote:
> On Mon, Jan 25, 2021 at 04:08:22PM +0800, Like Xu wrote:
>> Hi Peter,
>>
>> On 2021/1/22 17:56, Peter Zijlstra wrote:
>>> On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote:
>>>> On Fri, Jan 15, 2021, Andi Kleen wrote:
>>>>>> I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
>>>>>> guaranteed to be atomic?
>>>>> Of course not.
>>>> So there's still a window where the guest could observe the bad counter index,
>>>> correct?
>>> Guest could do a hypercall to fix up the DS area before it tries to read
>>> it I suppose. Or the HV could expose the index mapping and have the
>>> guest fix it up.
>> A weird (malicious) guest would read unmodified PEBS records in the
>> guest PEBS buffer from other vCPUs without the need for hypercall or
>> index mapping from HV.
>>
>> Do you see any security issues on this host index leak window?
>>
>>> Adding a little virt crud on top shouldn't be too hard.
>>>
>> The patches 13-17 in this version has modified the guest PEBS buffer
>> to correct the index mapping information in the guest PEBS records.
> Right, but given there is no atomicity between writing the DS area and
> triggering the PMI (as already established earlier in this thread), a
> malicious guest can already access this information, no?
>

So under the premise that counter cross-mapping is allowed,
how can a hypercall help fix it?

Personally, I think it is acceptable at the moment, and
no security issues based on this have been identified.
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-25 12:07 ` Xu, Like
@ 2021-01-25 12:18 ` Peter Zijlstra
  2021-01-25 12:53 ` Xu, Like
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2021-01-25 12:18 UTC
To: Xu, Like
Cc: Sean Christopherson, Andi Kleen, Kan Liang, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel, Like Xu

On Mon, Jan 25, 2021 at 08:07:06PM +0800, Xu, Like wrote:

> So under the premise that counter cross-mapping is allowed,
> how can hypercall help fix it ?

Hypercall or otherwise exposing the mapping, will let the guest fix it
up when it already touches the data. Which avoids the host from having
to access the guest memory and is faster, no?
* Re: [PATCH v3 00/17] KVM: x86/pmu: Add support to enable Guest PEBS via DS
  2021-01-25 12:18 ` Peter Zijlstra
@ 2021-01-25 12:53 ` Xu, Like
  0 siblings, 0 replies; 18+ messages in thread
From: Xu, Like @ 2021-01-25 12:53 UTC
To: Peter Zijlstra, Sean Christopherson
Cc: Andi Kleen, Kan Liang, Paolo Bonzini, eranian, kvm, Ingo Molnar, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, luwei.kang, linux-kernel, Like Xu

On 2021/1/25 20:18, Peter Zijlstra wrote:
> On Mon, Jan 25, 2021 at 08:07:06PM +0800, Xu, Like wrote:
>
>> So under the premise that counter cross-mapping is allowed,
>> how can hypercall help fix it ?
> Hypercall or otherwise exposing the mapping, will let the guest fix it
> up when it already touches the data. Which avoids the host from having
> to access the guest memory and is faster, no?

- as you may know, the mapping table can change rapidly between the time
  the records are rewritten and the time the records are read;

- the patches modify the records before the guest is notified via PMI,
  which means the rewrite is transparent to normal guests (including Windows);

- a malicious guest would ignore the exposed mapping and the hypercall,
  so I don't think this can solve the leakage issue at all;

- making the guest aware of that hypercall or mapping requires more code
  changes on the guest side, whereas now we can do it on the KVM side; we
  also know that the cross-mapping case rarely happens, and the overhead is
  acceptable based on our tests;

Please let me know if you or Sean are not going to buy into
the PEBS records rewrite proposal in patches 13 - 17.

---
thx,likexu