From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-5.3 required=3.0 tests=BAYES_00,
    HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,
    SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id 13DFBC433B4
    for ; Tue, 18 May 2021 12:40:54 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id E5A1E611CE
    for ; Tue, 18 May 2021 12:40:53 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1343862AbhERMmK (ORCPT ); Tue, 18 May 2021 08:42:10 -0400
Received: from mga03.intel.com ([134.134.136.65]:57939 "EHLO mga03.intel.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1349284AbhERMmI (ORCPT ); Tue, 18 May 2021 08:42:08 -0400
IronPort-SDR: v16exUBZhBDwpB0GgdhKq6SKhgkPt0dJgx/Ky4P0U2diZBf0tfBkPKY5m1BTt+5yL5IVBGxyD0
    sFjaSg/ftMmw==
X-IronPort-AV: E=McAfee;i="6200,9189,9987"; a="200753814"
X-IronPort-AV: E=Sophos;i="5.82,310,1613462400"; d="scan'208";a="200753814"
Received: from orsmga001.jf.intel.com ([10.7.209.18])
    by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
    18 May 2021 05:40:49 -0700
IronPort-SDR: f1EHggIkxe+EbxF7DCWqC8JNCxPz7fjaOSzftIN32fcFvVJbc3Fb3wJi5OS0jLHG3GxUGr+QDX
    q41qnX+ioSEg==
X-IronPort-AV: E=Sophos;i="5.82,310,1613462400"; d="scan'208";a="472932722"
Received: from likexu-mobl1.ccr.corp.intel.com (HELO [10.255.30.127]) ([10.255.30.127])
    by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
    18 May 2021 05:40:44 -0700
Subject: Re: [PATCH v6 00/16] KVM: x86/pmu: Add *basic* support to enable guest PEBS via DS
To: Liuxiangdong
Cc: Borislav Petkov, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, weijiang.yang@intel.com, Kan Liang, ak@linux.intel.com,
    wei.w.wang@intel.com, eranian@google.com, linux-kernel@vger.kernel.org,
    x86@kernel.org, kvm@vger.kernel.org, "Fangyi (Eric)", Xiexiangyou,
    Peter Zijlstra, Paolo Bonzini, Like Xu
References: <20210511024214.280733-1-like.xu@linux.intel.com>
    <609FA2B7.7030801@huawei.com>
    <868a0ed9-d4a5-c135-811e-a3420b7913ac@linux.intel.com>
    <60A3B1DC.7000002@huawei.com>
From: "Xu, Like"
Message-ID:
Date: Tue, 18 May 2021 20:40:42 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.10.2
MIME-Version: 1.0
In-Reply-To: <60A3B1DC.7000002@huawei.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/5/18 20:23, Liuxiangdong wrote:
>
>
> On 2021/5/17 14:38, Like Xu wrote:
>> Hi xiangdong,
>>
>> On 2021/5/15 18:30, Liuxiangdong wrote:
>>>
>>>
>>> On 2021/5/11 10:41, Like Xu wrote:
>>>> A new kernel cycle has begun, and this version looks promising.
>>>>
>>>> The guest Precise Event Based Sampling (PEBS) feature can provide
>>>> an architectural state of the instruction executed after the guest
>>>> instruction that exactly caused the event. It needs new hardware
>>>> facility only available on Intel Ice Lake Server platforms. This
>>>> patch set enables the basic PEBS feature for KVM guests on ICX.
>>>>
>>>> We can use PEBS feature on the Linux guest like native:
>>>>
>>>>    # perf record -e instructions:ppp ./br_instr a
>>>>    # perf record -c 100000 -e instructions:pp ./br_instr a
>>>
>>> Hi, Like.
>>> Has the qemu patch been modified?
>>>
>>> https://lore.kernel.org/kvm/f4dcb068-2ddf-428f-50ad-39f65cad3710@intel.com/
>>> ?
>>
>> I think the qemu part still works based on
>> 609d7596524ab204ccd71ef42c9eee4c7c338ea4 (tag: v6.0.0).
>>
>
> Yes.
> I applied these two qemu patches to qemu v6.0.0 and this KVM patch
> set to the latest kvm tree.
>
> I can see the pebs flags in the guest (Linux 5.11) on Ice Lake (Model: 106,
> Model name: Intel(R) Xeon(R) Platinum 8378A CPU),
> and I can use PEBS like this.
>
>     # perf record -e instructions:pp
>
> It can work normally.
>
> But there is no sampling when I use "perf record -e events:pp" or just
> "perf record" in the guest
> unless I delete patch 09 and patch 13 from this KVM patch set.
>
>

With patch 9 and 13, does the basic counter sampling still work?
You may retry w/ "echo 0 > /proc/sys/kernel/watchdog" on the host and guest.

> Have you tried "perf record -e events:pp" in this patch set? Does it
> work normally?

All my PEBS testcases passed. You may share the guest MSR traces from your
testcase with me.

>
>
>
> Thanks!
> Xiangdong Liu
>
>
>
>> When the LBR qemu patch receives the ACK from the maintainer,
>> I will submit PEBS qemu support because their changes are very similar.
>>
>> Please help review this version and
>> feel free to add your comments or "Reviewed-by".
>>
>> Thanks,
>> Like Xu
>>
>>>
>>>
>>>> To emulate guest PEBS facility for the above perf usages,
>>>> we need to implement 2 code paths:
>>>>
>>>> 1) Fast path
>>>>
>>>> This is when the host assigned physical PMC has an identical index as
>>>> the virtual PMC (e.g. using physical PMC0 to emulate virtual PMC0).
>>>> This path is used in most common use cases.
>>>>
>>>> 2) Slow path
>>>>
>>>> This is when the host assigned physical PMC has a different index
>>>> from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0).
>>>> In this case, KVM needs to rewrite the PEBS records to change the
>>>> applicable counter indexes to the virtual PMC indexes, which would
>>>> otherwise contain the physical counter index written by PEBS facility,
>>>> and switch the counter reset values to the offset corresponding to
>>>> the physical counter indexes in the DS data structure.
>>>>
>>>> The previous version [0] enables both fast path and slow path, which
>>>> seems a bit more complex as the first step. In this patchset, we want
>>>> to start with the fast path to get the basic guest PEBS enabled while
>>>> keeping the slow path disabled. More focused discussion on the slow
>>>> path [1] is planned to be put to another patchset in the next step.
>>>>
>>>> Compared to later versions in subsequent steps, the functionality
>>>> to support host-guest PEBS both enabled and the functionality to
>>>> emulate guest PEBS when the counter is cross-mapped are missing
>>>> in this patch set (neither of these are typical scenarios).
>>>>
>>>> With the basic support, the guest can retrieve the correct PEBS
>>>> information from its own PEBS records on the Ice Lake servers.
>>>> And we expect it should work when migrating to another Ice Lake
>>>> and no regression about host perf is expected.
>>>>
>>>> Here are the results of the PEBS test from guest and host for the same workload:
>>>>
>>>> perf report on guest:
>>>> # Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1473377250
>>>> # Overhead  Command   Shared Object      Symbol
>>>>    57.74%  br_instr  br_instr           [.] lfsr_cond
>>>>    41.40%  br_instr  br_instr           [.] cmp_end
>>>>     0.21%  br_instr  [kernel.kallsyms]  [k] __lock_acquire
>>>>
>>>> perf report on host:
>>>> # Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1462721386
>>>> # Overhead  Command   Shared Object     Symbol
>>>>    57.90%  br_instr  br_instr          [.] lfsr_cond
>>>>    41.95%  br_instr  br_instr          [.] cmp_end
>>>>     0.05%  br_instr  [kernel.vmlinux]  [k] lock_acquire
>>>>
>>>> Conclusion: the profiling results on the guest are similar to those
>>>> on the host.
>>>>
>>>> A minimum guest kernel version may be v5.4 or a backported version
>>>> that supports Ice Lake server PEBS.
>>>>
>>>> Please check more details in each commit and feel free to comment.
>>>>
>>>> Previous:
>>>> https://lore.kernel.org/kvm/20210415032016.166201-1-like.xu@linux.intel.com/
>>>>
>>>> [0] https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
>>>> [1] https://lore.kernel.org/kvm/20210115191113.nktlnmivc3edstiv@two.firstfloor.org/
>>>>
>>>> V5 -> V6 Changelog:
>>>> - Rebased on the latest kvm/queue tree;
>>>> - Fix a git rebase issue (Liuxiangdong);
>>>> - Adjust the patch sequence 06/07 for bisection (Liuxiangdong);
>>>>
>>>> Like Xu (16):
>>>>    perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
>>>>    perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
>>>>    perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
>>>>    KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
>>>>    KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
>>>>    KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
>>>>    KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
>>>>    KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
>>>>    KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
>>>>    KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
>>>>    KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
>>>>    KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
>>>>    KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
>>>>    KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
>>>>    KVM: x86/cpuid: Refactor host/guest CPU model consistency check
>>>>    KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
>>>>
>>>>   arch/x86/events/core.c            |   5 +-
>>>>   arch/x86/events/intel/core.c      | 129 ++++++++++++++++++++++++------
>>>>   arch/x86/events/perf_event.h      |   5 +-
>>>>   arch/x86/include/asm/kvm_host.h   |  16 ++++
>>>>   arch/x86/include/asm/msr-index.h  |   6 ++
>>>>   arch/x86/include/asm/perf_event.h |   5 +-
>>>>   arch/x86/kvm/cpuid.c              |  24 ++----
>>>>   arch/x86/kvm/cpuid.h              |   5 ++
>>>>   arch/x86/kvm/pmu.c                |  50 +++++++++---
>>>>   arch/x86/kvm/pmu.h                |  38 +++++++++
>>>>   arch/x86/kvm/vmx/capabilities.h   |  26 ++++--
>>>>   arch/x86/kvm/vmx/pmu_intel.c      | 115 +++++++++++++++++++++-----
>>>>   arch/x86/kvm/vmx/vmx.c            |  24 +++++-
>>>>   arch/x86/kvm/vmx/vmx.h            |   2 +-
>>>>   arch/x86/kvm/x86.c                |  14 ++--
>>>>   15 files changed, 368 insertions(+), 96 deletions(-)
>>>>
>>
>
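
For readers following the fast path / slow path distinction in the quoted cover letter, here is a minimal Python sketch of the index comparison it describes. It is only an illustration: `PmcAssignment`, `is_cross_mapped`, and `fast_path_only` are hypothetical names and do not appear in the patches, and the sketch models nothing beyond "does the physical PMC index match the virtual PMC index".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PmcAssignment:
    """Hypothetical record pairing a guest counter with its host counter."""
    virtual_idx: int   # index of the virtual PMC the guest programmed
    physical_idx: int  # index of the physical PMC the host assigned


def is_cross_mapped(a: PmcAssignment) -> bool:
    """Slow-path case: the physical index differs from the virtual one."""
    return a.virtual_idx != a.physical_idx


def fast_path_only(assignments: Iterable[PmcAssignment]) -> bool:
    """True when every counter keeps its index, so no rewrite of the
    PEBS records would be needed."""
    return not any(is_cross_mapped(a) for a in assignments)
```

With this model, `fast_path_only([PmcAssignment(0, 0)])` holds for the cover letter's "physical PMC0 emulates virtual PMC0" example, while `PmcAssignment(virtual_idx=0, physical_idx=1)` is the cross-mapped case that the series leaves for a later patch set.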