From: Paolo Bonzini <pbonzini@redhat.com>
To: Like Xu <like.xu@linux.intel.com>,
	Sean Christopherson <seanjc@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	ak@linux.intel.com, wei.w.wang@intel.com, kan.liang@intel.com,
	alex.shi@linux.alibaba.com, kvm@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v14 00/11] KVM: x86/pmu: Guest Last Branch Recording Enabling
Date: Tue, 2 Feb 2021 13:37:56 +0100
Message-ID: <196ffa8c-4027-724a-ca29-f1db69f48ab7@redhat.com>
In-Reply-To: <20210201051039.255478-1-like.xu@linux.intel.com>

On 01/02/21 06:10, Like Xu wrote:
> Hi geniuses,
> 
> Please help review this new version which enables the guest LBR.
> 
> We already upstreamed the guest LBR support in the host perf, please
> check more details in each commit and feel free to test and comment.
> 
> QEMU part: https://lore.kernel.org/qemu-devel/20210201045453.240258-1-like.xu@linux.intel.com
> kvm-unit-tests: https://lore.kernel.org/kvm/20210201045751.243231-1-like.xu@linux.intel.com
> 
> v13-v14 Changelog:
> - Rewrite crud about vcpu->arch.perf_capabilities;
> - Add PERF_CAPABILITIES testcases to tools/testing/selftests/kvm;
> - Add basic LBR testcases to the kvm-unit-tests (w/ QEMU patches);
> - Apply rewritten commit log from Paolo;
> - Queued the first patch "KVM: x86: Move common set/get handler ...";
> - Rename 'already_passthrough' to 'msr_passthrough';
> - Check the values of MSR_IA32_PERF_CAPABILITIES early;
> - Call kvm_x86_ops.pmu_ops->cleanup() always and drop extra_cleanup;
> - Use INTEL_PMC_IDX_FIXED_VLBR directly;
> - Fix a bug in the vmx_get_perf_capabilities();
> 
> Previous:
> https://lore.kernel.org/kvm/20210108013704.134985-1-like.xu@linux.intel.com/

Queued, thanks.  There were some conflicts with the bus lock detection
patches, so I had to tweak the DEBUGCTL MSR handling a bit.

Paolo

> ---
> 
> Last branch recording (LBR) is a performance monitoring unit (PMU)
> feature on Intel processors that records a running trace of the most
> recent branches taken by the processor in the LBR stack. This patch
> series enables this feature for KVM guests.
> 
> With this patch set, the following error goes away, and cloud
> developers can better understand their programs with less profiling overhead:
> 
>    $ perf record -b ${WORKLOAD}
>    or $ perf record --call-graph lbr ${WORKLOAD}
>    Error:
>    cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
> 
> User space can configure whether it's enabled or not for each
> guest via the MSR_IA32_PERF_CAPABILITIES MSR. As a first step, a guest
> can enable the LBR feature only if its CPU model is the same as the
> host's, since LBR is still a model-specific feature.
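The rule quoted above can be illustrated with a small user-space sketch. It assumes that the LBR format sits in bits 5:0 of IA32_PERF_CAPABILITIES (the field the series refers to as PMU_CAP_LBR_FMT) and that a zero format means LBR is not exposed to the guest. The helper name guest_lbr_fmt_ok() is hypothetical and is not taken from the patches:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bits 5:0 of IA32_PERF_CAPABILITIES hold the LBR format. */
#define PMU_CAP_LBR_FMT 0x3fULL

/*
 * Hypothetical helper: accept a capabilities value written by user space
 * only if it leaves guest LBR disabled (format field is zero) or reports
 * exactly the same LBR format as the host.
 */
static bool guest_lbr_fmt_ok(uint64_t host_caps, uint64_t guest_caps)
{
	uint64_t guest_fmt = guest_caps & PMU_CAP_LBR_FMT;

	if (guest_fmt == 0)
		return true;	/* LBR is not exposed to this guest */

	return guest_fmt == (host_caps & PMU_CAP_LBR_FMT);
}
```

With a host value of 0xc5, a guest value of 0x5 or 0x0 would be accepted and 0x4 rejected. Comparing only the format field is a simplification of "same CPU model"; a real check may need to look at more than this one field.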
> 
> If it's enabled on the guest, the guest LBR driver accesses the
> LBR MSRs (including IA32_DEBUGCTLMSR and the records MSRs) as the host does.
> The first guest access to the LBR-related MSRs is always intercepted.
> The KVM trap creates a special LBR event (called the guest LBR event)
> which enables the callstack mode and has no hardware counter assigned.
> The host perf enables and schedules this event as usual.
> 
> The guest's first access to an LBR register is trapped to KVM, which
> creates a guest LBR perf event. It's a regular LBR perf event which gets
> the LBR facility assigned from the perf subsystem. Once that succeeds,
> the LBR stack MSRs are passed through to the guest for efficient access.
> However, if another host LBR event comes in and takes over the LBR
> facility, the LBR MSRs will be made interceptible again, and subsequent
> guest accesses to the LBR MSRs will be trapped and be meaningless.
> 
> Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) on
> every VMX transition would add excessive overhead to the frequent VMX
> transitions themselves, the guest LBR event instead saves/restores the LBR
> stack MSRs (including the LBR_SELECT MSR) during context switches, using
> the native LBR event callstack mechanism.
> 
> If the guest no longer accesses the LBR-related MSRs within a scheduling
> time slice and the LBR enable bit is unset, the vPMU releases its guest
> LBR event as it would a normal event of an unused vPMC, and the pass-through
> state of the LBR stack MSRs is canceled.
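The lifecycle described in the quoted paragraphs can be read as a three-state machine per vCPU. The following is a minimal sketch of that reading only, not the code in the series: the state names, the struct lbr_input fields and lbr_next_state() are all hypothetical, and the per-time-slice evaluation is an assumption made to keep the example small.

```c
#include <stdbool.h>

/* Hypothetical states for one vCPU's guest LBR handling. */
enum lbr_state {
	LBR_NO_EVENT,		/* no guest LBR event; LBR MSR accesses trap */
	LBR_INTERCEPTED,	/* event exists but does not own the LBR facility */
	LBR_PASSTHROUGH,	/* event is active; LBR stack MSRs are passed through */
};

/* Hypothetical inputs observed over one scheduling time slice. */
struct lbr_input {
	bool guest_accessed_lbr_msr;	/* guest touched an LBR-related MSR */
	bool lbr_enable_bit_set;	/* guest's LBR enable bit is set */
	bool event_owns_lbr;		/* host perf assigned LBR to the guest event */
};

static enum lbr_state lbr_next_state(enum lbr_state cur, struct lbr_input in)
{
	if (cur == LBR_NO_EVENT) {
		/* Only a trapped guest access creates the guest LBR event. */
		if (!in.guest_accessed_lbr_msr)
			return LBR_NO_EVENT;
	} else if (!in.guest_accessed_lbr_msr && !in.lbr_enable_bit_set) {
		/* Idle for a whole time slice with LBR disabled: release the event. */
		return LBR_NO_EVENT;
	}

	/* An event exists; pass through only while it owns the LBR facility. */
	return in.event_owns_lbr ? LBR_PASSTHROUGH : LBR_INTERCEPTED;
}
```

For example, a first guest access while the LBR facility is free moves LBR_NO_EVENT to LBR_PASSTHROUGH, and a host LBR event taking over the facility moves LBR_PASSTHROUGH back to LBR_INTERCEPTED.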
> 
> ---
> 
> LBR testcase:
> echo 1 > /proc/sys/kernel/watchdog
> echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
> echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate
> echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
> perf record -b ./br_instr a
> (perf record --call-graph lbr ./br_instr a)
> 
> - Perf report on the host:
> Samples: 72K of event 'cycles', Event count (approx.): 72512
> Overhead  Command   Source Shared Object           Source Symbol                           Target Symbol                           Basic Block Cycles
>    12.12%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           1
>    11.05%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             5
>     8.81%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             4
>     5.04%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           20
>     4.92%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             6
>     4.88%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           6
>     4.58%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           5
> 
> - Perf report on the guest:
> Samples: 92K of event 'cycles', Event count (approx.): 92544
> Overhead  Command   Source Shared Object  Source Symbol                                   Target Symbol                                   Basic Block Cycles
>    12.03%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   1
>    11.09%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     5
>     8.57%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     4
>     5.08%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     6
>     5.06%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   20
>     4.87%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   6
>     4.70%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   5
> 
> Conclusion: the profiling results on the guest are similar to those on the host.
> 
> Like Xu (11):
>    KVM: x86/vmx: Make vmx_set_intercept_for_msr() non-static
>    KVM: x86/pmu: Set up IA32_PERF_CAPABILITIES if PDCM bit is available
>    KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled
>    KVM: vmx/pmu: Expose DEBUGCTLMSR_LBR in the MSR_IA32_DEBUGCTLMSR
>    KVM: vmx/pmu: Create a guest LBR event when vcpu sets DEBUGCTLMSR_LBR
>    KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE
>    KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
>    KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
>    KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
>    KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES
>    selftests: kvm/x86: add test for pmu msr MSR_IA32_PERF_CAPABILITIES
> 
>   arch/x86/kvm/pmu.c                            |   8 +-
>   arch/x86/kvm/pmu.h                            |   2 +
>   arch/x86/kvm/vmx/capabilities.h               |  19 +-
>   arch/x86/kvm/vmx/pmu_intel.c                  | 281 +++++++++++++++++-
>   arch/x86/kvm/vmx/vmx.c                        |  55 +++-
>   arch/x86/kvm/vmx/vmx.h                        |  28 ++
>   arch/x86/kvm/x86.c                            |   2 +-
>   tools/testing/selftests/kvm/.gitignore        |   1 +
>   tools/testing/selftests/kvm/Makefile          |   1 +
>   .../selftests/kvm/x86_64/vmx_pmu_msrs_test.c  | 149 ++++++++++
>   10 files changed, 524 insertions(+), 22 deletions(-)
>   create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_pmu_msrs_test.c
> 

