kvm.vger.kernel.org archive mirror
From: Like Xu <like.xu@linux.intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	wei.w.wang@intel.com, ak@linux.intel.com,
	Like Xu <like.xu@linux.intel.com>
Subject: [PATCH v10 00/11] Guest Last Branch Recording Enabling
Date: Thu, 23 Apr 2020 16:14:01 +0800
Message-ID: <20200423081412.164863-1-like.xu@linux.intel.com>

Hi all,

Please help review the parts that interest you in this stable version;
e.g. the first five patches involve the perf event subsystem
and the sixth patch concerns the KVM userspace interface.

v9->v10 Changelog:
- new patch (0002) to refactor hw->idx checks and cleanup for host perf;
- refine comments in guest LBR constraint patch and rename functions;
- still ack LBRS_FROZEN for guest LBR event in the intel_pmu_ack_status();
- add more checks before enabling LBR in the kvm_vm_ioctl_enable_cap();
- add pmu_ops->deliver_pmi to clear LBR enable bit for guest debugctl_msr;
- use vmx_supported_perf_capabilities() to expose PDCM via kvm_cpu_cap*();

You may check more details in each commit.

Previous:
https://lore.kernel.org/kvm/20200313021616.112322-1-like.xu@linux.intel.com/

---

Last branch recording (LBR) is a performance monitoring unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. This patch
series enables this feature for KVM guests.

Userspace can configure whether it is enabled for each guest via the
vm_ioctl KVM_CAP_X86_GUEST_LBR. As a first step, a guest can only
enable the LBR feature if its CPU model is the same as the host's,
since the LBR feature is still model-specific.

If it's enabled on the guest, the guest LBR driver accesses the LBR
MSRs (including IA32_DEBUGCTLMSR and the record MSRs) as the host does.
The first guest access to the LBR-related MSRs is always intercepted.
The KVM trap handler creates a special LBR event (called the guest LBR
event) which enables callstack mode and has no hardware counter
assigned. Host perf enables and schedules this event as usual.

The guest's first access to an LBR register is trapped to KVM, which
creates a guest LBR perf event. It's a regular LBR perf event which gets
the LBR facility assigned from the perf subsystem. Once that succeeds,
the LBR stack MSRs are passed through to the guest for efficient access.
However, if another host LBR event comes in and takes over the LBR
facility, the LBR MSRs are made interceptible again, and the guest's
subsequent accesses to the LBR MSRs will be trapped and meaningless.
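
The trap/pass-through behavior described above can be sketched as a
small state machine. This is a toy model only: the struct, enum and
function names below are made up for illustration and are not the
identifiers used in the patches, and the interaction with host perf is
reduced to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in KVM or perf. */
enum lbr_msr_mode { LBR_INTERCEPTED, LBR_PASSTHROUGH };

struct guest_lbr {
	bool has_event;         /* guest LBR perf event has been created */
	bool owns_lbr;          /* perf currently assigns the LBR facility to it */
	enum lbr_msr_mode mode; /* how guest accesses to LBR MSRs are handled */
};

/*
 * The guest touches an LBR MSR. Returns true if the access operates on
 * the real LBR state, false if it is trapped and meaningless.
 */
static bool guest_access_lbr_msr(struct guest_lbr *g, bool host_lbr_event_active)
{
	if (g->mode == LBR_PASSTHROUGH) {
		/* Pass-through is only ever set while the guest event owns LBR. */
		assert(g->has_event && g->owns_lbr);
		return true;
	}

	/* Intercepted: the access traps. The first trap creates the event. */
	if (!g->has_event)
		g->has_event = true;

	/* The event only gets the LBR facility if no host LBR event holds it. */
	g->owns_lbr = !host_lbr_event_active;
	if (g->owns_lbr) {
		g->mode = LBR_PASSTHROUGH; /* later accesses skip the trap */
		return true;
	}
	return false;
}

/* A host LBR event takes over the LBR facility from the guest event. */
static void host_takes_lbr(struct guest_lbr *g)
{
	g->owns_lbr = false;
	g->mode = LBR_INTERCEPTED;
}
```

In this model, a guest access after host_takes_lbr() traps again and
only returns to pass-through once the host LBR event is gone.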

Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) on
every VMX transition would add excessive overhead to the frequent VMX
transitions themselves, the guest LBR event instead saves/restores the
LBR stack MSRs (including the LBR_SELECT MSR) during context switches,
using the native LBR event callstack mechanism.
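
To give a rough sense of how many MSRs are involved, the sketch below
enumerates the LBR MSRs for a given stack depth. It assumes a
Skylake-style layout in which every stack entry has a FROM, a TO and an
INFO MSR; other CPU models lay out the stack differently, so the count
is illustrative only. The MSR indices are the ones defined in
arch/x86/include/asm/msr-index.h; the helper name lbr_msr_list() is
made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSR indices as defined in arch/x86/include/asm/msr-index.h. */
#define MSR_LBR_SELECT   0x000001c8
#define MSR_LBR_TOS      0x000001c9
#define MSR_LBR_NHM_FROM 0x00000680
#define MSR_LBR_NHM_TO   0x000006c0
#define MSR_LBR_INFO_0   0x00000dc0

/*
 * Hypothetical helper: fill 'out' with every LBR MSR index for a stack
 * of 'nr_entries' entries and return how many were written. Returns 0
 * if 'out' (with room for 'cap' indices) is too small.
 */
static size_t lbr_msr_list(unsigned int nr_entries, uint32_t *out, size_t cap)
{
	size_t need = 2 + 3 * (size_t)nr_entries;
	size_t n = 0;
	unsigned int i;

	if (cap < need)
		return 0;

	out[n++] = MSR_LBR_SELECT;
	out[n++] = MSR_LBR_TOS;
	for (i = 0; i < nr_entries; i++)
		out[n++] = MSR_LBR_NHM_FROM + i;
	for (i = 0; i < nr_entries; i++)
		out[n++] = MSR_LBR_NHM_TO + i;
	for (i = 0; i < nr_entries; i++)
		out[n++] = MSR_LBR_INFO_0 + i;

	assert(n == need);
	return n;
}
```

Under these assumptions, lbr_msr_list(32, ...) yields 98 MSRs
(32 FROM + 32 TO + 32 INFO + TOS + LBR_SELECT).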

If the guest no longer accesses the LBR-related MSRs within a scheduling
time slice and the LBR enable bit is unset, vPMU releases its guest
LBR event in the same way as a normal event of an unused vPMC, and the
pass-through state of the LBR stack MSRs is canceled.
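
The release condition above can be sketched as a per-time-slice check.
Again this is a toy model with made-up names (struct vlbr,
vlbr_note_access(), vlbr_slice_end()); how KVM actually observes guest
accesses and hooks into the vPMU lazy release path is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in KVM. */
struct vlbr {
	bool event_alive;       /* guest LBR event currently exists */
	bool passthrough;       /* LBR stack MSRs are passed through */
	bool accessed_in_slice; /* guest touched an LBR MSR this time slice */
	bool debugctl_lbr_bit;  /* LBR enable bit in the guest's DEBUGCTLMSR */
};

/* Record that the guest accessed an LBR-related MSR in this time slice. */
static void vlbr_note_access(struct vlbr *v)
{
	v->accessed_in_slice = true;
}

/*
 * Called once at the end of each scheduling time slice. Releases the
 * guest LBR event if the guest neither accessed the LBR MSRs nor has
 * the LBR enable bit set. Returns true if the event was released.
 */
static bool vlbr_slice_end(struct vlbr *v)
{
	bool idle = !v->accessed_in_slice && !v->debugctl_lbr_bit;

	/* Pass-through never outlives the event that backs it. */
	assert(!v->passthrough || v->event_alive);

	v->accessed_in_slice = false; /* start a fresh observation window */

	if (v->event_alive && idle) {
		v->event_alive = false;
		v->passthrough = false; /* MSRs become interceptible again */
		return true;
	}
	return false;
}
```

The event survives a slice if either condition holds, so a guest that
leaves the LBR enable bit set keeps its event even without touching
the MSRs.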

---

LBR testcase:
echo 1 > /proc/sys/kernel/watchdog
echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate
echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
./perf record -b ./br_instr a

- Perf report on the host:
Samples: 72K of event 'cycles', Event count (approx.): 72512
Overhead  Command   Source Shared Object           Source Symbol                           Target Symbol                           Basic Block Cycles
  12.12%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           1
  11.05%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             5
   8.81%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             4
   5.04%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           20
   4.92%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             6
   4.88%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           6
   4.58%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           5

- Perf report on the guest:
Samples: 92K of event 'cycles', Event count (approx.): 92544
Overhead  Command   Source Shared Object  Source Symbol                                   Target Symbol                                   Basic Block Cycles
  12.03%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   1
  11.09%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     5
   8.57%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     4
   5.08%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     6
   5.06%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   20
   4.87%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   6
   4.70%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   5

Conclusion: the profiling results on the guest are similar to those on the host.

Like Xu (8):
  perf/x86/core: Refactor hw->idx checks and cleanup
  perf/x86/lbr: Add interface to get basic information about LBR stack
  perf/x86: Add constraint to create guest LBR event without hw counter
  perf/x86: Keep LBR stack unchanged in host context for guest LBR event
  KVM: x86: Add KVM_CAP_X86_GUEST_LBR to dis/enable LBR from user-space
  KVM: x86/pmu: Add LBR feature emulation via guest LBR event
  KVM: x86/pmu: Release guest LBR event via vPMU lazy release mechanism
  KVM: x86: Expose MSR_IA32_PERF_CAPABILITIES for LBR record format

Wei Wang (3):
  perf/x86: Fix variable type for LBR registers
  KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in
  KVM: x86: Remove the common trap handler of the MSR_IA32_DEBUGCTLMSR

 Documentation/virt/kvm/api.rst    |  28 +++
 arch/x86/events/core.c            |  26 ++-
 arch/x86/events/intel/core.c      | 102 ++++++----
 arch/x86/events/intel/lbr.c       |  56 +++++-
 arch/x86/events/perf_event.h      |  18 +-
 arch/x86/include/asm/kvm_host.h   |  14 ++
 arch/x86/include/asm/perf_event.h |  28 ++-
 arch/x86/kvm/pmu.c                |  27 ++-
 arch/x86/kvm/pmu.h                |  17 +-
 arch/x86/kvm/svm/pmu.c            |   7 +-
 arch/x86/kvm/vmx/capabilities.h   |  15 ++
 arch/x86/kvm/vmx/pmu_intel.c      | 301 ++++++++++++++++++++++++++++--
 arch/x86/kvm/vmx/vmx.c            |  11 +-
 arch/x86/kvm/vmx/vmx.h            |   2 +
 arch/x86/kvm/x86.c                |  34 ++--
 include/linux/perf_event.h        |   7 +
 include/uapi/linux/kvm.h          |   1 +
 kernel/events/core.c              |   7 -
 18 files changed, 603 insertions(+), 98 deletions(-)

-- 
2.21.1



Thread overview: 19+ messages
2020-04-23  8:14 Like Xu [this message]
2020-04-23  8:14 ` [PATCH v10 01/11] perf/x86: Fix variable type for LBR registers Like Xu
2020-04-23  8:14 ` [PATCH v10 02/11] perf/x86/core: Refactor hw->idx checks and cleanup Like Xu
2020-04-23  8:14 ` [PATCH v10 03/11] perf/x86/lbr: Add interface to get basic information about LBR stack Like Xu
2020-04-23  8:14 ` [PATCH v10 04/11] perf/x86: Add constraint to create guest LBR event without hw counter Like Xu
2020-04-23  8:14 ` [PATCH v10 05/11] perf/x86: Keep LBR stack unchanged in host context for guest LBR event Like Xu
2020-04-23  8:14 ` [PATCH v10 06/11] KVM: x86: Add KVM_CAP_X86_GUEST_LBR to dis/enable LBR from user-space Like Xu
2020-04-23  8:14 ` [PATCH v10 07/11] KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in Like Xu
2020-04-23  8:14 ` [PATCH v10 08/11] KVM: x86/pmu: Add LBR feature emulation via guest LBR event Like Xu
2020-04-24 12:16   ` Peter Zijlstra
2020-04-27  3:16     ` Like Xu
2020-05-08  8:48       ` Like Xu
2020-05-08 13:09       ` Peter Zijlstra
2020-05-12  4:58         ` Xu, Like
2020-04-23  8:14 ` [PATCH v10 09/11] KVM: x86/pmu: Release guest LBR event via vPMU lazy release mechanism Like Xu
2020-04-28  5:06   ` kbuild test robot
2020-04-28  5:06   ` [RFC PATCH] KVM: x86/pmu: kvm_pmu_lbr_cleanup() can be static kbuild test robot
2020-04-23  8:14 ` [PATCH v10 10/11] KVM: x86: Expose MSR_IA32_PERF_CAPABILITIES for LBR record format Like Xu
2020-04-23  8:14 ` [PATCH v10 11/11] KVM: x86: Remove the common trap handler of the MSR_IA32_DEBUGCTLMSR Like Xu
