kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Xu, Like" <like.xu@intel.com>
To: Like Xu <like.xu@linux.intel.com>, Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>,
	Jim Mattson <jmattson@google.com>,
	kvm@vger.kernel.org,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Joerg Roedel <joro@8bytes.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v13 00/10] Guest Last Branch Recording Enabling (KVM part)
Date: Wed, 30 Sep 2020 10:23:27 +0800	[thread overview]
Message-ID: <c9904bf5-89e0-a5c1-2d1c-98c50c50d7d9@intel.com> (raw)
In-Reply-To: <6d4d7b00-cbca-9875-24bd-e6c4efaf0586@intel.com>

Are there volunteers or maintainer to help review this patch-set ?

Just a kindly ping.

Please let me know if you need a re-based version.

Thanks,
Like Xu

On 2020/8/14 16:48, Xu, Like wrote:
> Are there no interested reviewers or users?
>
> Just a kindly ping.
>
> On 2020/7/26 23:32, Like Xu wrote:
>> Hi Paolo,
>>
>> Please review this new version for the Kernel 5.9 release, and
>> Sean may not review them as he said in the previous email
>> https://lore.kernel.org/kvm/20200710162819.GF1749@linux.intel.com/
>>
>> You may cherry-pick the perf patches "3cb9d5464c1c..e1ad1ac2deb8"
>> from the branch "tip/perf/core" of scm/linux/kernel/git/tip/tip.git
>> as PeterZ said in the previous email
>> https://lore.kernel.org/kvm/20200703075646.GJ117543@hirez.programming.kicks-ass.net/ 
>>
>>
>> We may also apply the qemu-devel patch to the upstream qemu and try
>> the QEMU command lines with '-cpu host' or '-cpu host,pmu=true,lbr=true'.
>>
>> The following error will be gone forever with the patchset:
>>
>>    $ perf record -b lbr ${WORKLOAD}
>>    or $ perf record --call-graph lbr ${WORKLOAD}
>>    Error:
>>    cycles: PMU Hardware doesn't support sampling/overflow-interrupts. 
>> Try 'perf stat'
>>
>> Please check more details in each commit and feel free to test.
>>
>> v12->v13 Changelog:
>> - remove perf patches since they're queued in the tip/perf/core;
>> - add a minor patch to refactor MSR_IA32_DEBUGCTLMSR set/get handler;
>> - add a minor patch to expose vmx_set_intercept_for_msr();
>> - add a minor patch to initialize perf_capabilities in the 
>> intel_pmu_init();
>> - spilt the big patch to three pieces (0004-0006) for better 
>> understanding and review
>> - make the LBR_FMT exposure patch as the last step to enable guest LBR;
>>
>> Previous:
>> https://lore.kernel.org/kvm/20200613080958.132489-1-like.xu@linux.intel.com/ 
>>
>>
>> ---
>>
>> The last branch recording (LBR) is a performance monitor unit (PMU)
>> feature on Intel processors that records a running trace of the most
>> recent branches taken by the processor in the LBR stack. This patch
>> series is going to enable this feature for plenty of KVM guests.
>>
>> The user space could configure whether it's enabled or not for each
>> guest via MSR_IA32_PERF_CAPABILITIES msr. As a first step, a guest
>> could only enable LBR feature if its cpu model is the same as the
>> host since the LBR feature is still one of model specific features.
>>
>> If it's enabled on the guest, the guest LBR driver would accesses the
>> LBR MSR (including IA32_DEBUGCTLMSR and records MSRs) as host does.
>> The first guest access on the LBR related MSRs is always interceptible.
>> The KVM trap would create a special LBR event (called guest LBR event)
>> which enables the callstack mode and none of hardware counter is assigned.
>> The host perf would enable and schedule this event as usual.
>>
>> Guest's first access to a LBR registers gets trapped to KVM, which
>> creates a guest LBR perf event. It's a regular LBR perf event which gets
>> the LBR facility assigned from the perf subsystem. Once that succeeds,
>> the LBR stack msrs are passed through to the guest for efficient accesses.
>> However, if another host LBR event comes in and takes over the LBR
>> facility, the LBR msrs will be made interceptible, and guest following
>> accesses to the LBR msrs will be trapped and meaningless.
>>
>> Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries) in
>> VMX transition brings too excessive overhead to frequent vmx transition
>> itself, the guest LBR event would help save/restore the LBR stack msrs
>> during the context switching with the help of native LBR event callstack
>> mechanism, including LBR_SELECT msr.
>>
>> If the guest no longer accesses the LBR-related MSRs within a scheduling
>> time slice and the LBR enable bit is unset, vPMU would release its guest
>> LBR event as a normal event of a unused vPMC and the pass-through
>> state of the LBR stack msrs would be canceled.
>>
>> ---
>>
>> LBR testcase:
>> echo 1 > /proc/sys/kernel/watchdog
>> echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
>> echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate
>> echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
>> ./perf record -b ./br_instr a
>>
>> - Perf report on the host:
>> Samples: 72K of event 'cycles', Event count (approx.): 72512
>> Overhead  Command   Source Shared Object           Source 
>> Symbol                           Target Symbol                           
>> Basic Block Cycles
>>    12.12%  br_instr  br_instr                       [.] 
>> cmp_end                             [.] 
>> lfsr_cond                           1
>>    11.05%  br_instr  br_instr                       [.] 
>> lfsr_cond                           [.] 
>> cmp_end                             5
>>     8.81%  br_instr  br_instr                       [.] 
>> lfsr_cond                           [.] 
>> cmp_end                             4
>>     5.04%  br_instr  br_instr                       [.] 
>> cmp_end                             [.] 
>> lfsr_cond                           20
>>     4.92%  br_instr  br_instr                       [.] 
>> lfsr_cond                           [.] 
>> cmp_end                             6
>>     4.88%  br_instr  br_instr                       [.] 
>> cmp_end                             [.] 
>> lfsr_cond                           6
>>     4.58%  br_instr  br_instr                       [.] 
>> cmp_end                             [.] 
>> lfsr_cond                           5
>>
>> - Perf report on the guest:
>> Samples: 92K of event 'cycles', Event count (approx.): 92544
>> Overhead  Command   Source Shared Object  Source 
>> Symbol                                   Target 
>> Symbol                                   Basic Block Cycles
>>    12.03%  br_instr  br_instr              [.] 
>> cmp_end                                     [.] 
>> lfsr_cond                                   1
>>    11.09%  br_instr  br_instr              [.] 
>> lfsr_cond                                   [.] 
>> cmp_end                                     5
>>     8.57%  br_instr  br_instr              [.] 
>> lfsr_cond                                   [.] 
>> cmp_end                                     4
>>     5.08%  br_instr  br_instr              [.] 
>> lfsr_cond                                   [.] 
>> cmp_end                                     6
>>     5.06%  br_instr  br_instr              [.] 
>> cmp_end                                     [.] 
>> lfsr_cond                                   20
>>     4.87%  br_instr  br_instr              [.] 
>> cmp_end                                     [.] 
>> lfsr_cond                                   6
>>     4.70%  br_instr  br_instr              [.] 
>> cmp_end                                     [.] 
>> lfsr_cond                                   5
>>
>> Conclusion: the profiling results on the guest are similar to that on 
>> the host.
>>
>> Like Xu (10):
>>    KVM: x86: Move common set/get handler of MSR_IA32_DEBUGCTLMSR to VMX
>>    KVM: x86/vmx: Make vmx_set_intercept_for_msr() non-static and expose it
>>    KVM: vmx/pmu: Initialize vcpu perf_capabilities once in intel_pmu_init()
>>    KVM: vmx/pmu: Clear PMU_CAP_LBR_FMT when guest LBR is disabled
>>    KVM: vmx/pmu: Create a guest LBR event when vcpu sets DEBUGCTLMSR_LBR
>>    KVM: vmx/pmu: Pass-through LBR msrs to when the guest LBR event is 
>> ACTIVE
>>    KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
>>    KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
>>    KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES
>>    KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
>>
>>   arch/x86/kvm/pmu.c              |  12 +-
>>   arch/x86/kvm/pmu.h              |   5 +
>>   arch/x86/kvm/vmx/capabilities.h |  22 ++-
>>   arch/x86/kvm/vmx/pmu_intel.c    | 296 +++++++++++++++++++++++++++++++-
>>   arch/x86/kvm/vmx/vmx.c          |  44 ++++-
>>   arch/x86/kvm/vmx/vmx.h          |  28 +++
>>   arch/x86/kvm/x86.c              |  15 +-
>>   7 files changed, 395 insertions(+), 27 deletions(-)
>>
>


      parent reply	other threads:[~2020-09-30  2:23 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-26 15:32 [PATCH v13 00/10] Guest Last Branch Recording Enabling (KVM part) Like Xu
2020-07-26 15:32 ` [PATCH v13 01/10] KVM: x86: Move common set/get handler of MSR_IA32_DEBUGCTLMSR to VMX Like Xu
2020-07-26 15:32 ` [PATCH] target/i386: add -cpu,lbr=true support to enable guest LBR Like Xu
2020-09-24 22:05   ` Eduardo Habkost
     [not found]     ` <958128c6-39e8-96fe-34d8-7be1888f4144@intel.com>
2020-09-28 15:41       ` Eduardo Habkost
2020-09-29  6:24         ` Xu, Like
2020-07-26 15:32 ` [PATCH v13 02/10] KVM: x86/vmx: Make vmx_set_intercept_for_msr() non-static and expose it Like Xu
2020-09-29  3:13   ` Sean Christopherson
2020-09-29  7:12     ` Xu, Like
2020-07-26 15:32 ` [PATCH v13 03/10] KVM: vmx/pmu: Initialize vcpu perf_capabilities once in intel_pmu_init() Like Xu
2020-07-26 15:32 ` [PATCH v13 04/10] KVM: vmx/pmu: Clear PMU_CAP_LBR_FMT when guest LBR is disabled Like Xu
2020-07-26 15:32 ` [PATCH v13 05/10] KVM: vmx/pmu: Create a guest LBR event when vcpu sets DEBUGCTLMSR_LBR Like Xu
2020-07-26 15:32 ` [PATCH v13 06/10] KVM: vmx/pmu: Pass-through LBR msrs to when the guest LBR event is ACTIVE Like Xu
2020-07-26 15:32 ` [PATCH v13 07/10] KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation Like Xu
2020-07-26 15:32 ` [PATCH v13 08/10] KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI Like Xu
2020-07-26 15:32 ` [PATCH v13 09/10] KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES Like Xu
2020-07-26 15:32 ` [PATCH v13 10/10] KVM: vmx/pmu: Release guest LBR event via lazy release mechanism Like Xu
2020-08-14  8:48 ` [PATCH v13 00/10] Guest Last Branch Recording Enabling (KVM part) Xu, Like
2020-09-04  1:57   ` Xu, Like
2020-09-30  2:23   ` Xu, Like [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c9904bf5-89e0-a5c1-2d1c-98c50c50d7d9@intel.com \
    --to=like.xu@intel.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).