linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Wang, Wei W" <wei.w.wang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rkrcmar@redhat.com" <rkrcmar@redhat.com>,
	"Xu, Like" <like.xu@intel.com>
Subject: RE: [PATCH v1 1/8] perf/x86: add support to mask counters from host
Date: Mon, 5 Nov 2018 15:37:24 +0000	[thread overview]
Message-ID: <286AC319A985734F985F78AFA26841F73DE3AC8B@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <20181105121413.GC22431@hirez.programming.kicks-ass.net>

On Monday, November 5, 2018 8:14 PM, Peter Zijlstra wrote:
> The answer for PEBS is really simple; PEBS does not virtualize (Andi tried and
> can tell you why; IIRC it has something to do with how the hardware asks for
> a Linear Address instead of a Physical Address). So the problem will not arrise.
> 
> But there are certainly constrained events that will result in the same
> problem.

Yes. I just followed the PEBS assumption to discuss the counter contention that you mentioned.

> 
> The traditional approach of perf on resource contention is to share it; you
> get only partial runtime and can scale up the events given the runtime
> metrics provided.
> 
> We also have perf_event_attr::pinned, which is normally only available to
> root, in which case we'll end up marking any contending event to an error
> state.

Yes,  that's one of the limitations with the existing host event emulation approach - in that case (i.e. both require counter 0 to work) the existing approach fails to have the guest perf function, even the pinned event has ".exclude_guest" set.

> 
> Neither are ideal for MSR level emulation.
> 
> That can only work if the host counter has perf_event_attr::exclude_guest=1,
> any counter without that must also count when the guest is running.
> 
> (and, IIRC, normal perf tool events do not have that set by default)

Probably no. Please see Line 81 at
https://github.com/torvalds/linux/blob/master/tools/perf/util/util.c
perf_guest by default is false, which makes "attr->exclude_guest = 1"


> The thing is; you cannot do blind pass-through of the PMU, some of its
> features simply do not work in a guest. Also, the host perf driver expects
> certain functionality that must be respected.

Actually we are not blindly assigning the perf counters. Guest works with its own complete perf stack (like the one on the host) which also has its own constraints. The perf counter that the guest is requesting from the hypervisor is the one that comes out from its event constraints (i.e. the one that will work for that feature on the guest). 

The counter is also not passed through to the guest, guest accesses to the assigned counter will still exit to the hypervisor, and the hypervisor helps update the counter. 

Please correct me if I misunderstood your point.

> 
> Those are the constraints you have to work with.
> 
> Back when we all started down this virt rathole, I proposed people do
> paravirt perf, where events would be handed to the host kernel and let the
> host kernel do its normal thing. But people wanted to do the MSR based
> thing because of !linux guests.

IMHO, it is worthwhile to care more about the real use case. When a user gets a virtual machine from a vendor, all he can do is to run perf inside the guest. The above contention concerns would not happen, because the user wouldn't be able to come to the host to run perf on the virtualization software (e.g. ./perf qemu..) and in the meantime running perf in the guest to cause the contention.

On the other hand, when we improve the user experience of running perf inside the guest by reducing the virtualization overhead, that would bring real benefits to the real use case.

Best,
Wei


  reply	other threads:[~2018-11-05 15:37 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-01 10:04 [PATCH v1 0/8] Intel Virtual PMU Optimization Wei Wang
2018-11-01 10:04 ` [PATCH v1 1/8] perf/x86: add support to mask counters from host Wei Wang
2018-11-01 14:52   ` Peter Zijlstra
2018-11-02  9:08     ` Wei Wang
2018-11-05  9:34       ` Peter Zijlstra
2018-11-05 11:19         ` Wei Wang
2018-11-05 12:14           ` Peter Zijlstra
2018-11-05 15:37             ` Wang, Wei W [this message]
2018-11-05 16:56               ` Peter Zijlstra
2018-11-05 18:20               ` Andi Kleen
2018-11-01 10:04 ` [PATCH v1 2/8] perf/x86/intel: add pmi callback support Wei Wang
2018-11-01 10:04 ` [PATCH v1 3/8] KVM/x86/vPMU: optimize intel vPMU Wei Wang
2018-11-01 10:04 ` [PATCH v1 4/8] KVM/x86/vPMU: support msr switch on vmx transitions Wei Wang
2018-11-01 10:04 ` [PATCH v1 5/8] KVM/x86/vPMU: intel_pmu_read_pmc Wei Wang
2018-11-01 10:04 ` [PATCH v1 6/8] KVM/x86/vPMU: remove some unused functions Wei Wang
2018-11-01 10:04 ` [PATCH v1 7/8] KVM/x86/vPMU: save/restore guest perf counters on vCPU switching Wei Wang
2018-11-01 10:04 ` [PATCH v1 8/8] KVM/x86/vPMU: return the counters to host if guest is torn down Wei Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=286AC319A985734F985F78AFA26841F73DE3AC8B@shsmsx102.ccr.corp.intel.com \
    --to=wei.w.wang@intel.com \
    --cc=ak@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).