linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: "Wang, Wei W" <wei.w.wang@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rkrcmar@redhat.com" <rkrcmar@redhat.com>,
	"Xu, Like" <like.xu@intel.com>
Subject: Re: [PATCH v1 1/8] perf/x86: add support to mask counters from host
Date: Mon, 5 Nov 2018 17:56:57 +0100
Message-ID: <20181105165657.GD22431@hirez.programming.kicks-ass.net>
In-Reply-To: <286AC319A985734F985F78AFA26841F73DE3AC8B@shsmsx102.ccr.corp.intel.com>


Please; don't send malformed emails like this. Lines wrap at 78 chars.

On Mon, Nov 05, 2018 at 03:37:24PM +0000, Wang, Wei W wrote:
> On Monday, November 5, 2018 8:14 PM, Peter Zijlstra wrote:


> > That can only work if the host counter has perf_event_attr::exclude_guest=1,
> > any counter without that must also count when the guest is running.
> > 
> > (and, IIRC, normal perf tool events do not have that set by default)
> 
> Probably no. Please see Line 81 at
> https://github.com/torvalds/linux/blob/master/tools/perf/util/util.c
> perf_guest by default is false, which makes "attr->exclude_guest = 1"

Then you're in luck. But if the host creates an event that has
exclude_guest=0 set, it should still work.
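
For illustration, a minimal userspace sketch of the attribute being
discussed. It assumes the uapi header <linux/perf_event.h> is available;
the helper name make_cycles_attr() and its count_in_guest parameter are
hypothetical and appear nowhere in the patch series. It only fills in a
perf_event_attr, so the two cases (exclude_guest=1 and exclude_guest=0)
can be compared without opening an event.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: describe a hardware
 * cycles event and choose whether it keeps counting in guest mode.
 */
static struct perf_event_attr make_cycles_attr(int count_in_guest)
{
	struct perf_event_attr attr;

	assert(count_in_guest == 0 || count_in_guest == 1);

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;

	/*
	 * exclude_guest=1: the event does not count while a guest runs.
	 * exclude_guest=0: the event must also count while a guest runs,
	 * which is the case the host has to keep working.
	 */
	attr.exclude_guest = count_in_guest ? 0 : 1;

	return attr;
}
```

An attr built with make_cycles_attr(1) is the kind of host event that
cannot simply lose its counter to the guest; passing either attr to
perf_event_open(2) is left out here because it depends on the system's
perf permissions.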

> > The thing is; you cannot do blind pass-through of the PMU, some of its
> > features simply do not work in a guest. Also, the host perf driver expects
> > certain functionality that must be respected.
> 
> Actually we are not blindly assigning the perf counters. Guest works
> with its own complete perf stack (like the one on the host) which also
> has its own constraints. 

But it knows nothing of the host state.

> The counter is also not passed through to the guest, guest accesses to
> the assigned counter will still exit to the hypervisor, and the
> hypervisor helps update the counter. 

Yes, you have to; because the PMU doesn't properly virtualize, also
because the HV -- linux in our case -- already claimed the PMU.

So the network passthrough case you mentioned simply doesn't apply at
all. Don't bother looking at it for inspiration.

> > Those are the constraints you have to work with.
> > 
> > Back when we all started down this virt rathole, I proposed people do
> > paravirt perf, where events would be handed to the host kernel and let the
> > host kernel do its normal thing. But people wanted to do the MSR based
> > thing because of !linux guests.
> 
> IMHO, it is worthwhile to care more about the real use case. When a
> user gets a virtual machine from a vendor, all he can do is to run
> perf inside the guest. The above contention concerns would not happen,
> because the user wouldn't be able to come to the host to run perf on
> the virtualization software (e.g. ./perf qemu..) and in the meantime
> running perf in the guest to cause the contention.

That's your job. Mine is to make sure that whatever you propose fits in
the existing model and doesn't make a giant mess of things.

And for Linux guests on Linux hosts, paravirt perf still makes the most
sense to me; then you get the host scheduling all the events and
providing the guest with the proper counts/runtimes/state.

> On the other hand, when we improve the user experience of running perf
> inside the guest by reducing the virtualization overhead, that would
> bring real benefits to the real use case.

You can start to improve things by doing a less stupid implementation of
the existing code.
