Re: [PATCH v1 1/8] perf/x86: add support to mask counters from host

From: Peter Zijlstra <peterz@infradead.org>
To: "Wang, Wei W" <wei.w.wang@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"rkrcmar@redhat.com" <rkrcmar@redhat.com>,
	"Xu, Like" <like.xu@intel.com>
Subject: Re: [PATCH v1 1/8] perf/x86: add support to mask counters from host
Date: Mon, 5 Nov 2018 17:56:57 +0100
Message-ID: <20181105165657.GD22431@hirez.programming.kicks-ass.net>
In-Reply-To: <286AC319A985734F985F78AFA26841F73DE3AC8B@shsmsx102.ccr.corp.intel.com>

Please; don't send malformed emails like this. Lines wrap at 78 chars.

On Mon, Nov 05, 2018 at 03:37:24PM +0000, Wang, Wei W wrote:
> On Monday, November 5, 2018 8:14 PM, Peter Zijlstra wrote:

> > That can only work if the host counter has perf_event_attr::exclude_guest=1,
> > any counter without that must also count when the guest is running.
> > 
> > (and, IIRC, normal perf tool events do not have that set by default)
> 
> Probably no. Please see Line 81 at
> https://github.com/torvalds/linux/blob/master/tools/perf/util/util.c
> perf_guest by default is false, which makes "attr->exclude_guest = 1"

Then you're in luck. But if the host creates an event that has
exclude_guest=0 set, it should still work.
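To make the exclude_guest=0 case concrete, here is a minimal sketch of how
a host event could be described. It assumes the uapi header
<linux/perf_event.h> is available; the helper names init_cycles_attr() and
must_count_in_guest() are hypothetical and exist only for illustration.
The sketch only fills in the attribute and does not open the event.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: describe a host CPU-cycles event.
 * exclude_guest selects whether the event stops counting while a guest
 * runs (1) or keeps counting across guest execution (0).
 */
void init_cycles_attr(struct perf_event_attr *attr, int exclude_guest)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_HARDWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->exclude_guest = exclude_guest ? 1 : 0;
}

/*
 * Hypothetical helper: a host event without exclude_guest set must keep
 * counting while the guest is running, so its counter cannot simply be
 * taken away from the host for the duration of guest execution.
 */
int must_count_in_guest(const struct perf_event_attr *attr)
{
	assert(attr != NULL);
	return !attr->exclude_guest;
}
```

An attribute prepared with init_cycles_attr(&attr, 0) would then be passed
to perf_event_open(2) as usual; that is the kind of event any counter
masking scheme still has to keep working.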

> > The thing is; you cannot do blind pass-through of the PMU, some of its
> > features simply do not work in a guest. Also, the host perf driver expects
> > certain functionality that must be respected.
> 
> Actually we are not blindly assigning the perf counters. Guest works
> with its own complete perf stack (like the one on the host) which also
> has its own constraints. 

But it knows nothing of the host state.

> The counter is also not passed through to the guest, guest accesses to
> the assigned counter will still exit to the hypervisor, and the
> hypervisor helps update the counter. 

Yes, you have to; both because the PMU doesn't properly virtualize and
because the HV -- Linux in our case -- has already claimed the PMU.

So the network passthrough case you mentioned simply doesn't apply here.
Don't bother looking at it for inspiration.

> > Those are the constraints you have to work with.
> > 
> > Back when we all started down this virt rathole, I proposed people do
> > paravirt perf, where events would be handed to the host kernel and let the
> > host kernel do its normal thing. But people wanted to do the MSR based
> > thing because of !linux guests.
> 
> IMHO, it is worthwhile to care more about the real use case. When a
> user gets a virtual machine from a vendor, all he can do is to run
> perf inside the guest. The above contention concerns would not happen,
> because the user wouldn't be able to come to the host to run perf on
> the virtualization software (e.g. ./perf qemu..) and in the meantime
> running perf in the guest to cause the contention.

That's your job. Mine is to make sure that whatever you propose fits in
the existing model and doesn't make a giant mess of things.

And for Linux guests on Linux hosts, paravirt perf still makes the most
sense to me; then you get the host scheduling all the events and
providing the guest with the proper counts/runtimes/state.

> On the other hand, when we improve the user experience of running perf
> inside the guest by reducing the virtualization overhead, that would
> bring real benefits to the real use case.

You can start to improve things by doing a less stupid implementation of
the existing code.