linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Murray <andrew.murray@arm.com>
To: Christoffer Dall <christoffer.dall@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
	Julien Thierry <julien.thierry@arm.com>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	joro@8bytes.org, Suzuki K Poulose <suzuki.poulose@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes
Date: Tue, 18 Dec 2018 16:27:05 +0000	[thread overview]
Message-ID: <20181218162705.GG47018@e119886-lin.cambridge.arm.com> (raw)
In-Reply-To: <20181218143833.GD25383@e113682-lin.lund.arm.com>

On Tue, Dec 18, 2018 at 03:38:33PM +0100, Christoffer Dall wrote:
> On Tue, Dec 18, 2018 at 01:25:32PM +0000, Andrew Murray wrote:
> > On Tue, Dec 18, 2018 at 01:02:26PM +0100, Christoffer Dall wrote:
> > > On Wed, Dec 12, 2018 at 10:29:32AM +0000, Andrew Murray wrote:
> > > > Add support for the :G and :H attributes in perf by handling the
> > > > exclude_host/exclude_guest event attributes.
> > > > 
> > > > We notify KVM of counters that we wish to be enabled or disabled on
> > > > guest entry/exit and thus defer from starting or stopping :G events
> > > > as per the events exclude_host attribute.
> > > > 
> > > > With both VHE and non-VHE we switch the counters between host/guest
> > > > at EL2. We are able to eliminate counters counting host events on
> > > > the boundaries of guest entry/exit when using :G by filtering out
> > > > EL2 for exclude_host. However when using :H unless exclude_hv is set
> > > > on non-VHE then there is a small blackout window at the guest
> > > > entry/exit where host events are not captured.
> > > > 
> > > > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > > > ---
> > > >  arch/arm64/kernel/perf_event.c | 51 ++++++++++++++++++++++++++++++++++++------
> > > >  1 file changed, 44 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > > index de564ae..4a3c73d 100644
> > > > --- a/arch/arm64/kernel/perf_event.c
> > > > +++ b/arch/arm64/kernel/perf_event.c
> > > > @@ -26,6 +26,7 @@
> > > >  
> > > >  #include <linux/acpi.h>
> > > >  #include <linux/clocksource.h>
> > > > +#include <linux/kvm_host.h>
> > > >  #include <linux/of.h>
> > > >  #include <linux/perf/arm_pmu.h>
> > > >  #include <linux/platform_device.h>
> > > > @@ -647,11 +648,26 @@ static inline int armv8pmu_enable_counter(int idx)
> > > >  
> > > >  static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> > > >  {
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = event->hw.idx;
> > > > +	int flags = 0;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > > -	armv8pmu_enable_counter(idx);
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_enable_counter(idx - 1);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	if (!attr->exclude_host)
> > > > +		flags |= KVM_PMU_EVENTS_HOST;
> > > > +	if (!attr->exclude_guest)
> > > > +		flags |= KVM_PMU_EVENTS_GUEST;
> > > > +
> > > > +	kvm_set_pmu_events(counter_bits, flags);
> > > > +
> > > > +	if (!attr->exclude_host) {
> > > > +		armv8pmu_enable_counter(idx);
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_enable_counter(idx - 1);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_disable_counter(int idx)
> > > > @@ -664,11 +680,20 @@ static inline int armv8pmu_disable_counter(int idx)
> > > >  static inline void armv8pmu_disable_event_counter(struct perf_event *event)
> > > >  {
> > > >  	struct hw_perf_event *hwc = &event->hw;
> > > > +	struct perf_event_attr *attr = &event->attr;
> > > >  	int idx = hwc->idx;
> > > > +	u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
> > > >  
> > > >  	if (armv8pmu_event_is_chained(event))
> > > > -		armv8pmu_disable_counter(idx - 1);
> > > > -	armv8pmu_disable_counter(idx);
> > > > +		counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
> > > > +
> > > > +	kvm_clr_pmu_events(counter_bits);
> > > > +
> > > > +	if (!attr->exclude_host) {
> > > > +		if (armv8pmu_event_is_chained(event))
> > > > +			armv8pmu_disable_counter(idx - 1);
> > > > +		armv8pmu_disable_counter(idx);
> > > > +	}
> > > >  }
> > > >  
> > > >  static inline int armv8pmu_enable_intens(int idx)
> > > > @@ -943,16 +968,25 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
> > > >  	 * Therefore we ignore exclude_hv in this configuration, since
> > > >  	 * there's no hypervisor to sample anyway. This is consistent
> > > >  	 * with other architectures (x86 and Power).
> > > > +	 *
> > > > +	 * To eliminate counting host events on the boundaries of
> > > > +	 * guest entry/exit we ensure EL2 is not included in hyp mode
> > > > +	 * with !exclude_host.
> > > >  	 */
> > > >  	if (is_kernel_in_hyp_mode()) {
> > > > -		if (!attr->exclude_kernel)
> > > > +		if (!attr->exclude_kernel && !attr->exclude_host)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > >  	} else {
> > > > -		if (attr->exclude_kernel)
> > > > -			config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > > >  		if (!attr->exclude_hv)
> > > >  			config_base |= ARMV8_PMU_INCLUDE_EL2;
> > > 
> > > I'm not sure about the current use of exclude_hv here.  The comment says
> > > it's consistent with other architectures, but I can't find an example to
> > > confirm this, and I don't think we have a comparable thing to the split
> > > of the hypervisor between EL1 and EL2 we have on non-VHE.
> > > 
> > > Joerg told me the semantics were designed to be:
> > > 
> > > 	exclude_hv: When running as a guest, stop counting events when
> > > 		    the HV runs.
> > 
> > Can the definition of "guest" here refer to both type 1 and type 2
> > hypervisor guests? Or do we assume type 1 only?
> > 
> 
> A guest is a guest.  Linux can run as a guest under a hypervisor with a
> type 1 or type 2 design, doesn't matter for this conversation.
> 
> > > 
> > > 	exclude_host: When Linux runs as a HV itself, only count events
> > > 	              while a guest is running.
> > > 
> > > 	exclude_guest: When Linux runs as a HV, only count events when
> > > 	               running in host mode.
> > > 
> > > (But tools/perf/design.txt does not really confirm this).
> > > 
> > > On arm64 that would mean:
> > > 
> > > 	exclude_hv: As a host, no effect.
> > > 		    As a guest, set the counter to include EL2 for a
> > > 		    hypervisor to emulate.
> 
> [...]
> 
> > 
> > Though more correctly we should count EL2 *and EL1* events whilst pinned
> > to the KVM task and whilst running outside of the guest.  This then
> > covers both !VHE and VHE and allows for fair comparasion between !VHE
> > and VHE systems.
> 
> Yes, if the guest has cleared exclude_hv and if we can properly detect
> that from the hypervisor.
> 
> > 
> > This then gives us the unique benefit of the type 2 host being able to
> > examine the hypervisor overhead of its individual guests.
> 
> Not sure I understand this part.

I got a bit confused here so ignore this. (I thought it would be useful to
measure from the KVM host perspective, the host-only time of the KVM guest,
which I incorrectly thought could also be the exclude_hv flag. Though I
forgot that 'perf stat -e instructions:H kvmtask' from the host should be
equivalent to 'perf stat -e instructions:h' from the guest.)

> 
> > 
> > The only issue here is that the type 2 host wouldn't be able to examine
> > the HV overhead of all its guests across the system as you wouldn't be
> > able to rely on the perf task pinning to distinguish between EL1 from
> > host and EL1 from guests in a !VHE system. I'm not sure the best way
> > to overcome this limitation.
> 
> Why can't you disable EL1 counting whilst running in the host, and
> enable EL1 counting whilst running in the guest?

You can. (I was assuming you could use exclude_hv from KVM host to measure
all the KVM guest overheads - but you would need to know which tasks are
KVM tasks such that you can consider their EL1 time - again ignore this).

> 
> > > 
> > > 	exclude_host: As a guest, has no effect.
> > > 		      Don't count EL1 host or EL2, but count EL1 guest
> > > 		      by enabling EL1 counting at EL2 when entering a
> > > 		      guest, and disabling EL1 counting when returning
> > > 		      from a guest.
> > > 
> > > 	exclude_guest: As a guest, has no effect.  As a host, disable
> > > 		       EL1 counting at EL2 when entering a guest.
> > > 
> > > Not sure if we break anything by changing the behavior on arm64 now, but
> > > I really doubt that being able to exclude an arbitrary part (the one tha
> > > happens to run in EL2 on non-VHE systems) is meaningful, and the fact
> > > that behavior and semantics change depending on the version of the
> > > underlying CPU is not great, if what you care about is understanding the
> > > system's performance.
> > 
> > This is a bit strange. It's arbitary as it only represents a bit of the
> > HV overhead - this is solved though by counting the whole overhead (EL1
> > and EL2 instead (but only counting outside the guest and pinned to the
> > guest tasks).
> 
> Not sure I understand your point here?

I was considering how we support exclude_hv from a KVM guest. If we count
only EL2 for HV then we are not measuring the true overhead, we also need
to include the EL1 time from the KVM host process when on !VHE.


> 
> > 
> > > 
> > > Thoughts?
> > > 
> > 
> > Though if I've understood you correctly, you're suggesting that the only
> > time we count EL2 is when exclude_hv is not set on the immediate guest
> > of a type 1 hypervisor?
> > 
> No, I didn't say anything about a type 1 or type 2 hypervisor, and I
> think that distinction is completely irrelevant to the discussion at
> hand.  I also don't know what an immediate guest is -- is there any
> other kind?

An immediate guest = the host's guest but not a guest-guest or further
(Apologies if I'm making this up as I go along).

> 
> I don't think exclude_hv, exclude_host, and exclude_guest are directly
> tied to a single CPU mode.  The only 'modes' you need to consider for
> Linux are 'guest' and 'host' when Linux can run VMs and, 'self' and
> 'hypervisor' when Linux is a guest.
> 
> When Linux can run VMs, you count EL2 events when exclude_host is not
> set.
> 
> (When Linux is a guest, and you set/clear exclude_hv, for this to work,
> you need some way of informing your hypervisor that you want to know
> about events happening in the hypervisor.  This could be a PV interface,
> or maybe this can work by the guest setting/clearing the NSH bit in its
> virtual PMU registers, which then amusingly can get translated into
> actually counting in EL1/EL2 (non-VHE) or EL2 (VHE) by KVM's PMU
> emulation code.  The method used is specific to the hypervisor used, but
> not specific to whether the hypervisor is type-1 or type-2.)

This is what I had in mind. I think we're aligned despite nomenclature
confusing me.

I think we need to:

 - Add support in KVM guests for exclude_hv.

 - Prevent !exclude_hv from returning a count when is is not a guest
   on a !VHE kernel.

I'm not really clear how we implement the second point. We want to count
EL2 (because that would represent a HV) but only when it is a guest. How
does it know it's a guest? (Is it a guest when it's not a KVM host?).
I'm struggling to see how we update the logic in armv8pmu_set_event_filter
(whilst keeping in mind that we want it to write ARMV8_PMU_INCLUDE_EL2
such that when it's a guest the KVM PMU emulation code can use that to
know to do something different).

Thanks,

Andrew Murray

> 
> 
> Thanks,
> 
>     Christoffer

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2018-12-18 16:27 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 10:29 [PATCH v8 0/5] arm64: Support perf event modifiers :G and :H Andrew Murray
2018-12-12 10:29 ` [PATCH v8 1/5] arm64: arm_pmu: remove unnecessary isb instruction Andrew Murray
2018-12-12 10:29 ` [PATCH v8 2/5] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data Andrew Murray
2018-12-12 10:37   ` Suzuki K Poulose
2018-12-12 12:31     ` Andrew Murray
2018-12-18 11:40   ` Christoffer Dall
2018-12-12 10:29 ` [PATCH v8 3/5] arm64: KVM: add accessors to track guest/host only counters Andrew Murray
2018-12-12 10:56   ` Suzuki K Poulose
2018-12-12 10:29 ` [PATCH v8 4/5] arm64: arm_pmu: Add support for exclude_host/exclude_guest attributes Andrew Murray
2018-12-12 10:42   ` Suzuki K Poulose
2018-12-12 10:47     ` Julien Thierry
2018-12-12 10:51       ` Suzuki K Poulose
2018-12-12 10:55   ` Suzuki K Poulose
2018-12-12 14:38     ` Andrew Murray
2018-12-18 12:02   ` Christoffer Dall
2018-12-18 13:25     ` Andrew Murray
2018-12-18 14:38       ` Christoffer Dall
2018-12-18 16:27         ` Andrew Murray [this message]
2018-12-18 18:51           ` Christoffer Dall
2018-12-18 19:19             ` Andrew Jones
2019-01-04 15:32     ` Will Deacon
2019-01-08 10:18       ` Christoffer Dall
2019-01-08 11:25         ` Andrew Murray
2019-01-08 11:50           ` Marc Zyngier
2019-01-08 12:03             ` Christoffer Dall
2019-01-08 12:12               ` Marc Zyngier
2019-01-08 12:20                 ` Christoffer Dall
2019-01-08 12:14           ` Christoffer Dall
2019-01-08 12:39             ` Andrew Murray
2018-12-12 10:29 ` [PATCH v8 5/5] arm64: KVM: Enable support for :G/:H perf event modifiers Andrew Murray
2018-12-12 10:53   ` Suzuki K Poulose
2018-12-12 14:46     ` Andrew Murray

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181218162705.GG47018@e119886-lin.cambridge.arm.com \
    --to=andrew.murray@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=christoffer.dall@arm.com \
    --cc=joro@8bytes.org \
    --cc=julien.thierry@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    --cc=mark.rutland@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).