From: Peter Zijlstra <>
To: Like Xu <>
Cc: Paolo Bonzini <>,
	Sean Christopherson <>,
	Vitaly Kuznetsov <>,
	Wanpeng Li <>,
	Jim Mattson <>, Joerg Roedel <>,
	Thomas Gleixner <>
Subject: Re: [PATCH] KVM: x86/pmu: Introduce pmc->is_paused to reduce the call time of perf interfaces
Date: Thu, 29 Jul 2021 14:58:23 +0200	[thread overview]
Message-ID: <YQKl7/> (raw)
In-Reply-To: <>

On Wed, Jul 28, 2021 at 08:07:05PM +0800, Like Xu wrote:
> From: Like Xu <>
> Based on our observations, after any vm-exit associated with the vPMU, at
> least two perf interfaces have to be called for guest counter emulation,
> such as perf_event_{pause, read_value, period}(), and each one will
> {lock, unlock} the same perf_event_ctx. The call frequency becomes even
> higher when the guest uses counters in a multiplexed manner.
> Holding the lock once and completing the KVM request operations inside the
> perf context would require a set of impractical new interfaces. Instead, we
> can optimize the vPMU implementation by avoiding repeated calls to
> these interfaces in the KVM context for at least one pattern:
> After perf_event_pause() is called once, the event is disabled and its
> internal count is reset to 0, so there is no need to pause it again or
> read its value. Once the event is paused, its period will not be updated
> until the next time it is resumed or reprogrammed. There is therefore no
> need to call perf_event_period() twice for a non-running counter, given
> that the perf_event for a running counter is never paused.
> Based on this implementation, for the following common usage of
> sampling 4 events using perf on a 4-vCPU/8GB (4u8g) guest:
>   echo 0 > /proc/sys/kernel/watchdog
>   echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
>   echo 10000 > /proc/sys/kernel/perf_event_max_sample_rate
>   echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
>   for i in `seq 1 1 10`
>   do
>   taskset -c 0 perf record \
>   -e cpu-cycles -e instructions -e branch-instructions -e cache-misses \
>   /root/br_instr a
>   done
> the average latency of the guest NMI handler is reduced from
> 37646.7 ns to 32929.3 ns (~1.14x speed up) on the Intel ICX server.
> In addition to collecting more samples, no loss of sampling accuracy
> was observed compared to the pre-optimization baseline.
> Signed-off-by: Like Xu <>

Looks sane I suppose.

Acked-by: Peter Zijlstra (Intel) <>

What kinds of VM-exits are the most common?


Thread overview: 4+ messages
2021-07-28 12:07 Like Xu
2021-07-29 12:58 ` Peter Zijlstra [this message]
2021-07-29 13:46   ` Like Xu
2021-08-02 15:46 ` Paolo Bonzini
