Re: [PATCH v1 3/3] KVM: arm64: Add histogram stats for handling time of arch specific exit reasons

From: Marc Zyngier <maz@kernel.org>
To: Jing Zhang <jingzhangos@google.com>
Cc: KVM <kvm@vger.kernel.org>, KVMARM <kvmarm@lists.cs.columbia.edu>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Will Deacon <will@kernel.org>,
	David Matlack <dmatlack@google.com>,
	Peter Shier <pshier@google.com>, Oliver Upton <oupton@google.com>,
	Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH v1 3/3] KVM: arm64: Add histogram stats for handling time of arch specific exit reasons
Date: Wed, 22 Sep 2021 12:22:47 +0100	[thread overview]
Message-ID: <87czp0voqg.wl-maz@kernel.org> (raw)
In-Reply-To: <20210922010851.2312845-3-jingzhangos@google.com>

Hi Jing,

On Wed, 22 Sep 2021 02:08:51 +0100,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> These logarithmic histogram stats are useful for monitoring performance
> of handling for different kinds of VCPU exit reasons.
> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 36 ++++++++++++
>  arch/arm64/kvm/arm.c              |  4 ++
>  arch/arm64/kvm/guest.c            | 43 ++++++++++++++
>  arch/arm64/kvm/handle_exit.c      | 95 +++++++++++++++++++++++++++++++
>  4 files changed, 178 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 4d65de22add3..f1a29ca3d4f3 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -417,6 +417,9 @@ struct kvm_vcpu_arch {
>  
>  	/* Arch specific exit reason */
>  	enum arm_exit_reason exit_reason;
> +
> +	/* The timestamp for the last VCPU exit */
> +	u64 last_exit_time;
>  };
>  
>  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> @@ -605,6 +608,8 @@ struct kvm_vm_stat {
>  	struct kvm_vm_stat_generic generic;
>  };
>  
> +#define ARM_EXIT_HIST_CNT	64
> +
>  struct kvm_vcpu_stat {
>  	struct kvm_vcpu_stat_generic generic;
>  	u64 mmio_exit_user;
> @@ -641,6 +646,36 @@ struct kvm_vcpu_stat {
>  		u64 exit_fp_asimd;
>  		u64 exit_pac;
>  	};
> +	/* Histogram stats for handling time of arch specific exit reasons */
> +	struct {
> +		u64 exit_unknown_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_irq_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_el1_serror_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_hyp_gone_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_il_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_wfi_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_wfe_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_cp15_32_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_cp15_64_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_cp14_32_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_cp14_ls_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_cp14_64_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_hvc32_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_smc32_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_hvc64_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_smc64_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_sys64_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_sve_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_iabt_low_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_dabt_low_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_softstp_low_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_watchpt_low_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_breakpt_low_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_bkpt32_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_brk64_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_fp_asimd_hist[ARM_EXIT_HIST_CNT];
> +		u64 exit_pac_hist[ARM_EXIT_HIST_CNT];
> +	};

I'm sorry, but I don't think this is right, and I really wish you had
reached out before putting any effort in this.

This appears to be as ~13kB per vcpu worth of data that means
*nothing*. Counting exception syndrome occurrences doesn't say
anything about how important the associated exception is. A few
examples:

- exit_hyp_gone_hist: KVM is gone. We've removed the hypervisor. What
  is the point of this? How many values do you expect?

- exit_dabt_low_hist: page faults. Crucial data, isn't it? Permission,
  Access, or Translation fault? At which level? Caused by userspace or
  kernel? Resulted in a page being allocated or not? Read from swap?
  Successful or not?

- exit_pac_hist, exit_fp_asimd_hist: actually handled early, and only
  end-up in that part of the hypervisor when there is nothing to do
  (either there is no corresponding HW or the feature is disabled).

- exit_sys64_hist: System register traps. Which sysreg? With what effect?

Blindly exposing these numbers doesn't result in anything that can be
used for any sort of useful analysis and bloats the already huge data
structures. It also affects every single user, most of which do not
care about this data. It also doesn't scale. Are you going to keep
adding these counters and histograms every time the architecture (or
KVM itself) grows some new event?

Frankly, this is a job for BPF and the tracing subsystem, not for some
hardcoded syndrome accounting. It would allow to extract meaningful
information, prevent bloat, and crucially make it optional. Even empty
trace points like the ones used in the scheduler would be infinitely
better than this (load your own module that hooks into these trace
points, expose the data you want, any way you want).

As it stands, there is no way I'd be considering merging something
that looks like this series.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.