From: Paolo Bonzini <pbonzini@redhat.com>
To: Alexander Graf <graf@amazon.de>,
	milanpa@amazon.com, Milan Pandurov <milanpa@amazon.de>,
	kvm@vger.kernel.org
Cc: rkrcmar@redhat.com, borntraeger@de.ibm.com
Subject: Re: [PATCH 2/2] kvm: Add ioctl for gathering debug counters
Date: Thu, 23 Jan 2020 15:19:12 +0100
Message-ID: <b69546be-a25c-bbea-7e37-c07f019dcf85@redhat.com>
In-Reply-To: <6f13c197-b242-90a5-3f53-b75aa8a0e5aa@amazon.de>

On 23/01/20 13:32, Alexander Graf wrote:
>> See above: I am not sure they are the same story because their consumers
>> might be very different from registers.  Registers are generally
>> consumed by programs (to migrate VMs, for example) and only occasionally
>> by humans, while stats are meant to be consumed by humans.  We may
>> disagree on whether this justifies a completely different API...
> 
> I don't fully agree on the "human" part here.

I agree it's not entirely about humans, but in general it's going to be
rules and queries on monitoring tools, where 1) the monitoring tools'
output is generally not KVM-specific, 2) the rules and queries will be
written by humans.

So if the kernel produces insn_emulation_fail, the plugin for the
monitoring tool will just log kvm.insn_emulation_fail.  If the kernel
produces 0x10042, the plugin will have to convert it and then log it.
This is why I'm not sure that providing strings is actually less work
for userspace.
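
To make the two cases concrete, here is a minimal sketch of what a
monitoring plugin might do in each one. Everything in it is hypothetical:
the function names, the input format (a dict of counters), and the
ID-to-name table are assumptions for illustration, not an existing KVM
interface. Only the name insn_emulation_fail and the example ID 0x10042
come from the text above.

```python
# Hypothetical sketch; not a real KVM API or monitoring-tool plugin.

def metrics_from_names(stats):
    """Kernel reports counters by name: the plugin only adds a prefix."""
    return {f"kvm.{name}": value for name, value in stats.items()}


# Assumed mapping, needed only when the kernel reports numeric IDs.
# The plugin has to carry this table and keep it in sync with the kernel.
STAT_NAMES = {
    0x10042: "insn_emulation_fail",  # example ID from the discussion
}


def metrics_from_ids(stats):
    """Kernel reports counters by numeric ID: convert first, then log."""
    out = {}
    for stat_id, value in stats.items():
        # IDs missing from the table get a placeholder name.
        name = STAT_NAMES.get(stat_id, f"unknown_{stat_id:#x}")
        out[f"kvm.{name}"] = value
    return out
```

With names, the plugin is a one-line transformation; with IDs, it needs
the extra table and a policy for IDs it does not know about.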

Paolo

> At the end of the day, you
> want stats because you want to act on stats. Ideally, you want to do
> that fully automatically. Let me give you a few examples:
> 
> 1) insn_emulation_fail triggers
> 
> You may want to feed all the failures into a database to check whether
> there is something wrong in the emulator.
> 
> 2) (remote_)tlb_flush beyond certain threshold
> 
> If you see that you're constantly flushing remote TLBs, there's a good
> chance that you found a workload that may need tuning in KVM. You want
> to gather those stats across your full fleet of hosts, so that for the
> few occasions when you hit it, you can work with the actual VM owners to
> potentially improve their performance.
> 
> 3) exits beyond certain threshold
> 
> You know roughly how many exits your fleet would usually see, so you can
> configure an upper threshold on that. When you now have an automated way
> to notify you when the threshold is exceeded, you can check what that
> particular guest did to raise so many exits.
> 
> 
> ... and I'm sure there's room for a lot more potential stats that could
> be useful to gather to determine the health of a KVM environment, such
> as a "vcpu steal time" one or a "maximum time between two VMENTERS while
> the guest was in running state".
> 
> All of these should eventually feed into something bigger that collects
> the numbers across your full VM fleet, so that a human can take actions
> based on them. However, that means the values are no longer directly
> impacting a human, they need to feed into machines. And for that, exact,
> constant identifiers make much more sense.



