From: "Philippe Mathieu-Daudé" <philmd@linaro.org>
To: Anthony Harivel <aharivel@redhat.com>,
qemu-devel@nongnu.org, pbonzini@redhat.com, mtosatti@redhat.com
Subject: Re: [RFC PATCH] Add support for RAPL MSRs in KVM/Qemu
Date: Fri, 19 May 2023 13:32:19 +0200
Message-ID: <d6118a9c-1e3f-4c29-520e-26562bbac600@linaro.org>
In-Reply-To: <20230517130730.85469-1-aharivel@redhat.com>
Hi Anthony,
On 17/5/23 15:07, Anthony Harivel wrote:
> Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
> interface (Running Average Power Limit) for advertising the accumulated
> energy consumption of various power domains (e.g. CPU packages, DRAM,
> etc.).
>
> The consumption is reported via MSRs (model specific registers) like
> MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
> 64 bits registers that represent the accumulated energy consumption in
> micro Joules. They are updated by microcode every ~1ms.
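On the counter format: assuming the layout documented in the Intel SDM, only the low 32 bits of MSR_PKG_ENERGY_STATUS carry the counter, and the raw value is expressed in units of 1/2^ESU Joules, where ESU is bits 12:8 of MSR_RAPL_POWER_UNIT. A minimal sketch of turning two raw samples into micro Joules under that assumption (the rapl_* helper names are hypothetical and not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the energy status unit (ESU) is bits 12:8 of
 * MSR_RAPL_POWER_UNIT; one raw count equals 1/2^ESU Joules. */
static inline unsigned int rapl_energy_unit_shift(uint64_t power_unit_msr)
{
    return (power_unit_msr >> 8) & 0x1f;
}

/* Delta between two raw samples. Only the low 32 bits are used, and
 * unsigned 32-bit subtraction absorbs a single counter wraparound. */
static inline uint32_t rapl_energy_delta(uint64_t start, uint64_t end)
{
    return (uint32_t)end - (uint32_t)start;
}

/* Convert a raw delta to micro Joules. The product fits in 64 bits
 * because raw < 2^32 and 10^6 < 2^20. */
static inline uint64_t rapl_raw_to_uj(uint32_t raw, unsigned int esu)
{
    assert(esu < 32);
    return ((uint64_t)raw * 1000000ULL) >> esu;
}
```

With ESU = 14, a raw delta of 16384 counts corresponds to one Joule.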
>
> For now, KVM always returns 0 when the guest requests the value of
> these MSRs. Use the KVM MSR filtering mechanism to allow QEMU to handle
> these MSRs dynamically in userspace.
>
> To limit the amount of system calls for every MSR call, create a new
> thread in QEMU that updates the "virtual" MSR values asynchronously.
>
> Each vCPU has its own vMSR to reflect the independence of vCPUs. The
> thread updates the vMSR values with the ratio of energy consumed of
> the whole physical CPU package the vCPU thread runs on and the
> thread's utime and stime values.
>
> All other non-vCPU threads are also taken into account. Their energy
> consumption is evenly distributed among all vCPUs threads running on
> the same physical CPU package.
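To make the attribution scheme described above concrete, here is a minimal sketch. It assumes each thread's share is e_delta * delta_ticks / maxticks, where maxticks is the number of ticks the whole package could have run during the interval. The struct and function names are hypothetical, and the patch itself may compute the ratio differently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread record used only for this illustration. */
struct sample_thread {
    bool is_vcpu;          /* true for vCPU threads */
    uint64_t delta_ticks;  /* utime+stime ticks consumed in the interval */
    uint64_t energy;       /* output: energy attributed to the thread */
};

/* Split the package energy delta among threads by their tick share, then
 * spread the energy of non-vCPU threads evenly over the vCPU threads.
 * Note: e_delta * delta_ticks is assumed not to overflow 64 bits. */
static void attribute_energy(struct sample_thread *t, size_t n,
                             uint64_t e_delta, uint64_t maxticks)
{
    uint64_t other = 0;
    size_t nb_vcpu = 0;

    assert(maxticks > 0);
    for (size_t i = 0; i < n; i++) {
        t[i].energy = e_delta * t[i].delta_ticks / maxticks;
        if (t[i].is_vcpu) {
            nb_vcpu++;
        } else {
            other += t[i].energy;
        }
    }
    if (nb_vcpu == 0) {
        return; /* nothing to distribute to */
    }
    for (size_t i = 0; i < n; i++) {
        if (t[i].is_vcpu) {
            t[i].energy += other / nb_vcpu;
        }
    }
}
```

For example, with e_delta = 1000, maxticks = 200, two vCPU threads at 50 and 30 ticks and one non-vCPU thread at 20 ticks, the vCPUs end up with 300 and 200 units.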
>
> This feature is activated with -accel kvm,rapl=true.
>
> Current limitations:
> - Works only on Intel host CPUs because AMD CPUs use different MSR
> addresses.
>
> - Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
> the moment.
>
> - Since each vCPU has an independent vMSR value, the vCPU topology must
> be changed to match that reality. There must be a single vCPU per
> virtual socket (e.g.: -smp 4,sockets=4). Accessing pkg-0 energy will
> give vCPU 0 energy, pkg-1 will give vCPU 1 energy, etc.
>
> Signed-off-by: Anthony Harivel <aharivel@redhat.com>
> ---
> diff --git a/target/i386/kvm/vmsr_energy.h b/target/i386/kvm/vmsr_energy.h
> new file mode 100644
> index 000000000000..5f79d2cbe00d
> --- /dev/null
> +++ b/target/i386/kvm/vmsr_energy.h
> @@ -0,0 +1,80 @@
> +/*
> + * QEMU KVM support -- x86 virtual energy-related MSR.
> + *
> + * Copyright 2023 Red Hat, Inc. 2023
> + *
> + * Author:
> + * Anthony Harivel <aharivel@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef VMSR_ENERGY_H
> +#define VMSR_ENERGY_H
> +
> +#include "qemu/osdep.h"
> +
> +#include <numa.h>
> +
> +/*
> + * Define the interval time in micro seconds between 2 samples of
> + * energy related MSRs
> + */
> +#define MSR_ENERGY_THREAD_SLEEP_US 1000000.0
> +
> +/*
> + * Thread statistic
> + * @ thread_id: TID (thread ID)
> + * @ is_vcpu: true if thread is vCPU thread
> + * @ cpu_id: CPU number last executed on
> + * @ vcpu_id: vCPU ID
> + * @ numa_node_id:node number of the CPU
> + * @ utime: amount of clock ticks the thread
> + * has been scheduled in User mode
> + * @ stime: amount of clock ticks the thread
> + * has been scheduled in System mode
> + * @ delta_ticks: delta of utime+stime between
> + * the two samples (before/after sleep)
> + */
> +struct thread_stat {
> + unsigned int thread_id;
> + bool is_vcpu;
> + unsigned int cpu_id;
> + unsigned int vcpu_id;
> + unsigned int numa_node_id;
> + unsigned long long *utime;
> + unsigned long long *stime;
> + unsigned long long delta_ticks;
> +};
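For reference, a minimal sketch of where utime and stime could come from and how delta_ticks follows from two samples. It assumes the values are read from a /proc/<pid>/task/<tid>/stat line, where utime and stime are fields 14 and 15 per proc(5). The function names are hypothetical, and the patch's read_thread_stat() may work differently:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract utime and stime (fields 14 and 15) from one stat line.
 * The comm field may contain spaces and parentheses, so parsing
 * starts after the last ')'. Returns 0 on success, -1 on error. */
static int parse_stat_ticks(const char *line,
                            unsigned long long *utime,
                            unsigned long long *stime)
{
    const char *p = strrchr(line, ')');
    int n;

    if (!p) {
        return -1;
    }
    /* Skip state (field 3) and the ten numeric fields 4..13. */
    n = sscanf(p + 1,
               " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %llu %llu",
               utime, stime);
    return n == 2 ? 0 : -1;
}

/* Ticks consumed between two samples: index 0 is taken before the
 * sleep, index 1 after it. */
static unsigned long long ticks_delta(const unsigned long long utime[2],
                                      const unsigned long long stime[2])
{
    unsigned long long before = utime[0] + stime[0];
    unsigned long long after = utime[1] + stime[1];

    assert(after >= before);
    return after - before;
}
```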
> +
> +/*
> + * Package statistic
> + * @ e_start: package energy counter before the sleep
> + * @ e_end: package energy counter after the sleep
> + * @ e_delta: delta of package energy counter
> + * @ e_ratio: store the energy ratio of non-vCPU thread
> + * @ nb_vcpu: number of vCPU running on this package
> + */
> +struct packge_energy_stat {
Typo: "package_energy_stat" (the typedef below needs the same fix).
> + uint64_t e_start;
> + uint64_t e_end;
> + uint64_t e_delta;
> + uint64_t e_ratio;
> + unsigned int nb_vcpu;
> +};
> +
> +typedef struct thread_stat thread_stat;
> +typedef struct packge_energy_stat package_energy_stat;
> +
> +uint64_t read_msr(uint32_t reg, unsigned int cpu_id);
> +void delta_ticks(thread_stat *thd_stat, int i);
> +unsigned int get_maxcpus(unsigned int package_num);
> +int read_thread_stat(struct thread_stat *thread, int pid, int index);
> +pid_t *get_thread_ids(pid_t pid, int *num_threads);
> +double get_ratio(package_energy_stat *pkg_stat,
> + thread_stat *thd_stat,
> + int maxticks, int i);
Would prefixing these declarations with 'vmsr_' provide
a clearer API? Otherwise, maybe this isn't the best header
to declare them.
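As a rough illustration of that suggestion, the header could look like the sketch below. Every vmsr_-prefixed name is hypothetical; the struct fields and signatures are otherwise copied from the patch, with the "packge" typo fixed:

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Per-thread statistics, fields as in the patch's struct thread_stat. */
typedef struct vmsr_thread_stat {
    unsigned int thread_id;
    bool is_vcpu;
    unsigned int cpu_id;
    unsigned int vcpu_id;
    unsigned int numa_node_id;
    unsigned long long *utime;
    unsigned long long *stime;
    unsigned long long delta_ticks;
} vmsr_thread_stat;

/* Per-package statistics, fields as in the patch's package struct. */
typedef struct vmsr_package_energy_stat {
    uint64_t e_start;
    uint64_t e_end;
    uint64_t e_delta;
    uint64_t e_ratio;
    unsigned int nb_vcpu;
} vmsr_package_energy_stat;

/* Same signatures as in the patch, only with the vmsr_ prefix added. */
uint64_t vmsr_read_msr(uint32_t reg, unsigned int cpu_id);
void vmsr_delta_ticks(vmsr_thread_stat *thd_stat, int i);
unsigned int vmsr_get_maxcpus(unsigned int package_num);
int vmsr_read_thread_stat(vmsr_thread_stat *thread, int pid, int index);
pid_t *vmsr_get_thread_ids(pid_t pid, int *num_threads);
double vmsr_get_ratio(vmsr_package_energy_stat *pkg_stat,
                      vmsr_thread_stat *thd_stat,
                      int maxticks, int i);
```

The prefix keeps generic names such as read_msr() or get_ratio() from clashing with other symbols once the header is included elsewhere.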
> +
> +#endif /* VMSR_ENERGY_H */