kvm.vger.kernel.org archive mirror
From: Anthony Harivel <aharivel@redhat.com>
To: kvm@vger.kernel.org
Cc: rjarry@redhat.com, Anthony Harivel <aharivel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Christophe Fontaine <cfontain@redhat.com>
Subject: [RFC] KVM: x86: Give host userspace control for MSR_RAPL_POWER_UNIT and MSR_PKG_POWER_STATUS
Date: Wed, 18 Jan 2023 15:21:23 +0100
Message-ID: <20230118142123.461247-1-aharivel@redhat.com>

Allow userspace to update the MSR_RAPL_POWER_UNIT and
MSR_PKG_ENERGY_STATUS powercap registers. By default, these MSRs still
return 0.

This enables VMMs running on top of KVM that have access to host energy
metrics such as /sys/devices/virtual/powercap/*/*/energy_uj to compute
per-VM power values in proportion to other metrics (e.g. CPU %guest,
steal time, etc.) and to periodically update the MSRs with the
KVM_SET_MSRS ioctl so that the guest OS can consume them using power
metering tools.
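
As a rough illustration of the proportional accounting described above
(not part of this patch), a VMM could split the host energy delta of one
sampling interval by CPU time. The helper name and the choice of CPU
time as the only weighting metric are assumptions made for this sketch;
a real VMM may weight several metrics.

```c
#include <stdint.h>

/*
 * Hypothetical helper: attribute a share of the host package energy
 * consumed during one sampling interval to a VM, in proportion to the
 * CPU time its vCPUs ran during that interval.
 *
 * Assumes per-interval deltas, so the intermediate products fit in
 * 64 bits (guest_cpu_ns below roughly 1.8e13, i.e. a few hours).
 */
static inline uint64_t vm_energy_share_uj(uint64_t host_delta_uj,
					   uint64_t guest_cpu_ns,
					   uint64_t total_cpu_ns)
{
	uint64_t share_ppm;

	if (total_cpu_ns == 0)
		return 0;

	/* A VM cannot be charged more than the whole interval. */
	if (guest_cpu_ns > total_cpu_ns)
		guest_cpu_ns = total_cpu_ns;

	/* Share of the interval in parts per million. */
	share_ppm = guest_cpu_ns * 1000000ULL / total_cpu_ns;

	return host_delta_uj * share_ppm / 1000000ULL;
}
```

For example, a VM that ran for a quarter of the busy CPU time in an
interval where the host consumed 1 J would be charged 250000 uJ.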

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christophe Fontaine <cfontain@redhat.com>
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
---

Notes:
    The main goal of this patch is to take a first step toward giving VMs
    energy awareness.
    
    As of today, KVM always reports 0 in these MSRs since the entire host
    power consumption needs to be hidden from the guests. However, there is
    no fallback mechanism for VMs to measure their power usage.
    
    The idea is to let VMMs running on top of KVM periodically update
    those MSRs with values representative of the VM's power consumption.
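
A minimal sketch of what the VMM side of such an update could look like,
assuming an x86 host with the <linux/kvm.h> UAPI header and an already
created vCPU file descriptor. The function name vmm_set_rapl_msrs() is
hypothetical, and the MSR index defines are repeated locally because
userspace does not get them from the kernel's msr-index.h.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Architectural MSR indices, duplicated here for the sketch. */
#define MSR_RAPL_POWER_UNIT	0x606
#define MSR_PKG_ENERGY_STATUS	0x611

/*
 * Hypothetical VMM helper: push both powercap MSR values to one vCPU.
 * Returns 0 if KVM accepted both entries, -1 otherwise.
 */
static int vmm_set_rapl_msrs(int vcpu_fd, uint64_t power_unit,
			     uint64_t pkg_energy)
{
	struct kvm_msrs *msrs;
	int ret;

	/* struct kvm_msrs ends in a variable-length array of entries. */
	msrs = calloc(1, sizeof(*msrs) + 2 * sizeof(msrs->entries[0]));
	if (!msrs)
		return -1;

	msrs->nmsrs = 2;
	msrs->entries[0].index = MSR_RAPL_POWER_UNIT;
	msrs->entries[0].data = power_unit;
	msrs->entries[1].index = MSR_PKG_ENERGY_STATUS;
	msrs->entries[1].data = pkg_energy;

	/* KVM_SET_MSRS returns the number of MSRs successfully set. */
	ret = ioctl(vcpu_fd, KVM_SET_MSRS, msrs);
	free(msrs);

	return ret == 2 ? 0 : -1;
}
```

Since the values are stored per vCPU in this patch, a VMM would
presumably repeat the call for every vCPU of the VM.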
    
    If this solution is accepted, VMMs like QEMU will need to be patched to
    set proper values in these registers and enable power metering in
    guests.
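
To give an idea of what "proper values" means here: per the Intel SDM,
bits 12:8 of MSR_RAPL_POWER_UNIT hold the energy status unit (ESU), one
count of MSR_PKG_ENERGY_STATUS represents 1/2^ESU joules, and the
counter is 32 bits wide and wraps. The sketch below converts an
accumulated per-VM energy in microjoules into such a counter value; the
helper names are hypothetical and not taken from any existing VMM.

```c
#include <stdint.h>

/* Bits 12:8 of MSR_RAPL_POWER_UNIT: energy status unit (ESU). */
#define RAPL_ESU_SHIFT	8
#define RAPL_ESU_MASK	0x1fULL

/* Extract the ESU field from a MSR_RAPL_POWER_UNIT value. */
static inline unsigned int rapl_energy_status_unit(uint64_t power_unit_msr)
{
	return (power_unit_msr >> RAPL_ESU_SHIFT) & RAPL_ESU_MASK;
}

/*
 * Hypothetical helper: convert accumulated energy in microjoules into a
 * MSR_PKG_ENERGY_STATUS counter value, where one count is 1/2^esu J.
 * Whole joules and the sub-joule remainder are scaled separately so the
 * remainder term cannot overflow; the result is truncated to 32 bits to
 * mimic the wrap-around of the hardware counter.
 */
static inline uint64_t uj_to_rapl_counter(uint64_t energy_uj,
					  unsigned int esu)
{
	uint64_t whole = (energy_uj / 1000000ULL) << esu;
	uint64_t frac = ((energy_uj % 1000000ULL) << esu) / 1000000ULL;

	return (whole + frac) & 0xffffffffULL;
}
```

With ESU = 14, 1 J of accumulated energy corresponds to a counter value
of 16384.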
    
    I am submitting this as an RFC to get input/feedback from a broader
    audience who may be aware of potential side effects of such a mechanism.
    
    Regards,
    Anthony
    
    "If you can’t measure it, you can’t improve it." – Lord Kelvin

 arch/x86/include/asm/kvm_host.h |  4 ++++
 arch/x86/kvm/x86.c              | 18 ++++++++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6aaae18f1854..c6072915f229 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1006,6 +1006,10 @@ struct kvm_vcpu_arch {
 	 */
 	bool pdptrs_from_userspace;
 
+	/* Powercap related MSRs */
+	u64 msr_rapl_power_unit;
+	u64 msr_pkg_energy_status;
+
 #if IS_ENABLED(CONFIG_HYPERV)
 	hpa_t hv_root_tdp;
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index da4bbd043a7b..adc89144f84f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1528,6 +1528,10 @@ static const u32 emulated_msrs_all[] = {
 
 	MSR_K7_HWCR,
 	MSR_KVM_POLL_CONTROL,
+
+	/* The following MSRs can be updated by userspace */
+	MSR_RAPL_POWER_UNIT,
+	MSR_PKG_ENERGY_STATUS,
 };
 
 static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
@@ -3888,6 +3892,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * as to-be-saved, even if an MSRs isn't fully supported.
 		 */
 		return !msr_info->host_initiated || data;
+	case MSR_RAPL_POWER_UNIT:
+		vcpu->arch.msr_rapl_power_unit = data;
+		break;
+	case MSR_PKG_ENERGY_STATUS:
+		vcpu->arch.msr_pkg_energy_status = data;
+		break;
 	default:
 		if (kvm_pmu_is_valid_msr(vcpu, msr))
 			return kvm_pmu_set_msr(vcpu, msr_info);
@@ -3973,13 +3983,17 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	 * data here. Do not conditionalize this on CPUID, as KVM does not do
 	 * so for existing CPU-specific MSRs.
 	 */
-	case MSR_RAPL_POWER_UNIT:
 	case MSR_PP0_ENERGY_STATUS:	/* Power plane 0 (core) */
 	case MSR_PP1_ENERGY_STATUS:	/* Power plane 1 (graphics uncore) */
-	case MSR_PKG_ENERGY_STATUS:	/* Total package */
 	case MSR_DRAM_ENERGY_STATUS:	/* DRAM controller */
 		msr_info->data = 0;
 		break;
+	case MSR_RAPL_POWER_UNIT:
+		msr_info->data = vcpu->arch.msr_rapl_power_unit;
+		break;
+	case MSR_PKG_ENERGY_STATUS:	/* Total package */
+		msr_info->data = vcpu->arch.msr_pkg_energy_status;
+		break;
 	case MSR_IA32_PEBS_ENABLE:
 	case MSR_IA32_DS_AREA:
 	case MSR_PEBS_DATA_CFG:
-- 
2.39.0


Thread overview: 7+ messages
2023-01-18 14:21 Anthony Harivel [this message]
2023-01-19  5:57 ` [RFC] KVM: x86: Give host userspace control for MSR_RAPL_POWER_UNIT and MSR_PKG_POWER_STATUS Xiaoyao Li
2023-01-20  8:59   ` Anthony Harivel
2023-01-19 23:27 ` Paolo Bonzini
2023-01-20 16:47   ` Anthony Harivel
2023-01-20 16:57     ` Paolo Bonzini
2023-01-23 12:43       ` Anthony Harivel
