All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhao Liu <zhao1.liu@linux.intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	kvm@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>,
	Len Brown <len.brown@intel.com>, Zhang Rui <rui.zhang@intel.com>,
	Zhenyu Wang <zhenyu.z.wang@intel.com>,
	Zhuocheng Ding <zhuocheng.ding@intel.com>,
	Dapeng Mi <dapeng1.mi@intel.com>,
	Yanting Jiang <yanting.jiang@intel.com>,
	Yongwei Ma <yongwei.ma@intel.com>,
	Vineeth Pillai <vineeth@bitbyteword.org>,
	Suleiman Souhlal <suleiman@google.com>,
	Masami Hiramatsu <mhiramat@google.com>,
	David Dai <davidai@google.com>,
	Saravana Kannan <saravanak@google.com>,
	Zhao Liu <zhao1.liu@intel.com>
Subject: [RFC 13/26] KVM: VMX: Support virtual HFI table for VM
Date: Sat,  3 Feb 2024 17:12:01 +0800	[thread overview]
Message-ID: <20240203091214.411862-14-zhao1.liu@linux.intel.com> (raw)
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

The Hardware Feedback Interface (HFI) is a feature that allows hardware
to provide guidance to the operating system scheduler through a hardware
feedback interface structure (HFI table) in memory [1], so that the
scheduler can perform optimal workload scheduling.

ITD (Intel Thread Director) and HFI features both depend on the HFI
table, but their HFI tables are slightly different. The HFI table
provided by the ITD feature has 4 classes (in terms of more columns
in the table) and the native HFI feature supports 1 class [2].

In fact, the size of the HFI table is determined by the feature bit that
the processor supports, but the range of data updates in the table
is determined by the feature actually enabled (HFI or ITD) [3], which is
controlled by MSR_IA32_HW_FEEDBACK_CONFIG.

To benefit the scheduling in VM with HFI/ITD, we need to maintain
virtual HFI tables in KVM. The virtual HFI table is based on the real
HFI table. We extract the HFI entries corresponding to the pCPU that the
vCPU is running on, and reorganize these actual entries into a new
virtual HFI table with the vCPU's HFI index.

Also, to simplify the logic, before the emulation of ITD is supported,
we build virtual HFI table based on HFI feature by default (i.e. only 1
class is supported, based on class 0 of real hardware).

Add the interfaces to initialize and build the virtual HFI table, and to
inject the thermal interrupt into the VM to notify about HFI updates.

[1]: SDM, vol. 3B, section 15.6 HARDWARE FEEDBACK INTERFACE AND INTEL
     THREAD DIRECTOR
[2]: SDM, vol. 3B, section 15.6.2 Intel Thread Director Table Structure
[3]: SDM, vol. 3B, section 15.6.5 Hardware Feedback Interface
     Configuration, Table 15-10. IA32_HW_FEEDBACK_CONFIG Control Option

Tested-by: Yanting Jiang <yanting.jiang@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 119 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 96f0f768939d..7881f6b51daa 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1532,6 +1532,125 @@ static void vmx_inject_therm_interrupt(struct kvm_vcpu *vcpu)
 		kvm_apic_therm_deliver(vcpu);
 }
 
+static inline bool vmx_hfi_initialized(struct kvm_vmx *kvm_vmx)
+{
+	return kvm_vmx->pkg_therm.hfi_desc.hfi_enabled &&
+	       kvm_vmx->pkg_therm.hfi_desc.table_ptr_valid;
+}
+
+static inline bool vmx_hfi_int_enabled(struct kvm_vmx *kvm_vmx)
+{
+	return kvm_vmx->pkg_therm.hfi_desc.hfi_int_enabled;
+}
+
+static int vmx_init_hfi_table(struct kvm *kvm)
+{
+	struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
+	struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc;
+	struct hfi_features *hfi_features = &kvm_vmx_hfi->hfi_features;
+	struct hfi_table *hfi_table = &kvm_vmx_hfi->hfi_table;
+	int nr_classes, ret = 0;
+
+	/*
+	 * Currently we haven't supported ITD. HFI is the default feature
+	 * with 1 class.
+	 */
+	nr_classes = 1;
+	ret = intel_hfi_build_virt_features(hfi_features,
+					    nr_classes,
+					    kvm->created_vcpus);
+	if (unlikely(ret))
+		return ret;
+
+	hfi_table->base_addr = kzalloc(hfi_features->nr_table_pages <<
+				       PAGE_SHIFT, GFP_KERNEL);
+	if (!hfi_table->base_addr)
+		return -ENOMEM;
+
+	hfi_table->hdr = hfi_table->base_addr + sizeof(*hfi_table->timestamp);
+	hfi_table->data = hfi_table->hdr + hfi_features->hdr_size;
+	return 0;
+}
+
+static int vmx_build_hfi_table(struct kvm *kvm)
+{
+	struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
+	struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc;
+	struct hfi_features *hfi_features = &kvm_vmx_hfi->hfi_features;
+	struct hfi_table *hfi_table = &kvm_vmx_hfi->hfi_table;
+	struct hfi_hdr *hfi_hdr = hfi_table->hdr;
+	int nr_classes, ret = 0, updated = 0;
+	struct kvm_vcpu *v;
+	unsigned long i;
+
+	/*
+	 * Currently we haven't supported ITD. HFI is the default feature
+	 * with 1 class.
+	 */
+	nr_classes = 1;
+	for (int j = 0; j < nr_classes; j++) {
+		hfi_hdr->perf_updated = 0;
+		hfi_hdr->ee_updated = 0;
+		hfi_hdr++;
+	}
+
+	kvm_for_each_vcpu(i, v, kvm) {
+		ret = intel_hfi_build_virt_table(hfi_table, hfi_features,
+						 nr_classes,
+						 to_vmx(v)->hfi_table_idx,
+						 v->cpu);
+		if (unlikely(ret < 0))
+			return ret;
+		updated |= ret;
+	}
+
+	if (!updated)
+		return updated;
+
+	/* Timestamp must be monotonic. */
+	(*kvm_vmx_hfi->hfi_table.timestamp)++;
+
+	/* Update the HFI table, whether the HFI interrupt is enabled or not. */
+	kvm_write_guest(kvm, kvm_vmx_hfi->table_base, hfi_table->base_addr,
+			hfi_features->nr_table_pages << PAGE_SHIFT);
+	return 1;
+}
+
+static void vmx_update_hfi_table(struct kvm *kvm)
+{
+	struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
+	struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc;
+	int ret = 0;
+
+	if (!intel_hfi_enabled())
+		return;
+
+	if (!vmx_hfi_initialized(kvm_vmx))
+		return;
+
+	if (!kvm_vmx_hfi->hfi_table.base_addr) {
+		ret = vmx_init_hfi_table(kvm);
+		if (unlikely(ret))
+			return;
+	}
+
+	ret = vmx_build_hfi_table(kvm);
+	if (ret <= 0)
+		return;
+
+	kvm_vmx_hfi->hfi_update_status = true;
+	kvm_vmx_hfi->hfi_update_pending = false;
+
+	/*
+	 * Since HFI is shared for all vCPUs of the same VM, we
+	 * actually support only 1 package topology VMs, so when
+	 * emulating package level interrupt, we only inject an
+	 * interrupt into one vCPU to reduce the overhead.
+	 */
+	if (vmx_hfi_int_enabled(kvm_vmx))
+		vmx_inject_therm_interrupt(kvm_get_vcpu(kvm, 0));
+}
+
 /*
  * Switches to specified vcpu, until a matching vcpu_put(), but assumes
  * vcpu mutex is already taken.
-- 
2.34.1


  parent reply	other threads:[~2024-02-03  9:01 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-03  9:11 [RFC 00/26] Intel Thread Director Virtualization Zhao Liu
2024-02-03  9:11 ` [RFC 01/26] thermal: Add bit definition for x86 thermal related MSRs Zhao Liu
2024-02-03  9:11 ` [RFC 02/26] thermal: intel: hfi: Add helpers to build HFI/ITD structures Zhao Liu
2024-02-03  9:11 ` [RFC 03/26] thermal: intel: hfi: Add HFI notifier helpers to notify HFI update Zhao Liu
2024-02-03  9:11 ` [RFC 04/26] KVM: Add kvm_arch_sched_out() hook Zhao Liu
2024-02-03  9:11 ` [RFC 05/26] KVM: x86: Reset hardware history at vCPU's sched_in/out Zhao Liu
2024-02-03  9:11 ` [RFC 06/26] KVM: VMX: Add helpers to handle the writes to MSR's R/O and R/WC0 bits Zhao Liu
2024-02-03  9:11 ` [RFC 07/26] KVM: VMX: Emulate ACPI (CPUID.0x01.edx[bit 22]) feature Zhao Liu
2024-02-03  9:11 ` [RFC 08/26] KVM: x86: Expose TM/ACC (CPUID.0x01.edx[bit 29]) feature bit to VM Zhao Liu
2024-02-03  9:11 ` [RFC 09/26] KVM: x86: cpuid: Define CPUID 0x06.eax by kvm_cpu_cap_mask() Zhao Liu
2024-02-03  9:11 ` [RFC 10/26] KVM: VMX: Emulate PTM/PTS (CPUID.0x06.eax[bit 6]) feature Zhao Liu
2024-02-03  9:11 ` [RFC 11/26] KVM: VMX: Introduce HFI description structure Zhao Liu
2024-02-03  9:12 ` [RFC 12/26] KVM: VMX: Introduce HFI table index for vCPU Zhao Liu
2024-02-03  9:12 ` Zhao Liu [this message]
2024-02-03  9:12 ` [RFC 14/26] KVM: x86: Introduce the HFI dynamic update request and kvm_x86_ops Zhao Liu
2024-02-03  9:12 ` [RFC 15/26] KVM: VMX: Sync update of Host HFI table to Guest Zhao Liu
2024-02-03  9:12 ` [RFC 16/26] KVM: VMX: Update HFI table when vCPU migrates Zhao Liu
2024-02-03  9:12 ` [RFC 17/26] KVM: VMX: Allow to inject thermal interrupt without HFI update Zhao Liu
2024-02-03  9:12 ` [RFC 18/26] KVM: VMX: Emulate HFI related bits in package thermal MSRs Zhao Liu
2024-02-03  9:12 ` [RFC 19/26] KVM: VMX: Emulate the MSRs of HFI feature Zhao Liu
2024-02-03  9:12 ` [RFC 20/26] KVM: x86: Expose HFI feature bit and HFI info in CPUID Zhao Liu
2024-02-03  9:12 ` [RFC 21/26] KVM: VMX: Extend HFI table and MSR emulation to support ITD Zhao Liu
2024-02-03  9:12 ` [RFC 22/26] KVM: VMX: Pass through ITD classification related MSRs to Guest Zhao Liu
2024-02-03  9:12 ` [RFC 23/26] KVM: x86: Expose ITD feature bit and related info in CPUID Zhao Liu
2024-02-03  9:12 ` [RFC 24/26] KVM: VMX: Emulate the MSR of HRESET feature Zhao Liu
2024-02-03  9:12 ` [RFC 25/26] KVM: x86: Expose HRESET feature's CPUID to Guest Zhao Liu
2024-02-03  9:12 ` [RFC 26/26] Documentation: KVM: Add description of pkg_therm_lock Zhao Liu
2024-02-22  7:42 ` [RFC 00/26] Intel Thread Director Virtualization Zhao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240203091214.411862-14-zhao1.liu@linux.intel.com \
    --to=zhao1.liu@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=daniel.lezcano@linaro.org \
    --cc=dapeng1.mi@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=davidai@google.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mhiramat@google.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=rafael@kernel.org \
    --cc=ricardo.neri-calderon@linux.intel.com \
    --cc=rui.zhang@intel.com \
    --cc=saravanak@google.com \
    --cc=seanjc@google.com \
    --cc=suleiman@google.com \
    --cc=tglx@linutronix.de \
    --cc=vineeth@bitbyteword.org \
    --cc=x86@kernel.org \
    --cc=yanting.jiang@intel.com \
    --cc=yongwei.ma@intel.com \
    --cc=zhao1.liu@intel.com \
    --cc=zhenyu.z.wang@intel.com \
    --cc=zhuocheng.ding@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.