linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v2 PATCH 0/4] x86/kvm: Virtual suspend time injection support
@ 2021-08-06 10:07 Hikaru Nishida
  2021-08-06 10:07 ` [v2 PATCH 1/4] x86/kvm: Reserve KVM_FEATURE_HOST_SUSPEND_TIME and MSR_KVM_HOST_SUSPEND_TIME Hikaru Nishida
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Hikaru Nishida @ 2021-08-06 10:07 UTC (permalink / raw)
  To: linux-kernel, dme, tglx, mlevitsk
  Cc: suleiman, Hikaru Nishida, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin, Ingo Molnar, Jim Mattson, Joerg Roedel,
	John Stultz, Jonathan Corbet, Juergen Gross, Lai Jiangshan,
	Mike Travis, Paolo Bonzini, Peter Zijlstra (Intel),
	Sean Christopherson, Stephen Boyd, Steve Wahl, Tom Lendacky,
	Vitaly Kuznetsov, Wanpeng Li, kvm, linux-doc, x86


Hi,

This patch series adds virtual suspend time injection support to KVM.
(It is an updated version of the following series.)
https://lore.kernel.org/kvm/20210426090644.2218834-1-hikalium@chromium.org/

Changes to v1:

- Using IRQ instead of polling to detect the host's suspension.
- Removed unused arg "updated" from kvm_write_suspend_time()
- Improved comments and commit messages.

Please take a look again.

Thanks,

Hikaru Nishida



Hikaru Nishida (4):
  x86/kvm: Reserve KVM_FEATURE_HOST_SUSPEND_TIME and
    MSR_KVM_HOST_SUSPEND_TIME
  x86/kvm: Add definitions for virtual suspend time injection
  x86/kvm: Add host side support for virtual suspend time injection
  x86/kvm: Add guest side support for virtual suspend time injection

 Documentation/virt/kvm/cpuid.rst     |   3 +
 Documentation/virt/kvm/msr.rst       |  30 ++++++++
 arch/x86/Kconfig                     |  13 ++++
 arch/x86/include/asm/idtentry.h      |   4 +
 arch/x86/include/asm/irq_vectors.h   |   7 +-
 arch/x86/include/asm/kvm_host.h      |   5 ++
 arch/x86/include/asm/kvm_para.h      |   9 +++
 arch/x86/include/uapi/asm/kvm_para.h |   6 ++
 arch/x86/kernel/kvmclock.c           |  40 ++++++++++
 arch/x86/kvm/Kconfig                 |  13 ++++
 arch/x86/kvm/cpuid.c                 |   4 +
 arch/x86/kvm/x86.c                   | 109 ++++++++++++++++++++++++++-
 include/linux/kvm_host.h             |   8 ++
 include/linux/timekeeper_internal.h  |   4 +
 kernel/time/timekeeping.c            |  37 +++++++++
 15 files changed, 290 insertions(+), 2 deletions(-)

-- 
2.32.0.605.g8dce9f2422-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [v2 PATCH 1/4] x86/kvm: Reserve KVM_FEATURE_HOST_SUSPEND_TIME and MSR_KVM_HOST_SUSPEND_TIME
  2021-08-06 10:07 [v2 PATCH 0/4] x86/kvm: Virtual suspend time injection support Hikaru Nishida
@ 2021-08-06 10:07 ` Hikaru Nishida
  2021-08-06 10:07 ` [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection Hikaru Nishida
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Hikaru Nishida @ 2021-08-06 10:07 UTC (permalink / raw)
  To: linux-kernel, dme, tglx, mlevitsk
  Cc: suleiman, Hikaru Nishida, Jonathan Corbet, Paolo Bonzini, kvm, linux-doc

No functional change; just add documentation for
KVM_FEATURE_HOST_SUSPEND_TIME and its corresponding
MSR_KVM_HOST_SUSPEND_TIME to support virtual suspend timing injection in
later patches.

Signed-off-by: Hikaru Nishida <hikalium@chromium.org>
---

 Documentation/virt/kvm/cpuid.rst |  3 +++
 Documentation/virt/kvm/msr.rst   | 30 ++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index bda3e3e737d7..f17b95b0d943 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -103,6 +103,9 @@ KVM_FEATURE_HC_MAP_GPA_RANGE       16          guest checks this feature bit bef
 KVM_FEATURE_MIGRATION_CONTROL      17          guest checks this feature bit before
                                                using MSR_KVM_MIGRATION_CONTROL
 
+KVM_FEATURE_HOST_SUSPEND_TIME      18          host suspend time information
+                                               is available at msr 0x4b564d09.
+
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 24          host will warn if no guest-side
                                                per-cpu warps are expected in
                                                kvmclock
diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index 9315fc385fb0..a218a350d0d0 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -389,3 +389,33 @@ data:
         guest is communicating page encryption status to the host using the
         ``KVM_HC_MAP_GPA_RANGE`` hypercall, it can set bit 0 in this MSR to
         allow live migration of the guest.
+
+MSR_KVM_HOST_SUSPEND_TIME:
+	0x4b564d09
+
+data:
+	8-byte alignment physical address of a memory area which must be
+	in guest RAM, plus an enable bit in bit 0. This memory is expected to
+	hold a copy of the following structure::
+
+	 struct kvm_suspend_time {
+		__u64   suspend_time_ns;
+	 };
+
+	whose data will be filled in by the hypervisor.
+	If the guest register this structure through the MSR write, the host
+	will stop all the clocks including TSCs observed by the guest during
+	the host's suspension and report the duration of suspend through this
+	structure. The update will be notified through
+	VIRT_SUSPEND_TIMING_VECTOR IRQ. Fields have the following meanings:
+
+	suspend_time_ns:
+		Total number of nanoseconds passed during the host's suspend
+		while the VM is running. This value will be increasing
+		monotonically.
+
+	Note that although MSRs are per-CPU entities, the effect of this
+	particular MSR is global.
+
+	Availability of this MSR must be checked via bit 18 in 0x4000001 cpuid
+	leaf prior to usage.
-- 
2.32.0.605.g8dce9f2422-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection
  2021-08-06 10:07 [v2 PATCH 0/4] x86/kvm: Virtual suspend time injection support Hikaru Nishida
  2021-08-06 10:07 ` [v2 PATCH 1/4] x86/kvm: Reserve KVM_FEATURE_HOST_SUSPEND_TIME and MSR_KVM_HOST_SUSPEND_TIME Hikaru Nishida
@ 2021-08-06 10:07 ` Hikaru Nishida
  2021-08-10 15:14   ` Thomas Gleixner
  2021-08-13 18:36   ` Guenter Roeck
  2021-08-06 10:07 ` [v2 PATCH 3/4] x86/kvm: Add host side support " Hikaru Nishida
  2021-08-06 10:07 ` [v2 PATCH 4/4] x86/kvm: Add guest " Hikaru Nishida
  3 siblings, 2 replies; 13+ messages in thread
From: Hikaru Nishida @ 2021-08-06 10:07 UTC (permalink / raw)
  To: linux-kernel, dme, tglx, mlevitsk
  Cc: suleiman, Hikaru Nishida, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin, Ingo Molnar, Jim Mattson, Joerg Roedel,
	John Stultz, Paolo Bonzini, Sean Christopherson, Stephen Boyd,
	Steve Wahl, Vitaly Kuznetsov, Wanpeng Li, kvm, x86

Add definitions of MSR, KVM_FEATURE bit, IRQ and a structure called
kvm_suspend_time that are used by later patchs to support the
virtual suspend time injection mechanism.

Signed-off-by: Hikaru Nishida <hikalium@chromium.org>
---

 arch/x86/include/asm/irq_vectors.h   | 7 ++++++-
 arch/x86/include/uapi/asm/kvm_para.h | 6 ++++++
 kernel/time/timekeeping.c            | 1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 43dcb9284208..6785054080c8 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -104,7 +104,12 @@
 #define HYPERV_STIMER0_VECTOR		0xed
 #endif
 
-#define LOCAL_TIMER_VECTOR		0xec
+#if defined(CONFIG_KVM_VIRT_SUSPEND_TIMING) || \
+	defined(CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST)
+#define VIRT_SUSPEND_TIMING_VECTOR	0xec
+#endif
+
+#define LOCAL_TIMER_VECTOR		0xeb
 
 #define NR_VECTORS			 256
 
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 5146bbab84d4..ccea4e344f46 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -35,6 +35,7 @@
 #define KVM_FEATURE_MSI_EXT_DEST_ID	15
 #define KVM_FEATURE_HC_MAP_GPA_RANGE	16
 #define KVM_FEATURE_MIGRATION_CONTROL	17
+#define KVM_FEATURE_HOST_SUSPEND_TIME	18
 
 #define KVM_HINTS_REALTIME      0
 
@@ -57,6 +58,7 @@
 #define MSR_KVM_ASYNC_PF_INT	0x4b564d06
 #define MSR_KVM_ASYNC_PF_ACK	0x4b564d07
 #define MSR_KVM_MIGRATION_CONTROL	0x4b564d08
+#define MSR_KVM_HOST_SUSPEND_TIME      0x4b564d09
 
 struct kvm_steal_time {
 	__u64 steal;
@@ -79,6 +81,10 @@ struct kvm_clock_pairing {
 	__u32 pad[9];
 };
 
+struct kvm_suspend_time {
+	__u64   suspend_time_ns;
+};
+
 #define KVM_STEAL_ALIGNMENT_BITS 5
 #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
 #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 8a364aa9881a..233ceb6cce1f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -22,6 +22,7 @@
 #include <linux/pvclock_gtod.h>
 #include <linux/compiler.h>
 #include <linux/audit.h>
+#include <linux/kvm_host.h>
 
 #include "tick-internal.h"
 #include "ntp_internal.h"
-- 
2.32.0.605.g8dce9f2422-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [v2 PATCH 3/4] x86/kvm: Add host side support for virtual suspend time injection
  2021-08-06 10:07 [v2 PATCH 0/4] x86/kvm: Virtual suspend time injection support Hikaru Nishida
  2021-08-06 10:07 ` [v2 PATCH 1/4] x86/kvm: Reserve KVM_FEATURE_HOST_SUSPEND_TIME and MSR_KVM_HOST_SUSPEND_TIME Hikaru Nishida
  2021-08-06 10:07 ` [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection Hikaru Nishida
@ 2021-08-06 10:07 ` Hikaru Nishida
  2021-08-10 15:21   ` Thomas Gleixner
                     ` (2 more replies)
  2021-08-06 10:07 ` [v2 PATCH 4/4] x86/kvm: Add guest " Hikaru Nishida
  3 siblings, 3 replies; 13+ messages in thread
From: Hikaru Nishida @ 2021-08-06 10:07 UTC (permalink / raw)
  To: linux-kernel, dme, tglx, mlevitsk
  Cc: suleiman, Hikaru Nishida, Borislav Petkov, H. Peter Anvin,
	Ingo Molnar, Jim Mattson, Joerg Roedel, John Stultz,
	Paolo Bonzini, Sean Christopherson, Stephen Boyd,
	Vitaly Kuznetsov, Wanpeng Li, kvm, x86

This patch implements virtual suspend time injection support for kvm
hosts.

If this functionality is enabled and the guest requests it, the host
will stop all the clocks observed by the guest during the host's
suspension and report the duration of suspend to the guest through
struct kvm_host_suspend_time to give a chance to adjust CLOCK_BOOTTIME
to the guest. This mechanism can be used to align the guest's clock
behavior to the hosts' ones.

Signed-off-by: Hikaru Nishida <hikalium@chromium.org>
---

 arch/x86/include/asm/kvm_host.h |   5 ++
 arch/x86/kvm/Kconfig            |  13 ++++
 arch/x86/kvm/cpuid.c            |   4 ++
 arch/x86/kvm/x86.c              | 109 +++++++++++++++++++++++++++++++-
 include/linux/kvm_host.h        |   8 +++
 kernel/time/timekeeping.c       |   3 +
 6 files changed, 141 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0ac920f6adcb..39d7540f35c4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1064,6 +1064,11 @@ struct kvm_arch {
 	bool pause_in_guest;
 	bool cstate_in_guest;
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+	u64 msr_suspend_time;
+	struct gfn_to_hva_cache suspend_time;
+#endif /* KVM_VIRT_SUSPEND_TIMING */
+
 	unsigned long irq_sources_bitmap;
 	s64 kvmclock_offset;
 	raw_spinlock_t tsc_write_lock;
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ac69894eab88..4a2d020d3b60 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -129,4 +129,17 @@ config KVM_MMU_AUDIT
 	 This option adds a R/W kVM module parameter 'mmu_audit', which allows
 	 auditing of KVM MMU events at runtime.
 
+config KVM_VIRT_SUSPEND_TIMING
+	bool "Virtual suspend time injection"
+	depends on KVM=y
+	default n
+	help
+	 This option makes the host's suspension reflected on the guest's clocks.
+	 In other words, guest's CLOCK_MONOTONIC will stop and
+	 CLOCK_BOOTTIME keeps running during the host's suspension.
+	 This feature will only be effective when both guest and host enable
+	 this option.
+
+	 If unsure, say N.
+
 endif # VIRTUALIZATION
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c42613cfb5ba..d71ced5c5a78 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -911,6 +911,10 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 			     (1 << KVM_FEATURE_PV_SCHED_YIELD) |
 			     (1 << KVM_FEATURE_ASYNC_PF_INT);
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+		entry->eax |= (1 << KVM_FEATURE_HOST_SUSPEND_TIME);
+#endif
+
 		if (sched_info_on())
 			entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 20777b49c56a..59897b95cda4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1355,6 +1355,9 @@ static const u32 emulated_msrs_all[] = {
 
 	MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT, MSR_KVM_ASYNC_PF_ACK,
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+	MSR_KVM_HOST_SUSPEND_TIME,
+#endif
 
 	MSR_IA32_TSC_ADJUST,
 	MSR_IA32_TSC_DEADLINE,
@@ -2132,6 +2135,24 @@ void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock, int sec_hi_ofs)
 	kvm_write_guest(kvm, wall_clock, &version, sizeof(version));
 }
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+static int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
+				    struct kvm_interrupt *irq);
+static void kvm_write_suspend_time(struct kvm *kvm)
+{
+	struct kvm_arch *ka = &kvm->arch;
+	struct kvm_suspend_time st;
+
+	st.suspend_time_ns = kvm->suspend_time_ns;
+	kvm_write_guest_cached(kvm, &ka->suspend_time, &st, sizeof(st));
+}
+
+static void kvm_make_suspend_time_interrupt(struct kvm_vcpu *vcpu)
+{
+	kvm_queue_interrupt(vcpu, VIRT_SUSPEND_TIMING_VECTOR, false);
+	kvm_make_request(KVM_REQ_EVENT, vcpu);
+}
+#endif
 static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
 				  bool old_msr, bool host_initiated)
 {
@@ -3237,6 +3258,25 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 	kvm_unmap_gfn(vcpu, &map, &vcpu->arch.st.cache, true, false);
 }
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+static int set_suspend_time_struct(struct kvm_vcpu *vcpu, u64 data)
+{
+	if (!(data & KVM_MSR_ENABLED)) {
+		vcpu->kvm->arch.msr_suspend_time = data;
+		return 0;
+	}
+
+	if (kvm_gfn_to_hva_cache_init(vcpu->kvm,
+				&vcpu->kvm->arch.suspend_time, data & ~1ULL,
+				sizeof(struct kvm_suspend_time)))
+		return 1;
+
+	kvm_write_suspend_time(vcpu->kvm);
+	vcpu->kvm->arch.msr_suspend_time = data;
+	return 0;
+}
+#endif
+
 int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	bool pr = false;
@@ -3451,6 +3491,14 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		vcpu->arch.msr_kvm_poll_control = data;
 		break;
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+	case MSR_KVM_HOST_SUSPEND_TIME:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_HOST_SUSPEND_TIME))
+			return 1;
+
+		return set_suspend_time_struct(vcpu, data);
+#endif
+
 	case MSR_IA32_MCG_CTL:
 	case MSR_IA32_MCG_STATUS:
 	case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1:
@@ -3769,6 +3817,14 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 		msr_info->data = vcpu->arch.msr_kvm_poll_control;
 		break;
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+	case MSR_KVM_HOST_SUSPEND_TIME:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_HOST_SUSPEND_TIME))
+			return 1;
+
+		msr_info->data = vcpu->kvm->arch.msr_suspend_time;
+		break;
+#endif
 	case MSR_IA32_P5_MC_ADDR:
 	case MSR_IA32_P5_MC_TYPE:
 	case MSR_IA32_MCG_CAP:
@@ -9778,7 +9834,14 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 		kvm_clear_request(KVM_REQ_UNBLOCK, vcpu);
 		if (kvm_cpu_has_pending_timer(vcpu))
 			kvm_inject_pending_timer_irqs(vcpu);
-
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+		if (kvm->suspend_injection_requested &&
+			kvm_vcpu_ready_for_interrupt_injection(vcpu)) {
+			kvm_write_suspend_time(kvm);
+			kvm_make_suspend_time_interrupt(vcpu);
+			kvm->suspend_injection_requested = false;
+		}
+#endif
 		if (dm_request_for_irq_injection(vcpu) &&
 			kvm_vcpu_ready_for_interrupt_injection(vcpu)) {
 			r = 0;
@@ -12338,6 +12401,50 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 }
 EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+void kvm_arch_timekeeping_inject_sleeptime(const struct timespec64 *delta)
+{
+	struct kvm_vcpu *vcpu;
+	u64 suspend_time_ns;
+	struct kvm *kvm;
+	s64 adj;
+	int i;
+
+	suspend_time_ns = timespec64_to_ns(delta);
+	adj = tsc_khz * (suspend_time_ns / 1000000);
+	/*
+	 * Adjust TSCs on all vcpus and kvmclock as if they are stopped
+	 * during host's suspension.
+	 * Also, cummulative suspend time is recorded in kvm structure and
+	 * the update will be notified via an interrupt for each vcpu.
+	 */
+	mutex_lock(&kvm_lock);
+	list_for_each_entry(kvm, &vm_list, vm_list) {
+		if (!(kvm->arch.msr_suspend_time & KVM_MSR_ENABLED))
+			continue;
+
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			vcpu->arch.tsc_offset_adjustment -= adj;
+
+		/*
+		 * Move the offset of kvm_clock here as if it is stopped
+		 * during the suspension.
+		 */
+		kvm->arch.kvmclock_offset -= suspend_time_ns;
+
+		/* suspend_time is accumulated per VM. */
+		kvm->suspend_time_ns += suspend_time_ns;
+		kvm->suspend_injection_requested = true;
+		/*
+		 * This adjustment will be reflected to the struct provided
+		 * from the guest via MSR_KVM_HOST_SUSPEND_TIME before
+		 * the notification interrupt is injected.
+		 */
+	}
+	mutex_unlock(&kvm_lock);
+}
+#endif
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d44ff1367ef3..bb9c582a70d0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -601,6 +601,10 @@ struct kvm {
 	struct notifier_block pm_notifier;
 #endif
 	char stats_id[KVM_STATS_NAME_SIZE];
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+	u64 suspend_time_ns;
+	bool suspend_injection_requested;
+#endif
 };
 
 #define kvm_err(fmt, ...) \
@@ -1715,4 +1719,8 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu)
 /* Max number of entries allowed for each kvm dirty ring */
 #define  KVM_DIRTY_RING_MAX_ENTRIES  65536
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+void kvm_arch_timekeeping_inject_sleeptime(const struct timespec64 *delta);
+#endif /* CONFIG_KVM_VIRT_SUSPEND_TIMING */
+
 #endif
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 233ceb6cce1f..3ac3fb479981 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1797,6 +1797,9 @@ void timekeeping_resume(void)
 	if (inject_sleeptime) {
 		suspend_timing_needed = false;
 		__timekeeping_inject_sleeptime(tk, &ts_delta);
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
+		kvm_arch_timekeeping_inject_sleeptime(&ts_delta);
+#endif
 	}
 
 	/* Re-base the last cycle value */
-- 
2.32.0.605.g8dce9f2422-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [v2 PATCH 4/4] x86/kvm: Add guest side support for virtual suspend time injection
  2021-08-06 10:07 [v2 PATCH 0/4] x86/kvm: Virtual suspend time injection support Hikaru Nishida
                   ` (2 preceding siblings ...)
  2021-08-06 10:07 ` [v2 PATCH 3/4] x86/kvm: Add host side support " Hikaru Nishida
@ 2021-08-06 10:07 ` Hikaru Nishida
  2021-08-10 15:48   ` Thomas Gleixner
  3 siblings, 1 reply; 13+ messages in thread
From: Hikaru Nishida @ 2021-08-06 10:07 UTC (permalink / raw)
  To: linux-kernel, dme, tglx, mlevitsk
  Cc: suleiman, Hikaru Nishida, Borislav Petkov, H. Peter Anvin,
	Ingo Molnar, Jim Mattson, Joerg Roedel, John Stultz,
	Juergen Gross, Lai Jiangshan, Mike Travis, Paolo Bonzini,
	Peter Zijlstra (Intel),
	Sean Christopherson, Stephen Boyd, Tom Lendacky,
	Vitaly Kuznetsov, Wanpeng Li, kvm, x86

This patch implements virtual suspend time injection support for kvm
guests. If this functionality is enabled and the host supports
KVM_FEATURE_HOST_SUSPEND_TIME,
the guest will register struct kvm_host_suspend_time through MSR to
get how much time the guest spend during the host's suspension.
Host will notify the update on the structure (which happens if the host
went into suspension while the guest was running) through the irq and
the irq will trigger the adjustment of CLOCK_BOOTTIME inside a guest.

Before this patch, there was no way to adjust the CLOCK_BOOTTIME without
actually suspending the kernel. However, some guest applications rely on
the fact that there will be some difference between CLOCK_BOOTTIME and
CLOCK_MONOTONIC after the suspention of the execution and they will be
broken if we just pausing the guest instead of actually suspending them.
Pausing the guest kernels is one solution to solve the problem, but
if we could adjust the clocks without actually suspending them, we can
reduce the overhead of guest's suspend/resume cycles on every host's
suspensions. So this change will be useful for the devices which
experience suspend/resume frequently.

Signed-off-by: Hikaru Nishida <hikalium@chromium.org>
---

 arch/x86/Kconfig                    | 13 ++++++++++
 arch/x86/include/asm/idtentry.h     |  4 +++
 arch/x86/include/asm/kvm_para.h     |  9 +++++++
 arch/x86/kernel/kvmclock.c          | 40 +++++++++++++++++++++++++++++
 include/linux/timekeeper_internal.h |  4 +++
 kernel/time/timekeeping.c           | 33 ++++++++++++++++++++++++
 6 files changed, 103 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 45463c65ea0a..760fe7f04170 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -822,6 +822,19 @@ config KVM_GUEST
 	  underlying device model, the host provides the guest with
 	  timing infrastructure such as time of day, and system time
 
+config KVM_VIRT_SUSPEND_TIMING_GUEST
+	bool "Virtual suspend time injection (guest side)"
+	depends on KVM_GUEST
+	default n
+	help
+	 This option makes the host's suspension reflected on the guest's clocks.
+	 In other words, guest's CLOCK_MONOTONIC will stop and
+	 CLOCK_BOOTTIME keeps running during the host's suspension.
+	 This feature will only be effective when both guest and host enable
+	 this option.
+
+	 If unsure, say N.
+
 config ARCH_CPUIDLE_HALTPOLL
 	def_bool n
 	prompt "Disable host haltpoll when loading haltpoll driver"
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1345088e9902..38f37c2a6063 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -671,6 +671,10 @@ DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_WAKEUP_VECTOR,	sysvec_kvm_posted_intr_wakeup
 DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR,	sysvec_kvm_posted_intr_nested_ipi);
 #endif
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+DECLARE_IDTENTRY_SYSVEC(VIRT_SUSPEND_TIMING_VECTOR, sysvec_virtual_suspend_time);
+#endif
+
 #if IS_ENABLED(CONFIG_HYPERV)
 DECLARE_IDTENTRY_SYSVEC(HYPERVISOR_CALLBACK_VECTOR,	sysvec_hyperv_callback);
 DECLARE_IDTENTRY_SYSVEC(HYPERV_REENLIGHTENMENT_VECTOR,	sysvec_hyperv_reenlightenment);
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 69299878b200..094023687c8b 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -16,6 +16,15 @@ static inline bool kvm_check_and_clear_guest_paused(void)
 }
 #endif /* CONFIG_KVM_GUEST */
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+u64 kvm_get_suspend_time(void);
+#else
+static inline u64 kvm_get_suspend_time(void)
+{
+	return 0;
+}
+#endif /* CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST */
+
 #define KVM_HYPERCALL \
         ALTERNATIVE("vmcall", "vmmcall", X86_FEATURE_VMMCALL)
 
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ad273e5861c1..1c92b54b1bce 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -16,11 +16,15 @@
 #include <linux/mm.h>
 #include <linux/slab.h>
 #include <linux/set_memory.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
 
 #include <asm/hypervisor.h>
 #include <asm/mem_encrypt.h>
 #include <asm/x86_init.h>
 #include <asm/kvmclock.h>
+#include <asm/desc.h>
+#include <asm/idtentry.h>
 
 static int kvmclock __initdata = 1;
 static int kvmclock_vsyscall __initdata = 1;
@@ -48,6 +52,9 @@ early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
 static struct pvclock_vsyscall_time_info
 			hv_clock_boot[HVC_BOOT_ARRAY_SIZE] __bss_decrypted __aligned(PAGE_SIZE);
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+static struct kvm_suspend_time suspend_time __bss_decrypted;
+#endif
 static struct pvclock_wall_clock wall_clock __bss_decrypted;
 static DEFINE_PER_CPU(struct pvclock_vsyscall_time_info *, hv_clock_per_cpu);
 static struct pvclock_vsyscall_time_info *hvclock_mem;
@@ -163,6 +170,18 @@ static int kvm_cs_enable(struct clocksource *cs)
 	return 0;
 }
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+/*
+ * kvm_get_suspend_time - This function returns total time passed during
+ * the host was in a suspend state while this guest was running.
+ * (Not a duration of the last host suspension but cumulative time.)
+ */
+u64 kvm_get_suspend_time(void)
+{
+	return suspend_time.suspend_time_ns;
+}
+#endif /* CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST */
+
 struct clocksource kvm_clock = {
 	.name	= "kvm-clock",
 	.read	= kvm_clock_get_cycles,
@@ -290,6 +309,18 @@ static int kvmclock_setup_percpu(unsigned int cpu)
 	return p ? 0 : -ENOMEM;
 }
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+void timekeeping_inject_virtual_suspend_time(void);
+DEFINE_IDTENTRY_SYSVEC(sysvec_virtual_suspend_time)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	timekeeping_inject_virtual_suspend_time();
+
+	set_irq_regs(old_regs);
+}
+#endif
+
 void __init kvmclock_init(void)
 {
 	u8 flags;
@@ -304,6 +335,15 @@ void __init kvmclock_init(void)
 		return;
 	}
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+	if (kvm_para_has_feature(KVM_FEATURE_HOST_SUSPEND_TIME)) {
+		alloc_intr_gate(VIRT_SUSPEND_TIMING_VECTOR, asm_sysvec_virtual_suspend_time);
+		/* Register the suspend time structure */
+		wrmsrl(MSR_KVM_HOST_SUSPEND_TIME,
+		       slow_virt_to_phys(&suspend_time) | KVM_MSR_ENABLED);
+	}
+#endif
+
 	if (cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "kvmclock:setup_percpu",
 			      kvmclock_setup_percpu, NULL) < 0) {
 		return;
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 84ff2844df2a..a5fd515f0a9d 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -124,6 +124,10 @@ struct timekeeper {
 	u32			ntp_err_mult;
 	/* Flag used to avoid updating NTP twice with same second */
 	u32			skip_second_overflow;
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+	/* suspend_time_injected keeps the duration injected through kvm */
+	u64			suspend_time_injected;
+#endif
 #ifdef CONFIG_DEBUG_TIMEKEEPING
 	long			last_warning;
 	/*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 3ac3fb479981..424c61d38646 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2125,6 +2125,39 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
 	return offset;
 }
 
+#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
+/*
+ * timekeeping_inject_virtual_suspend_time - Inject virtual suspend time
+ * when requested by the kvm host.
+ * This function should be called under irq context.
+ */
+void timekeeping_inject_virtual_suspend_time(void)
+{
+	/*
+	 * Only updates shadow_timekeeper so the change will be reflected
+	 * on the next call of timekeeping_advance().
+	 */
+	struct timekeeper *tk = &shadow_timekeeper;
+	unsigned long flags;
+	struct timespec64 delta;
+	u64 suspend_time;
+
+	raw_spin_lock_irqsave(&timekeeper_lock, flags);
+	suspend_time = kvm_get_suspend_time();
+	if (suspend_time > tk->suspend_time_injected) {
+		/*
+		 * Do injection only if the time is not injected yet.
+		 * suspend_time and tk->suspend_time_injected values are
+		 * cummrative, so take a diff and inject the duration.
+		 */
+		delta = ns_to_timespec64(suspend_time - tk->suspend_time_injected);
+		__timekeeping_inject_sleeptime(tk, &delta);
+		tk->suspend_time_injected = suspend_time;
+	}
+	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+}
+#endif
+
 /*
  * timekeeping_advance - Updates the timekeeper to the current time and
  * current NTP tick length
-- 
2.32.0.605.g8dce9f2422-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection
  2021-08-06 10:07 ` [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection Hikaru Nishida
@ 2021-08-10 15:14   ` Thomas Gleixner
  2021-08-13 18:36   ` Guenter Roeck
  1 sibling, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2021-08-10 15:14 UTC (permalink / raw)
  To: Hikaru Nishida, linux-kernel, dme, mlevitsk
  Cc: suleiman, Hikaru Nishida, Andy Lutomirski, Borislav Petkov,
	H. Peter Anvin, Ingo Molnar, Jim Mattson, Joerg Roedel,
	John Stultz, Paolo Bonzini, Sean Christopherson, Stephen Boyd,
	Steve Wahl, Vitaly Kuznetsov, Wanpeng Li, kvm, x86

On Fri, Aug 06 2021 at 19:07, Hikaru Nishida wrote:

> Add definitions of MSR, KVM_FEATURE bit, IRQ and a structure called
> kvm_suspend_time that are used by later patchs to support the
> virtual suspend time injection mechanism.

Why does this need to include kvm_host.h in the timekeeping core?

> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -22,6 +22,7 @@
>  #include <linux/pvclock_gtod.h>
>  #include <linux/compiler.h>
>  #include <linux/audit.h>
> +#include <linux/kvm_host.h>
>  
>  #include "tick-internal.h"
>  #include "ntp_internal.h"

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 3/4] x86/kvm: Add host side support for virtual suspend time injection
  2021-08-06 10:07 ` [v2 PATCH 3/4] x86/kvm: Add host side support " Hikaru Nishida
@ 2021-08-10 15:21   ` Thomas Gleixner
  2021-08-10 15:24   ` Thomas Gleixner
  2021-08-14  7:22   ` Paolo Bonzini
  2 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2021-08-10 15:21 UTC (permalink / raw)
  To: Hikaru Nishida, linux-kernel, dme, mlevitsk
  Cc: suleiman, Hikaru Nishida, Borislav Petkov, H. Peter Anvin,
	Ingo Molnar, Jim Mattson, Joerg Roedel, John Stultz,
	Paolo Bonzini, Sean Christopherson, Stephen Boyd,
	Vitaly Kuznetsov, Wanpeng Li, kvm, x86

On Fri, Aug 06 2021 at 19:07, Hikaru Nishida wrote:

> This patch implements virtual suspend time injection support for kvm

git grep 'This patch' Documentation/process/

> hosts.
>
> If this functionality is enabled and the guest requests it, the host
> will stop all the clocks observed by the guest during the host's
> suspension and report the duration of suspend to the guest through
> struct kvm_host_suspend_time to give a chance to adjust CLOCK_BOOTTIME
> to the guest. This mechanism can be used to align the guest's clock
> behavior to the hosts' ones.
>
> Signed-off-by: Hikaru Nishida <hikalium@chromium.org>
> ---
>
>  arch/x86/include/asm/kvm_host.h |   5 ++
>  arch/x86/kvm/Kconfig            |  13 ++++
>  arch/x86/kvm/cpuid.c            |   4 ++
>  arch/x86/kvm/x86.c              | 109 +++++++++++++++++++++++++++++++-
>  include/linux/kvm_host.h        |   8 +++
>  kernel/time/timekeeping.c       |   3 +

Please split this into adding the infrastructure and then implementing
the x86 side of it.

>  
> +#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
> +void kvm_arch_timekeeping_inject_sleeptime(const struct timespec64 *delta);
 +#else
 +static inline void kvm_arch_timekeeping_inject_sleeptime(const struct timespec64 *delta){}
> +#endif /* CONFIG_KVM_VIRT_SUSPEND_TIMING */
> +
>  #endif
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 233ceb6cce1f..3ac3fb479981 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1797,6 +1797,9 @@ void timekeeping_resume(void)
>  	if (inject_sleeptime) {
>  		suspend_timing_needed = false;
>  		__timekeeping_inject_sleeptime(tk, &ts_delta);
> +#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
> +		kvm_arch_timekeeping_inject_sleeptime(&ts_delta);
> +#endif

which get's rid of these ugly ifdefs.

Also this is the wrong place because sleep time can be injected from
other places as well. This should be in __timekeeping_inject_sleeptime()
if at all.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 3/4] x86/kvm: Add host side support for virtual suspend time injection
  2021-08-06 10:07 ` [v2 PATCH 3/4] x86/kvm: Add host side support " Hikaru Nishida
  2021-08-10 15:21   ` Thomas Gleixner
@ 2021-08-10 15:24   ` Thomas Gleixner
  2021-08-14  7:22   ` Paolo Bonzini
  2 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2021-08-10 15:24 UTC (permalink / raw)
  To: Hikaru Nishida, linux-kernel, dme, mlevitsk
  Cc: suleiman, Hikaru Nishida, Borislav Petkov, H. Peter Anvin,
	Ingo Molnar, Jim Mattson, Joerg Roedel, John Stultz,
	Paolo Bonzini, Sean Christopherson, Stephen Boyd,
	Vitaly Kuznetsov, Wanpeng Li, kvm, x86

On Fri, Aug 06 2021 at 19:07, Hikaru Nishida wrote:
>  
> +#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
> +void kvm_arch_timekeeping_inject_sleeptime(const struct timespec64 *delta)
> +{
> +	struct kvm_vcpu *vcpu;
> +	u64 suspend_time_ns;
> +	struct kvm *kvm;
> +	s64 adj;
> +	int i;
> +
> +	suspend_time_ns = timespec64_to_ns(delta);
> +	adj = tsc_khz * (suspend_time_ns / 1000000);
> +	/*
> +	 * Adjust TSCs on all vcpus and kvmclock as if they are stopped
> +	 * during host's suspension.
> +	 * Also, cummulative suspend time is recorded in kvm structure and
> +	 * the update will be notified via an interrupt for each vcpu.
> +	 */
> +	mutex_lock(&kvm_lock);

This is invoked from with timekeeper_lock held with interrupts
disabled. How is a mutex_lock() supposed to work here?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 4/4] x86/kvm: Add guest side support for virtual suspend time injection
  2021-08-06 10:07 ` [v2 PATCH 4/4] x86/kvm: Add guest " Hikaru Nishida
@ 2021-08-10 15:48   ` Thomas Gleixner
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2021-08-10 15:48 UTC (permalink / raw)
  To: Hikaru Nishida, linux-kernel, dme, mlevitsk
  Cc: suleiman, Hikaru Nishida, Borislav Petkov, H. Peter Anvin,
	Ingo Molnar, Jim Mattson, Joerg Roedel, John Stultz,
	Juergen Gross, Lai Jiangshan, Mike Travis, Paolo Bonzini,
	Peter Zijlstra (Intel),
	Sean Christopherson, Stephen Boyd, Tom Lendacky,
	Vitaly Kuznetsov, Wanpeng Li, kvm, x86

On Fri, Aug 06 2021 at 19:07, Hikaru Nishida wrote:
>  arch/x86/Kconfig                    | 13 ++++++++++
>  arch/x86/include/asm/idtentry.h     |  4 +++
>  arch/x86/include/asm/kvm_para.h     |  9 +++++++
>  arch/x86/kernel/kvmclock.c          | 40 +++++++++++++++++++++++++++++
>  include/linux/timekeeper_internal.h |  4 +++
>  kernel/time/timekeeping.c           | 33 ++++++++++++++++++++++++

Again, this wants to be split into infrastructure and usage.

> --- a/include/linux/timekeeper_internal.h
> +++ b/include/linux/timekeeper_internal.h
> @@ -124,6 +124,10 @@ struct timekeeper {
> 	u32			ntp_err_mult;
> 	/* Flag used to avoid updating NTP twice with same second */
> 	u32			skip_second_overflow;
> +#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
> +	/* suspend_time_injected keeps the duration injected through kvm */
> +	u64			suspend_time_injected;

This is KVM only, so please can we have a name for that struct member
which reflects this?

> +#endif
>  #ifdef CONFIG_DEBUG_TIMEKEEPING
> 	long			last_warning;
> 	/*

> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 3ac3fb479981..424c61d38646 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -2125,6 +2125,39 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
>  	return offset;
>  }
>  
> +#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST
> +/*
> + * timekeeping_inject_virtual_suspend_time - Inject virtual suspend time
> + * when requested by the kvm host.

If this is an attempt to provide a kernel-doc comment for this function,
then it's clearly a failed attempt and aside of that malformatted.

> + * This function should be called under irq context.

Why? There is no reason for being called from interrupt context and
nothing inforces it.

> + */
> +void timekeeping_inject_virtual_suspend_time(void)
> +{
> +	/*
> +	 * Only updates shadow_timekeeper so the change will be reflected
> +	 * on the next call of timekeeping_advance().

No. That's broken.

    timekeeping_inject_virtual_suspend_time();

    do_settimeofday() or do_adjtimex()

       timekeeping_update(tk, TK_MIRROR...);

and your change to the shadow timekeeper is gone.

Of course there is also no justification for this approach. What's wrong
with updating it right away?

> +	 */
> +	struct timekeeper *tk = &shadow_timekeeper;
> +	unsigned long flags;
> +	struct timespec64 delta;
> +	u64 suspend_time;

Please sort variables in reverse fir tree order and not randomly as you
see fit.

> +
> +	raw_spin_lock_irqsave(&timekeeper_lock, flags);
> +	suspend_time = kvm_get_suspend_time();
> +	if (suspend_time > tk->suspend_time_injected) {
> +		/*
> +		 * Do injection only if the time is not injected yet.
> +		 * suspend_time and tk->suspend_time_injected values are
> +		 * cummrative, so take a diff and inject the duration.

cummrative?

> +		 */
> +		delta = ns_to_timespec64(suspend_time - tk->suspend_time_injected);
> +		__timekeeping_inject_sleeptime(tk, &delta);
> +		tk->suspend_time_injected = suspend_time;

It's absolutely unclear how this storage and diff magic works and the
comment is not helping someone not familiar with the implementation of
kvm_get_suspend_time() and the related code at all. Please explain
non-obvious logic properly.

Thanks,

        tglx





^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection
  2021-08-06 10:07 ` [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection Hikaru Nishida
  2021-08-10 15:14   ` Thomas Gleixner
@ 2021-08-13 18:36   ` Guenter Roeck
  1 sibling, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2021-08-13 18:36 UTC (permalink / raw)
  To: Hikaru Nishida
  Cc: linux-kernel, dme, tglx, mlevitsk, suleiman, Andy Lutomirski,
	Borislav Petkov, H. Peter Anvin, Ingo Molnar, Jim Mattson,
	Joerg Roedel, John Stultz, Paolo Bonzini, Sean Christopherson,
	Stephen Boyd, Steve Wahl, Vitaly Kuznetsov, Wanpeng Li, kvm, x86

On Fri, Aug 06, 2021 at 07:07:08PM +0900, Hikaru Nishida wrote:
> Add definitions of MSR, KVM_FEATURE bit, IRQ and a structure called
> kvm_suspend_time that are used by later patchs to support the
> virtual suspend time injection mechanism.
> 
> Signed-off-by: Hikaru Nishida <hikalium@chromium.org>
> ---
> 

This patch series assumes that kvm is supported on all architectures and builds.
Where it isn't, it results in widespread compile errors such as ...

> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 8a364aa9881a..233ceb6cce1f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -22,6 +22,7 @@
>  #include <linux/pvclock_gtod.h>
>  #include <linux/compiler.h>
>  #include <linux/audit.h>
> +#include <linux/kvm_host.h>
>  
   In file included from kernel/time/timekeeping.c:25:
   In file included from include/linux/kvm_host.h:32:
>> include/uapi/linux/kvm.h:14:10: fatal error: 'asm/kvm.h' file not found
   #include <asm/kvm.h>
            ^~~~~~~~~~~

Guenter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 3/4] x86/kvm: Add host side support for virtual suspend time injection
  2021-08-06 10:07 ` [v2 PATCH 3/4] x86/kvm: Add host side support " Hikaru Nishida
  2021-08-10 15:21   ` Thomas Gleixner
  2021-08-10 15:24   ` Thomas Gleixner
@ 2021-08-14  7:22   ` Paolo Bonzini
  2021-08-18  9:32     ` Vitaly Kuznetsov
  2 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2021-08-14  7:22 UTC (permalink / raw)
  To: Hikaru Nishida, linux-kernel, dme, tglx, mlevitsk
  Cc: suleiman, Borislav Petkov, H. Peter Anvin, Ingo Molnar,
	Jim Mattson, Joerg Roedel, John Stultz, Sean Christopherson,
	Stephen Boyd, Vitaly Kuznetsov, Wanpeng Li, kvm, x86

On 06/08/21 12:07, Hikaru Nishida wrote:
> +#if defined(CONFIG_KVM_VIRT_SUSPEND_TIMING) || \
> +	defined(CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST)
> +#define VIRT_SUSPEND_TIMING_VECTOR	0xec
> +#endif

No need to use a new vector.  You can rename the existing 
MSR_KVM_ASYNC_PF_INT to MSR_KVM_HYPERVISOR_CALLBACK_INT or something 
like that, and add the code to sysvec_kvm_asyncpf_interrupt.

> +static void kvm_make_suspend_time_interrupt(struct kvm_vcpu *vcpu)
> +{
> +	kvm_queue_interrupt(vcpu, VIRT_SUSPEND_TIMING_VECTOR, false);
> +	kvm_make_request(KVM_REQ_EVENT, vcpu);
> +}

Use kvm_apic_set_irq which will inject the interrupt as soon as 
possible, so you do not even need to check 
kvm_vcpu_ready_for_interrupt_injection.

> +#ifdef CONFIG_KVM_VIRT_SUSPEND_TIMING
> +		if (kvm->suspend_injection_requested &&
> +			kvm_vcpu_ready_for_interrupt_injection(vcpu)) {
> +			kvm_write_suspend_time(kvm);
> +			kvm_make_suspend_time_interrupt(vcpu);
> +			kvm->suspend_injection_requested = false;
> +		}
> +#endif

Do not read variables in vcpu_run; There is KVM_REQ_* if you need to do 
work in the vCPU run loop.

> +	mutex_lock(&kvm_lock);
> +	list_for_each_entry(kvm, &vm_list, vm_list) {
> +		if (!(kvm->arch.msr_suspend_time & KVM_MSR_ENABLED))
> +			continue;
> +
> +		kvm_for_each_vcpu(i, vcpu, kvm)
> +			vcpu->arch.tsc_offset_adjustment -= adj;
> +
> +		/*
> +		 * Move the offset of kvm_clock here as if it is stopped
> +		 * during the suspension.
> +		 */
> +		kvm->arch.kvmclock_offset -= suspend_time_ns;
> +
> +		/* suspend_time is accumulated per VM. */
> +		kvm->suspend_time_ns += suspend_time_ns;
> +		kvm->suspend_injection_requested = true;
> +		/*
> +		 * This adjustment will be reflected to the struct provided
> +		 * from the guest via MSR_KVM_HOST_SUSPEND_TIME before
> +		 * the notification interrupt is injected.
> +		 */
> +	}
> +	mutex_unlock(&kvm_lock);
> +}

As pointed out by Thomas, this should be a work item.

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 3/4] x86/kvm: Add host side support for virtual suspend time injection
  2021-08-14  7:22   ` Paolo Bonzini
@ 2021-08-18  9:32     ` Vitaly Kuznetsov
  2021-08-18 20:49       ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Vitaly Kuznetsov @ 2021-08-18  9:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: suleiman, Borislav Petkov, H. Peter Anvin, Ingo Molnar,
	Jim Mattson, Joerg Roedel, John Stultz, Sean Christopherson,
	Stephen Boyd, Wanpeng Li, kvm, x86, Hikaru Nishida, linux-kernel,
	dme, tglx, mlevitsk

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 06/08/21 12:07, Hikaru Nishida wrote:
>> +#if defined(CONFIG_KVM_VIRT_SUSPEND_TIMING) || \
>> +	defined(CONFIG_KVM_VIRT_SUSPEND_TIMING_GUEST)
>> +#define VIRT_SUSPEND_TIMING_VECTOR	0xec
>> +#endif
>
> No need to use a new vector.  You can rename the existing 
> MSR_KVM_ASYNC_PF_INT to MSR_KVM_HYPERVISOR_CALLBACK_INT or something 
> like that, and add the code to sysvec_kvm_asyncpf_interrupt.

On the host side, I'd vote for keeping MSR_KVM_ASYNC_PF_INT for async PF
mechanism only for two reasons:
- We may want to use (currently reserved) upper 56 bits of it for new
asyncPF related features (e.g. flags) and it would be unnatural to add
them to 'MSR_KVM_HYPERVISOR_CALLBACK_INT'
- We should probably leave it to the guest if it wants to share 'suspend
time' notification interrupt with async PF (and if it actually wants to
get one/another).

On the guest side, it is perfectly fine to reuse
HYPERVISOR_CALLBACK_VECTOR for everything.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v2 PATCH 3/4] x86/kvm: Add host side support for virtual suspend time injection
  2021-08-18  9:32     ` Vitaly Kuznetsov
@ 2021-08-18 20:49       ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-08-18 20:49 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: suleiman, Borislav Petkov, H. Peter Anvin, Ingo Molnar,
	Jim Mattson, Joerg Roedel, John Stultz, Sean Christopherson,
	Stephen Boyd, Wanpeng Li, kvm, x86, Hikaru Nishida, linux-kernel,
	dme, tglx, mlevitsk

On 18/08/21 11:32, Vitaly Kuznetsov wrote:
> On the host side, I'd vote for keeping MSR_KVM_ASYNC_PF_INT for async PF
> mechanism only for two reasons:
> - We may want to use (currently reserved) upper 56 bits of it for new
> asyncPF related features (e.g. flags) and it would be unnatural to add
> them to 'MSR_KVM_HYPERVISOR_CALLBACK_INT'
> - We should probably leave it to the guest if it wants to share 'suspend
> time' notification interrupt with async PF (and if it actually wants to
> get one/another).

I agree that it's fine either way.  That said, more MSRs are more 
complexity and more opportunity for getting things wrong (in either KVM 
or userspace---for example, migration).  There are still 14 free bits in 
MSR_ASYNC_PF_EN (bits 63-52 and 5-4, so it should not be a problem to 
repurpose MSR_ASYNC_PF_INT.

Paolo

> On the guest side, it is perfectly fine to reuse
> HYPERVISOR_CALLBACK_VECTOR for everything.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-08-18 20:49 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-06 10:07 [v2 PATCH 0/4] x86/kvm: Virtual suspend time injection support Hikaru Nishida
2021-08-06 10:07 ` [v2 PATCH 1/4] x86/kvm: Reserve KVM_FEATURE_HOST_SUSPEND_TIME and MSR_KVM_HOST_SUSPEND_TIME Hikaru Nishida
2021-08-06 10:07 ` [v2 PATCH 2/4] x86/kvm: Add definitions for virtual suspend time injection Hikaru Nishida
2021-08-10 15:14   ` Thomas Gleixner
2021-08-13 18:36   ` Guenter Roeck
2021-08-06 10:07 ` [v2 PATCH 3/4] x86/kvm: Add host side support " Hikaru Nishida
2021-08-10 15:21   ` Thomas Gleixner
2021-08-10 15:24   ` Thomas Gleixner
2021-08-14  7:22   ` Paolo Bonzini
2021-08-18  9:32     ` Vitaly Kuznetsov
2021-08-18 20:49       ` Paolo Bonzini
2021-08-06 10:07 ` [v2 PATCH 4/4] x86/kvm: Add guest " Hikaru Nishida
2021-08-10 15:48   ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).