linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
@ 2021-08-04  8:57 Oliver Upton
  2021-08-04  8:57 ` [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK Oliver Upton
                   ` (22 more replies)
  0 siblings, 23 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:57 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

KVM's current means of saving/restoring system counters is plagued with
temporal issues. At least on ARM64 and x86, we migrate the guest's
system counter by-value through the respective guest system register
values (cntvct_el0, ia32_tsc). Restoring system counters by-value is
brittle as the state is not idempotent: the host system counter is still
ticking along between the attempted save and restore. Furthermore, VMMs
may wish to transparently live migrate guest VMs, meaning that they must
account for the time elapsed during the live migration blackout in the
guest's view of the system counter. The VMM thread could be preempted
for any number of reasons (scheduler, L0 hypervisor under nested
virtualization) between the time that it calculates the desired guest
counter value and the time KVM actually sets this counter state.

Despite the value-based interface that we present to userspace, KVM
actually has idempotent guest controls by way of system counter offsets.
We can avoid all of the issues associated with a value-based interface
by abstracting these offset controls in new ioctls. This series
introduces new vCPU device attributes to provide userspace access to the
vCPU's system counter offset.
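
For illustration, userspace could drive the new attribute roughly as
follows (a sketch against the uapi added in patch 5, with error handling
elided; not code from this series):

  #include <linux/kvm.h>
  #include <sys/ioctl.h>

  /* Read a vCPU's TSC offset; idempotent regardless of when it runs. */
  static __u64 vcpu_get_tsc_offset(int vcpu_fd)
  {
          __u64 offset;
          struct kvm_device_attr attr = {
                  .group = KVM_VCPU_TSC_CTRL,
                  .attr  = KVM_VCPU_TSC_OFFSET,
                  .addr  = (__u64)(unsigned long)&offset,
          };

          ioctl(vcpu_fd, KVM_GET_DEVICE_ATTR, &attr);
          return offset;
  }

  /* Write a vCPU's TSC offset, e.g. after adjusting for blackout time. */
  static void vcpu_set_tsc_offset(int vcpu_fd, __u64 offset)
  {
          struct kvm_device_attr attr = {
                  .group = KVM_VCPU_TSC_CTRL,
                  .attr  = KVM_VCPU_TSC_OFFSET,
                  .addr  = (__u64)(unsigned long)&offset,
          };

          ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
  }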

Patch 1 addresses a possible race in KVM_GET_CLOCK where
use_master_clock is read outside of the pvclock_gtod_sync_lock.

Patch 2 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK
ioctls to provide userspace with a (host_tsc, realtime) instant. This is
essential for a VMM to perform precise migration of the guest's system
counters.
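
Concretely, the capture step on the source might look like this (a
sketch assuming the extended struct kvm_clock_data from patch 2; return
values unchecked):

  struct kvm_clock_data data = { 0 };
  __u64 k0, r0, t0;

  ioctl(vm_fd, KVM_GET_CLOCK, &data);
  k0 = data.clock;                      /* kvmclock nanoseconds */
  if (data.flags & KVM_CLOCK_REALTIME)
          r0 = data.realtime;           /* CLOCK_REALTIME ns, same instant */
  if (data.flags & KVM_CLOCK_HOST_TSC)
          t0 = data.host_tsc;           /* host TSC, same instant */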

Patches 3-4 are preparatory changes for exposing the TSC offset to
userspace. Patch 5 adds a vCPU attribute that gives userspace access to
the TSC offset.

Patches 6-7 implement a test for the new additions to
KVM_{GET,SET}_CLOCK.

Patch 8 fixes some assertions in the kvm device attribute helpers.

Patches 9-10 implement a test for the TSC offset attribute introduced in
patch 5.

Patches 11-12 lay the groundwork for patch 13, which exposes CNTVOFF_EL2
through the ONE_REG interface.

Patches 14-15 add test cases for userspace manipulation of the virtual
counter-timer.

Patches 16-17 add a vCPU attribute to adjust the host-guest offset of an
ARM vCPU, but only implement support for ECV hosts. Patches 18-19 add
support for non-ECV hosts by emulating physical counter offsetting.

Patch 20 adds test cases for adjusting the host-guest offset, and
finally patch 21 adds a test to measure the emulation overhead of
CNTPCT_EL0 reads.

This series was tested on both Ampere Mt. Jade and Haswell systems.
Unfortunately, the ECV portions of this series are untested, as no
ECV-capable hardware is available and the ARM fast models only partially
implement ECV.

Physical counter benchmark
--------------------------

The following data was collected by running 10000 iterations of the
benchmark test from patch 21 on an Ampere Mt. Jade reference server, a
2S machine with two 80-core Ampere Altra SoCs. Measurements were
collected for both VHE and nVHE operation using the `kvm-arm.mode=`
command-line parameter.
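
For example, booting the host with `kvm-arm.mode=nvhe` selects the nVHE
configuration measured below.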

nVHE
----

+--------------------+--------+---------+
|       Metric       | Native | Trapped |
+--------------------+--------+---------+
| Average            | 54ns   | 148ns   |
| Standard Deviation | 124ns  | 122ns   |
| 95th Percentile    | 258ns  | 348ns   |
+--------------------+--------+---------+

VHE
---

+--------------------+--------+---------+
|       Metric       | Native | Trapped |
+--------------------+--------+---------+
| Average            | 53ns   | 152ns   |
| Standard Deviation | 92ns   | 94ns    |
| 95th Percentile    | 204ns  | 307ns   |
+--------------------+--------+---------+

This series applies cleanly to kvm/queue at the following commit:

6cd974485e25 ("KVM: selftests: Add a test of an unbacked nested PI descriptor")

v1 -> v2:
  - Reimplemented as vCPU device attributes instead of a distinct ioctl.
  - Added the (realtime, host_tsc) instant support to KVM_{GET,SET}_CLOCK
  - Changed the arm64 implementation to broadcast counter
    offset values to all vCPUs in a guest. This upholds the
    architectural expectations of a consistent counter-timer across CPUs.
  - Fixed a bug with traps in VHE mode. We now configure traps on every
    transition into a guest to handle differing VMs (trapped, emulated).

v2 -> v3:
  - Added documentation for additions to KVM_{GET,SET}_CLOCK
  - Added documentation for all new vCPU attributes
  - Added documentation for suggested algorithm to migrate a guest's
    TSC(s)
  - Bug fixes throughout series
  - Rename KVM_CLOCK_REAL_TIME -> KVM_CLOCK_REALTIME

v3 -> v4:
  - Added patch to address incorrect device helper assertions (Drew)
  - Carried Drew's r-b tags where appropriate
  - x86 selftest cleanup
  - Removed stale kvm_timer_init_vhe() function
  - Removed unnecessary GUEST_DONE() from selftests

v4 -> v5:
  - Fix typo in TSC migration algorithm
  - Carry more of Drew's r-b tags
  - clean up run loop logic in counter emulation benchmark (missed from
    Drew's comments on v3)

v5 -> v6:
  - Add fix for race in KVM_GET_CLOCK (Sean)
  - Fix 32-bit build issues in series + use of uninitialized host tsc
    value (Sean)
  - General style cleanups
  - Rework ARM virtual counter offsetting to match guest behavior. Use
    the ONE_REG interface instead of a VM attribute (Marc)
  - Maintain a single host-guest counter offset, which applies to both
    physical and virtual counters
  - Dropped some of Drew's r-b tags due to nontrivial patch changes
    (sorry for the churn!)

v1: https://lore.kernel.org/kvm/20210608214742.1897483-1-oupton@google.com/
v2: https://lore.kernel.org/r/20210716212629.2232756-1-oupton@google.com
v3: https://lore.kernel.org/r/20210719184949.1385910-1-oupton@google.com
v4: https://lore.kernel.org/r/20210729001012.70394-1-oupton@google.com
v5: https://lore.kernel.org/r/20210729173300.181775-1-oupton@google.com

Oliver Upton (21):
  KVM: x86: Fix potential race in KVM_GET_CLOCK
  KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
  KVM: x86: Take the pvclock sync lock behind the tsc_write_lock
  KVM: x86: Refactor tsc synchronization code
  KVM: x86: Expose TSC offset controls to userspace
  tools: arch: x86: pull in pvclock headers
  selftests: KVM: Add test for KVM_{GET,SET}_CLOCK
  selftests: KVM: Fix kvm device helper ioctl assertions
  selftests: KVM: Add helpers for vCPU device attributes
  selftests: KVM: Introduce system counter offset test
  KVM: arm64: Refactor update_vtimer_cntvoff()
  KVM: arm64: Separate guest/host counter offset values
  KVM: arm64: Allow userspace to configure a vCPU's virtual offset
  selftests: KVM: Add helper to check for register presence
  selftests: KVM: Add support for aarch64 to system_counter_offset_test
  arm64: cpufeature: Enumerate support for Enhanced Counter
    Virtualization
  KVM: arm64: Allow userspace to configure a guest's counter-timer
    offset
  KVM: arm64: Configure timer traps in vcpu_load() for VHE
  KVM: arm64: Emulate physical counter offsetting on non-ECV systems
  selftests: KVM: Test physical counter offsetting
  selftests: KVM: Add counter emulation benchmark

 Documentation/virt/kvm/api.rst                |  52 ++-
 Documentation/virt/kvm/devices/vcpu.rst       |  85 ++++
 Documentation/virt/kvm/locking.rst            |  11 +
 arch/arm64/include/asm/kvm_asm.h              |   2 +
 arch/arm64/include/asm/sysreg.h               |   5 +
 arch/arm64/include/uapi/asm/kvm.h             |   2 +
 arch/arm64/kernel/cpufeature.c                |  10 +
 arch/arm64/kvm/arch_timer.c                   | 224 ++++++++++-
 arch/arm64/kvm/arm.c                          |   4 +-
 arch/arm64/kvm/guest.c                        |   6 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h       |  29 ++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |   6 +
 arch/arm64/kvm/hyp/nvhe/timer-sr.c            |  16 +-
 arch/arm64/kvm/hyp/vhe/timer-sr.c             |   5 +
 arch/arm64/tools/cpucaps                      |   1 +
 arch/x86/include/asm/kvm_host.h               |   4 +
 arch/x86/include/uapi/asm/kvm.h               |   4 +
 arch/x86/kvm/x86.c                            | 364 +++++++++++++-----
 include/clocksource/arm_arch_timer.h          |   1 +
 include/kvm/arm_arch_timer.h                  |   6 +-
 include/uapi/linux/kvm.h                      |   7 +-
 tools/arch/x86/include/asm/pvclock-abi.h      |  48 +++
 tools/arch/x86/include/asm/pvclock.h          | 103 +++++
 tools/testing/selftests/kvm/.gitignore        |   3 +
 tools/testing/selftests/kvm/Makefile          |   4 +
 .../kvm/aarch64/counter_emulation_benchmark.c | 207 ++++++++++
 .../selftests/kvm/include/aarch64/processor.h |  24 ++
 .../testing/selftests/kvm/include/kvm_util.h  |  13 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |  63 ++-
 .../kvm/system_counter_offset_test.c          | 211 ++++++++++
 .../selftests/kvm/x86_64/kvm_clock_test.c     | 204 ++++++++++
 31 files changed, 1581 insertions(+), 143 deletions(-)
 create mode 100644 tools/arch/x86/include/asm/pvclock-abi.h
 create mode 100644 tools/arch/x86/include/asm/pvclock.h
 create mode 100644 tools/testing/selftests/kvm/aarch64/counter_emulation_benchmark.c
 create mode 100644 tools/testing/selftests/kvm/system_counter_offset_test.c
 create mode 100644 tools/testing/selftests/kvm/x86_64/kvm_clock_test.c

-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
@ 2021-08-04  8:57 ` Oliver Upton
  2021-08-11 12:23   ` Paolo Bonzini
  2021-08-04  8:58 ` [PATCH v6 02/21] KVM: x86: Report host tsc and realtime values " Oliver Upton
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:57 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Sean noticed that KVM_GET_CLOCK was checking kvm_arch.use_master_clock
outside of the pvclock sync lock. This is problematic, as the clock
value returned to userspace may or may not actually correspond to a
stable TSC.
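
Roughly, the interleaving being fixed (illustrative only, not code from
this patch):

  KVM_GET_CLOCK                       masterclock update
  -------------                       ------------------
  clock = get_kvmclock_ns(kvm);
                                      use_master_clock = false;
  flags = kvm->arch.use_master_clock ?
                  KVM_CLOCK_TSC_STABLE : 0;

  /* clock was computed with the master clock, yet flags says unstable */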

Fix the race by populating the entire kvm_clock_data structure behind
the pvclock_gtod_sync_lock.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8cdf4ac6990b..34287c522f4e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2782,19 +2782,20 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
 #endif
 }
 
-u64 get_kvmclock_ns(struct kvm *kvm)
+static void get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
 {
 	struct kvm_arch *ka = &kvm->arch;
 	struct pvclock_vcpu_time_info hv_clock;
 	unsigned long flags;
-	u64 ret;
 
 	spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
 	if (!ka->use_master_clock) {
 		spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
-		return get_kvmclock_base_ns() + ka->kvmclock_offset;
+		data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset;
+		return;
 	}
 
+	data->flags |= KVM_CLOCK_TSC_STABLE;
 	hv_clock.tsc_timestamp = ka->master_cycle_now;
 	hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset;
 	spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
@@ -2806,13 +2807,26 @@ u64 get_kvmclock_ns(struct kvm *kvm)
 		kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
 				   &hv_clock.tsc_shift,
 				   &hv_clock.tsc_to_system_mul);
-		ret = __pvclock_read_cycles(&hv_clock, rdtsc());
-	} else
-		ret = get_kvmclock_base_ns() + ka->kvmclock_offset;
+		data->clock = __pvclock_read_cycles(&hv_clock, rdtsc());
+	} else {
+		data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset;
+	}
 
 	put_cpu();
+}
 
-	return ret;
+u64 get_kvmclock_ns(struct kvm *kvm)
+{
+	struct kvm_clock_data data;
+
+	/*
+	 * Zero flags as it's accessed RMW, leave everything else uninitialized
+	 * as clock is always written and no other fields are consumed.
+	 */
+	data.flags = 0;
+
+	get_kvmclock(kvm, &data);
+	return data.clock;
 }
 
 static void kvm_setup_pvclock_page(struct kvm_vcpu *v,
@@ -6104,11 +6118,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 	case KVM_GET_CLOCK: {
 		struct kvm_clock_data user_ns;
-		u64 now_ns;
 
-		now_ns = get_kvmclock_ns(kvm);
-		user_ns.clock = now_ns;
-		user_ns.flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0;
+		/*
+		 * Zero flags as it is accessed RMW, leave everything else
+		 * uninitialized as clock is always written and no other fields
+		 * are consumed.
+		 */
+		user_ns.flags = 0;
+		get_kvmclock(kvm, &user_ns);
 		memset(&user_ns.pad, 0, sizeof(user_ns.pad));
 
 		r = -EFAULT;
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 02/21] KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
  2021-08-04  8:57 ` [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 03/21] KVM: x86: Take the pvclock sync lock behind the tsc_write_lock Oliver Upton
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Handling the migration of TSCs correctly is difficult, in part because
Linux does not provide userspace with the ability to retrieve a (TSC,
realtime) clock pair for a single instant in time. In lieu of a more
convenient facility, KVM can report similar information in the kvm_clock
structure.

Provide userspace with a host TSC & realtime pair iff the realtime clock
is based on the TSC. If userspace provides KVM_SET_CLOCK with a valid
realtime value, advance the KVM clock by the amount of elapsed time. Do
not step the KVM clock backwards, though, as it is a monotonic
oscillator.
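
As a sketch of the restore path (using the k0/r0 pair captured via
KVM_GET_CLOCK on the source; not code from this patch):

  struct kvm_clock_data data = {
          .flags    = KVM_CLOCK_REALTIME,
          .clock    = k0, /* kvmclock ns captured at save time */
          .realtime = r0, /* CLOCK_REALTIME ns captured at save time */
  };

  /*
   * KVM adds (current realtime - r0) to .clock, folding the blackout
   * time into the guest's clock; a negative delta is ignored.
   */
  ioctl(vm_fd, KVM_SET_CLOCK, &data);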

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/api.rst  |  42 ++++++++---
 arch/x86/include/asm/kvm_host.h |   3 +
 arch/x86/kvm/x86.c              | 127 ++++++++++++++++++--------------
 include/uapi/linux/kvm.h        |   7 +-
 4 files changed, 112 insertions(+), 67 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index dae68e68ca23..8d4a3471ad9e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -993,20 +993,34 @@ such as migration.
 When KVM_CAP_ADJUST_CLOCK is passed to KVM_CHECK_EXTENSION, it returns the
 set of bits that KVM can return in struct kvm_clock_data's flag member.
 
-The only flag defined now is KVM_CLOCK_TSC_STABLE.  If set, the returned
-value is the exact kvmclock value seen by all VCPUs at the instant
-when KVM_GET_CLOCK was called.  If clear, the returned value is simply
-CLOCK_MONOTONIC plus a constant offset; the offset can be modified
-with KVM_SET_CLOCK.  KVM will try to make all VCPUs follow this clock,
-but the exact value read by each VCPU could differ, because the host
-TSC is not stable.
+FLAGS:
+
+KVM_CLOCK_TSC_STABLE.  If set, the returned value is the exact kvmclock
+value seen by all VCPUs at the instant when KVM_GET_CLOCK was called.
+If clear, the returned value is simply CLOCK_MONOTONIC plus a constant
+offset; the offset can be modified with KVM_SET_CLOCK.  KVM will try
+to make all VCPUs follow this clock, but the exact value read by each
+VCPU could differ, because the host TSC is not stable.
+
+KVM_CLOCK_REALTIME.  If set, the `realtime` field in the kvm_clock_data
+structure is populated with the value of the host's real time
+clocksource at the instant when KVM_GET_CLOCK was called. If clear,
+the `realtime` field does not contain a value.
+
+KVM_CLOCK_HOST_TSC.  If set, the `host_tsc` field in the kvm_clock_data
+structure is populated with the value of the host's timestamp counter (TSC)
+at the instant when KVM_GET_CLOCK was called. If clear, the `host_tsc` field
+does not contain a value.
 
 ::
 
   struct kvm_clock_data {
 	__u64 clock;  /* kvmclock current value */
 	__u32 flags;
-	__u32 pad[9];
+	__u32 pad0;
+	__u64 realtime;
+	__u64 host_tsc;
+	__u32 pad[4];
   };
 
 
@@ -1023,12 +1037,22 @@ Sets the current timestamp of kvmclock to the value specified in its parameter.
 In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity on scenarios
 such as migration.
 
+FLAGS:
+
+KVM_CLOCK_REALTIME.  If set, KVM will compare the value of the `realtime` field
+with the value of the host's real time clocksource at the instant when
+KVM_SET_CLOCK was called. The difference in elapsed time is added to the final
+kvmclock value that will be provided to guests.
+
 ::
 
   struct kvm_clock_data {
 	__u64 clock;  /* kvmclock current value */
 	__u32 flags;
-	__u32 pad[9];
+	__u32 pad0;
+	__u64 realtime;
+	__u64 host_tsc;
+	__u32 pad[4];
   };
 
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6818095dd157..d6376ca8efce 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1926,4 +1926,7 @@ int kvm_cpu_dirty_log_size(void);
 
 int alloc_all_memslots_rmaps(struct kvm *kvm);
 
+#define KVM_CLOCK_VALID_FLAGS						\
+	(KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 34287c522f4e..26f1fa263192 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2804,10 +2804,20 @@ static void get_kvmclock(struct kvm *kvm, struct kvm_clock_data *data)
 	get_cpu();
 
 	if (__this_cpu_read(cpu_tsc_khz)) {
+#ifdef CONFIG_X86_64
+		struct timespec64 ts;
+
+		if (kvm_get_walltime_and_clockread(&ts, &data->host_tsc)) {
+			data->realtime = ts.tv_nsec + NSEC_PER_SEC * ts.tv_sec;
+			data->flags |= KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC;
+		} else
+#endif
+		data->host_tsc = rdtsc();
+
 		kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
 				   &hv_clock.tsc_shift,
 				   &hv_clock.tsc_to_system_mul);
-		data->clock = __pvclock_read_cycles(&hv_clock, rdtsc());
+		data->clock = __pvclock_read_cycles(&hv_clock, data->host_tsc);
 	} else {
 		data->clock = get_kvmclock_base_ns() + ka->kvmclock_offset;
 	}
@@ -4047,7 +4057,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = KVM_SYNC_X86_VALID_FIELDS;
 		break;
 	case KVM_CAP_ADJUST_CLOCK:
-		r = KVM_CLOCK_TSC_STABLE;
+		r = KVM_CLOCK_VALID_FLAGS;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
 		r |=  KVM_X86_DISABLE_EXITS_HLT | KVM_X86_DISABLE_EXITS_PAUSE |
@@ -5834,6 +5844,60 @@ int kvm_arch_pm_notifier(struct kvm *kvm, unsigned long state)
 }
 #endif /* CONFIG_HAVE_KVM_PM_NOTIFIER */
 
+static int kvm_vm_ioctl_get_clock(struct kvm *kvm, void __user *argp)
+{
+	struct kvm_clock_data data;
+
+	memset(&data, 0, sizeof(data));
+	get_kvmclock(kvm, &data);
+
+	if (copy_to_user(argp, &data, sizeof(data)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
+{
+	struct kvm_arch *ka = &kvm->arch;
+	struct kvm_clock_data data;
+	u64 now_raw_ns;
+
+	if (copy_from_user(&data, argp, sizeof(data)))
+		return -EFAULT;
+
+	if (data.flags & ~KVM_CLOCK_REALTIME)
+		return -EINVAL;
+
+	/*
+	 * TODO: userspace has to take care of races with VCPU_RUN, so
+	 * kvm_gen_update_masterclock() can be cut down to locked
+	 * pvclock_update_vm_gtod_copy().
+	 */
+	kvm_gen_update_masterclock(kvm);
+
+	spin_lock_irq(&ka->pvclock_gtod_sync_lock);
+	if (data.flags & KVM_CLOCK_REALTIME) {
+		u64 now_real_ns = ktime_get_real_ns();
+
+		/*
+		 * Avoid stepping the kvmclock backwards.
+		 */
+		if (now_real_ns > data.realtime)
+			data.clock += now_real_ns - data.realtime;
+	}
+
+	if (ka->use_master_clock)
+		now_raw_ns = ka->master_kernel_ns;
+	else
+		now_raw_ns = get_kvmclock_base_ns();
+	ka->kvmclock_offset = data.clock - now_raw_ns;
+	spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
+	return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
@@ -6077,63 +6141,12 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		break;
 	}
 #endif
-	case KVM_SET_CLOCK: {
-		struct kvm_arch *ka = &kvm->arch;
-		struct kvm_clock_data user_ns;
-		u64 now_ns;
-
-		r = -EFAULT;
-		if (copy_from_user(&user_ns, argp, sizeof(user_ns)))
-			goto out;
-
-		r = -EINVAL;
-		if (user_ns.flags)
-			goto out;
-
-		r = 0;
-		/*
-		 * TODO: userspace has to take care of races with VCPU_RUN, so
-		 * kvm_gen_update_masterclock() can be cut down to locked
-		 * pvclock_update_vm_gtod_copy().
-		 */
-		kvm_gen_update_masterclock(kvm);
-
-		/*
-		 * This pairs with kvm_guest_time_update(): when masterclock is
-		 * in use, we use master_kernel_ns + kvmclock_offset to set
-		 * unsigned 'system_time' so if we use get_kvmclock_ns() (which
-		 * is slightly ahead) here we risk going negative on unsigned
-		 * 'system_time' when 'user_ns.clock' is very small.
-		 */
-		spin_lock_irq(&ka->pvclock_gtod_sync_lock);
-		if (kvm->arch.use_master_clock)
-			now_ns = ka->master_kernel_ns;
-		else
-			now_ns = get_kvmclock_base_ns();
-		ka->kvmclock_offset = user_ns.clock - now_ns;
-		spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
-
-		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
+	case KVM_SET_CLOCK:
+		r = kvm_vm_ioctl_set_clock(kvm, argp);
 		break;
-	}
-	case KVM_GET_CLOCK: {
-		struct kvm_clock_data user_ns;
-
-		/*
-		 * Zero flags as it is accessed RMW, leave everything else
-		 * uninitialized as clock is always written and no other fields
-		 * are consumed.
-		 */
-		user_ns.flags = 0;
-		get_kvmclock(kvm, &user_ns);
-		memset(&user_ns.pad, 0, sizeof(user_ns.pad));
-
-		r = -EFAULT;
-		if (copy_to_user(argp, &user_ns, sizeof(user_ns)))
-			goto out;
-		r = 0;
+	case KVM_GET_CLOCK:
+		r = kvm_vm_ioctl_get_clock(kvm, argp);
 		break;
-	}
 	case KVM_MEMORY_ENCRYPT_OP: {
 		r = -ENOTTY;
 		if (kvm_x86_ops.mem_enc_op)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d9e4aabcb31a..53a49cb8616a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1223,11 +1223,16 @@ struct kvm_irqfd {
 
 /* Do not use 1, KVM_CHECK_EXTENSION returned it before we had flags.  */
 #define KVM_CLOCK_TSC_STABLE		2
+#define KVM_CLOCK_REALTIME		(1 << 2)
+#define KVM_CLOCK_HOST_TSC		(1 << 3)
 
 struct kvm_clock_data {
 	__u64 clock;
 	__u32 flags;
-	__u32 pad[9];
+	__u32 pad0;
+	__u64 realtime;
+	__u64 host_tsc;
+	__u32 pad[4];
 };
 
 /* For KVM_CAP_SW_TLB */
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 03/21] KVM: x86: Take the pvclock sync lock behind the tsc_write_lock
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
  2021-08-04  8:57 ` [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 02/21] KVM: x86: Report host tsc and realtime values " Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 04/21] KVM: x86: Refactor tsc synchronization code Oliver Upton
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

A later change requires that the pvclock sync lock be taken while
holding the tsc_write_lock. Make that locking change in
kvm_synchronize_tsc() now, so that it is isolated in its own commit.

Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/locking.rst | 11 +++++++++++
 arch/x86/kvm/x86.c                 |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 8138201efb09..0bf346adac2a 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -36,6 +36,9 @@ On x86:
   holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise
   there's no need to take kvm->arch.tdp_mmu_pages_lock at all).
 
+- kvm->arch.tsc_write_lock is taken outside
+  kvm->arch.pvclock_gtod_sync_lock
+
 Everything else is a leaf: no other lock is taken inside the critical
 sections.
 
@@ -222,6 +225,14 @@ time it will be set using the Dirty tracking mechanism described above.
 :Comment:	'raw' because hardware enabling/disabling must be atomic /wrt
 		migration.
 
+:Name:		kvm_arch::pvclock_gtod_sync_lock
+:Type:		raw_spinlock_t
+:Arch:		x86
+:Protects:	kvm_arch::{cur_tsc_generation,cur_tsc_nsec,cur_tsc_write,
+			cur_tsc_offset,nr_vcpus_matched_tsc}
+:Comment:	'raw' because updating the kvm master clock must not be
+		preempted.
+
 :Name:		kvm_arch::tsc_write_lock
 :Type:		raw_spinlock
 :Arch:		x86
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26f1fa263192..93b449761fbe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2529,7 +2529,6 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
 	vcpu->arch.this_tsc_write = kvm->arch.cur_tsc_write;
 
 	kvm_vcpu_write_tsc_offset(vcpu, offset);
-	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 
 	spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags);
 	if (!matched) {
@@ -2540,6 +2539,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
 
 	kvm_track_tsc_matching(vcpu);
 	spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags);
+	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 }
 
 static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 04/21] KVM: x86: Refactor tsc synchronization code
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (2 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 03/21] KVM: x86: Take the pvclock sync lock behind the tsc_write_lock Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 05/21] KVM: x86: Expose TSC offset controls to userspace Oliver Upton
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Refactor kvm_synchronize_tsc, extracting a new function that allows
callers to specify the TSC parameters (offset, value, nanoseconds, etc.)
explicitly in order to participate in TSC synchronization.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/x86.c | 105 ++++++++++++++++++++++++++-------------------
 1 file changed, 61 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 93b449761fbe..91aea751d621 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2443,13 +2443,71 @@ static inline bool kvm_check_tsc_unstable(void)
 	return check_tsc_unstable();
 }
 
+/*
+ * Infers attempts to synchronize the guest's tsc from host writes. Sets the
+ * offset for the vcpu and tracks the TSC matching generation that the vcpu
+ * participates in.
+ */
+static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
+				  u64 ns, bool matched)
+{
+	struct kvm *kvm = vcpu->kvm;
+	bool already_matched;
+
+	lockdep_assert_held(&kvm->arch.tsc_write_lock);
+
+	already_matched =
+	       (vcpu->arch.this_tsc_generation == kvm->arch.cur_tsc_generation);
+
+	/*
+	 * We track the most recent recorded KHZ, write and time to
+	 * allow the matching interval to be extended at each write.
+	 */
+	kvm->arch.last_tsc_nsec = ns;
+	kvm->arch.last_tsc_write = tsc;
+	kvm->arch.last_tsc_khz = vcpu->arch.virtual_tsc_khz;
+
+	vcpu->arch.last_guest_tsc = tsc;
+
+	/* Keep track of which generation this VCPU has synchronized to */
+	vcpu->arch.this_tsc_generation = kvm->arch.cur_tsc_generation;
+	vcpu->arch.this_tsc_nsec = kvm->arch.cur_tsc_nsec;
+	vcpu->arch.this_tsc_write = kvm->arch.cur_tsc_write;
+
+	kvm_vcpu_write_tsc_offset(vcpu, offset);
+
+	if (!matched) {
+		/*
+		 * We split periods of matched TSC writes into generations.
+		 * For each generation, we track the original measured
+		 * nanosecond time, offset, and write, so if TSCs are in
+		 * sync, we can match exact offset, and if not, we can match
+		 * exact software computation in compute_guest_tsc()
+		 *
+		 * These values are tracked in kvm->arch.cur_xxx variables.
+		 */
+		kvm->arch.cur_tsc_generation++;
+		kvm->arch.cur_tsc_nsec = ns;
+		kvm->arch.cur_tsc_write = tsc;
+		kvm->arch.cur_tsc_offset = offset;
+
+		spin_lock(&kvm->arch.pvclock_gtod_sync_lock);
+		kvm->arch.nr_vcpus_matched_tsc = 0;
+	} else if (!already_matched) {
+		spin_lock(&kvm->arch.pvclock_gtod_sync_lock);
+		kvm->arch.nr_vcpus_matched_tsc++;
+	}
+
+	kvm_track_tsc_matching(vcpu);
+	spin_unlock(&kvm->arch.pvclock_gtod_sync_lock);
+}
+
 static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
 {
 	struct kvm *kvm = vcpu->kvm;
 	u64 offset, ns, elapsed;
 	unsigned long flags;
-	bool matched;
-	bool already_matched;
+	bool matched = false;
 	bool synchronizing = false;
 
 	raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
@@ -2495,50 +2553,9 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
 			offset = kvm_compute_l1_tsc_offset(vcpu, data);
 		}
 		matched = true;
-		already_matched = (vcpu->arch.this_tsc_generation == kvm->arch.cur_tsc_generation);
-	} else {
-		/*
-		 * We split periods of matched TSC writes into generations.
-		 * For each generation, we track the original measured
-		 * nanosecond time, offset, and write, so if TSCs are in
-		 * sync, we can match exact offset, and if not, we can match
-		 * exact software computation in compute_guest_tsc()
-		 *
-		 * These values are tracked in kvm->arch.cur_xxx variables.
-		 */
-		kvm->arch.cur_tsc_generation++;
-		kvm->arch.cur_tsc_nsec = ns;
-		kvm->arch.cur_tsc_write = data;
-		kvm->arch.cur_tsc_offset = offset;
-		matched = false;
 	}
 
-	/*
-	 * We also track th most recent recorded KHZ, write and time to
-	 * allow the matching interval to be extended at each write.
-	 */
-	kvm->arch.last_tsc_nsec = ns;
-	kvm->arch.last_tsc_write = data;
-	kvm->arch.last_tsc_khz = vcpu->arch.virtual_tsc_khz;
-
-	vcpu->arch.last_guest_tsc = data;
-
-	/* Keep track of which generation this VCPU has synchronized to */
-	vcpu->arch.this_tsc_generation = kvm->arch.cur_tsc_generation;
-	vcpu->arch.this_tsc_nsec = kvm->arch.cur_tsc_nsec;
-	vcpu->arch.this_tsc_write = kvm->arch.cur_tsc_write;
-
-	kvm_vcpu_write_tsc_offset(vcpu, offset);
-
-	spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags);
-	if (!matched) {
-		kvm->arch.nr_vcpus_matched_tsc = 0;
-	} else if (!already_matched) {
-		kvm->arch.nr_vcpus_matched_tsc++;
-	}
-
-	kvm_track_tsc_matching(vcpu);
-	spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags);
+	__kvm_synchronize_tsc(vcpu, offset, data, ns, matched);
 	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 }
 
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 05/21] KVM: x86: Expose TSC offset controls to userspace
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (3 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 04/21] KVM: x86: Refactor tsc synchronization code Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 06/21] tools: arch: x86: pull in pvclock headers Oliver Upton
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

To date, VMM-directed TSC synchronization and migration have been a bit
messy. KVM has some baked-in heuristics around TSC writes to infer
whether the VMM is attempting to synchronize. This is problematic, as it
depends on host userspace writing to the guest's TSC within 1 second of
the last write.

A much cleaner approach to configuring the guest's views of the TSC is to
simply migrate the TSC offset for every vCPU. Offsets are idempotent,
and thus not subject to change depending on when the VMM actually
reads/writes values from/to KVM. The VMM can then read the TSC once with
KVM_GET_CLOCK to capture a (realtime, host_tsc) pair at the instant when
the guest is paused.
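
The documentation added below spells out the full migration algorithm;
one hedged way to write the per-vCPU fixup from its step 6 (note that
KVM_GET_TSC_KHZ reports kHz, so the (k_1 - k_0) * freq term needs a
ns-to-cycles conversion):

  /*
   * t0/k0: host TSC and kvmclock ns captured on the source,
   * t1/k1: the same pair captured on the destination,
   * off:   the vCPU's TSC offset as read on the source,
   * khz:   guest TSC frequency from KVM_GET_TSC_KHZ.
   */
  __u64 elapsed_cycles = (k1 - k0) * khz / 1000000ULL; /* ns*kHz -> cycles */
  __u64 new_off = t0 + off + elapsed_cycles - t1;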

Cc: David Matlack <dmatlack@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/devices/vcpu.rst |  57 +++++++++++++
 arch/x86/include/asm/kvm_host.h         |   1 +
 arch/x86/include/uapi/asm/kvm.h         |   4 +
 arch/x86/kvm/x86.c                      | 109 ++++++++++++++++++++++++
 4 files changed, 171 insertions(+)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 2acec3b9ef65..3b399d727c11 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -161,3 +161,60 @@ Specifies the base address of the stolen time structure for this VCPU. The
 base address must be 64 byte aligned and exist within a valid guest memory
 region. See Documentation/virt/kvm/arm/pvtime.rst for more information
 including the layout of the stolen time structure.
+
+4. GROUP: KVM_VCPU_TSC_CTRL
+===========================
+
+:Architectures: x86
+
+4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
+
+:Parameters: 64-bit unsigned TSC offset
+
+Returns:
+
+	 ======= ======================================
+	 -EFAULT Error reading/writing the provided
+		 parameter address.
+	 -ENXIO  Attribute not supported
+	 ======= ======================================
+
+Specifies the guest's TSC offset relative to the host's TSC. The guest's
+TSC is then derived by the following equation:
+
+  guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET
+
+This attribute is useful for the precise migration of a guest's TSC. The
+following describes a possible algorithm to use for the migration of a
+guest's TSC:
+
+From the source VMM process:
+
+1. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (t_0),
+   kvmclock nanoseconds (k_0), and realtime nanoseconds (r_0).
+
+2. Read the KVM_VCPU_TSC_OFFSET attribute for every vCPU to record the
+   guest TSC offset (off_n).
+
+3. Invoke the KVM_GET_TSC_KHZ ioctl to record the frequency of the
+   guest's TSC (freq).
+
+From the destination VMM process:
+
+4. Invoke the KVM_SET_CLOCK ioctl, providing the kvmclock nanoseconds
+   (k_0) and realtime nanoseconds (r_0) in their respective fields.
+   Ensure that the KVM_CLOCK_REALTIME flag is set in the provided
+   structure. KVM will advance the VM's kvmclock to account for elapsed
+   time since recording the clock values.
+
+5. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (t_1) and
+   kvmclock nanoseconds (k_1).
+
+6. Adjust the guest TSC offsets for every vCPU to account for (1) time
+   elapsed since recording state and (2) difference in TSCs between the
+   source and destination machine:
+
+   new_off_n = t_0 + off_n + (k_1 - k_0) * freq - t_1
+
+7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
+   respective value derived in the previous step.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d6376ca8efce..e9bfc00692fb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1064,6 +1064,7 @@ struct kvm_arch {
 	u64 last_tsc_nsec;
 	u64 last_tsc_write;
 	u32 last_tsc_khz;
+	u64 last_tsc_offset;
 	u64 cur_tsc_nsec;
 	u64 cur_tsc_write;
 	u64 cur_tsc_offset;
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index a6c327f8ad9e..0b22e1e84e78 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -503,4 +503,8 @@ struct kvm_pmu_event_filter {
 #define KVM_PMU_EVENT_ALLOW 0
 #define KVM_PMU_EVENT_DENY 1
 
+/* for KVM_{GET,SET,HAS}_DEVICE_ATTR */
+#define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
+#define   KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 91aea751d621..3fdad71e5a36 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2466,6 +2466,7 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
 	kvm->arch.last_tsc_nsec = ns;
 	kvm->arch.last_tsc_write = tsc;
 	kvm->arch.last_tsc_khz = vcpu->arch.virtual_tsc_khz;
+	kvm->arch.last_tsc_offset = offset;
 
 	vcpu->arch.last_guest_tsc = tsc;
 
@@ -4924,6 +4925,109 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int kvm_arch_tsc_has_attr(struct kvm_vcpu *vcpu,
+				 struct kvm_device_attr *attr)
+{
+	int r;
+
+	switch (attr->attr) {
+	case KVM_VCPU_TSC_OFFSET:
+		r = 0;
+		break;
+	default:
+		r = -ENXIO;
+	}
+
+	return r;
+}
+
+static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
+				 struct kvm_device_attr *attr)
+{
+	u64 __user *uaddr = (u64 __user *)attr->addr;
+	int r;
+
+	switch (attr->attr) {
+	case KVM_VCPU_TSC_OFFSET:
+		r = -EFAULT;
+		if (put_user(vcpu->arch.l1_tsc_offset, uaddr))
+			break;
+		r = 0;
+		break;
+	default:
+		r = -ENXIO;
+	}
+
+	return r;
+}
+
+static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
+				 struct kvm_device_attr *attr)
+{
+	u64 __user *uaddr = (u64 __user *)attr->addr;
+	struct kvm *kvm = vcpu->kvm;
+	int r;
+
+	switch (attr->attr) {
+	case KVM_VCPU_TSC_OFFSET: {
+		u64 offset, tsc, ns;
+		unsigned long flags;
+		bool matched;
+
+		r = -EFAULT;
+		if (get_user(offset, uaddr))
+			break;
+
+		raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
+
+		matched = (vcpu->arch.virtual_tsc_khz &&
+			   kvm->arch.last_tsc_khz == vcpu->arch.virtual_tsc_khz &&
+			   kvm->arch.last_tsc_offset == offset);
+
+		tsc = kvm_scale_tsc(vcpu, rdtsc(), vcpu->arch.l1_tsc_scaling_ratio) + offset;
+		ns = get_kvmclock_base_ns();
+
+		__kvm_synchronize_tsc(vcpu, offset, tsc, ns, matched);
+		raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
+
+		r = 0;
+		break;
+	}
+	default:
+		r = -ENXIO;
+	}
+
+	return r;
+}
+
+static int kvm_vcpu_ioctl_device_attr(struct kvm_vcpu *vcpu,
+				      unsigned int ioctl,
+				      void __user *argp)
+{
+	struct kvm_device_attr attr;
+	int r;
+
+	if (copy_from_user(&attr, argp, sizeof(attr)))
+		return -EFAULT;
+
+	if (attr.group != KVM_VCPU_TSC_CTRL)
+		return -ENXIO;
+
+	switch (ioctl) {
+	case KVM_HAS_DEVICE_ATTR:
+		r = kvm_arch_tsc_has_attr(vcpu, &attr);
+		break;
+	case KVM_GET_DEVICE_ATTR:
+		r = kvm_arch_tsc_get_attr(vcpu, &attr);
+		break;
+	case KVM_SET_DEVICE_ATTR:
+		r = kvm_arch_tsc_set_attr(vcpu, &attr);
+		break;
+	}
+
+	return r;
+}
+
 static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 				     struct kvm_enable_cap *cap)
 {
@@ -5378,6 +5482,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = __set_sregs2(vcpu, u.sregs2);
 		break;
 	}
+	case KVM_HAS_DEVICE_ATTR:
+	case KVM_GET_DEVICE_ATTR:
+	case KVM_SET_DEVICE_ATTR:
+		r = kvm_vcpu_ioctl_device_attr(vcpu, ioctl, argp);
+		break;
 	default:
 		r = -EINVAL;
 	}
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 06/21] tools: arch: x86: pull in pvclock headers
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (4 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 05/21] KVM: x86: Expose TSC offset controls to userspace Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 07/21] selftests: KVM: Add test for KVM_{GET,SET}_CLOCK Oliver Upton
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Copy over approximately clean versions of the pvclock headers into
tools, reconciling them by dropping the headers/symbols that are missing
in tools and unneeded here.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/arch/x86/include/asm/pvclock-abi.h |  48 +++++++++++
 tools/arch/x86/include/asm/pvclock.h     | 103 +++++++++++++++++++++++
 2 files changed, 151 insertions(+)
 create mode 100644 tools/arch/x86/include/asm/pvclock-abi.h
 create mode 100644 tools/arch/x86/include/asm/pvclock.h

diff --git a/tools/arch/x86/include/asm/pvclock-abi.h b/tools/arch/x86/include/asm/pvclock-abi.h
new file mode 100644
index 000000000000..1436226efe3e
--- /dev/null
+++ b/tools/arch/x86/include/asm/pvclock-abi.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PVCLOCK_ABI_H
+#define _ASM_X86_PVCLOCK_ABI_H
+#ifndef __ASSEMBLY__
+
+/*
+ * These structs MUST NOT be changed.
+ * They are the ABI between hypervisor and guest OS.
+ * Both Xen and KVM are using this.
+ *
+ * pvclock_vcpu_time_info holds the system time and the tsc timestamp
+ * of the last update. So the guest can use the tsc delta to get a
+ * more precise system time.  There is one per virtual cpu.
+ *
+ * pvclock_wall_clock references the point in time when the system
+ * time was zero (usually boot time), thus the guest calculates the
+ * current wall clock by adding the system time.
+ *
+ * Protocol for the "version" fields is: hypervisor raises it (making
+ * it uneven) before it starts updating the fields and raises it again
+ * (making it even) when it is done.  Thus the guest can make sure the
+ * time values it got are consistent by checking the version before
+ * and after reading them.
+ */
+
+struct pvclock_vcpu_time_info {
+	u32   version;
+	u32   pad0;
+	u64   tsc_timestamp;
+	u64   system_time;
+	u32   tsc_to_system_mul;
+	s8    tsc_shift;
+	u8    flags;
+	u8    pad[2];
+} __attribute__((__packed__)); /* 32 bytes */
+
+struct pvclock_wall_clock {
+	u32   version;
+	u32   sec;
+	u32   nsec;
+} __attribute__((__packed__));
+
+#define PVCLOCK_TSC_STABLE_BIT	(1 << 0)
+#define PVCLOCK_GUEST_STOPPED	(1 << 1)
+/* PVCLOCK_COUNTS_FROM_ZERO broke ABI and can't be used anymore. */
+#define PVCLOCK_COUNTS_FROM_ZERO (1 << 2)
+#endif /* __ASSEMBLY__ */
+#endif /* _ASM_X86_PVCLOCK_ABI_H */
diff --git a/tools/arch/x86/include/asm/pvclock.h b/tools/arch/x86/include/asm/pvclock.h
new file mode 100644
index 000000000000..2628f9a6330b
--- /dev/null
+++ b/tools/arch/x86/include/asm/pvclock.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PVCLOCK_H
+#define _ASM_X86_PVCLOCK_H
+
+#include <asm/barrier.h>
+#include <asm/pvclock-abi.h>
+
+/* some helper functions for xen and kvm pv clock sources */
+u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
+u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src);
+void pvclock_set_flags(u8 flags);
+unsigned long pvclock_tsc_khz(struct pvclock_vcpu_time_info *src);
+void pvclock_resume(void);
+
+void pvclock_touch_watchdogs(void);
+
+static __always_inline
+unsigned pvclock_read_begin(const struct pvclock_vcpu_time_info *src)
+{
+	unsigned version = src->version & ~1;
+	/* Make sure that the version is read before the data. */
+	rmb();
+	return version;
+}
+
+static __always_inline
+bool pvclock_read_retry(const struct pvclock_vcpu_time_info *src,
+			unsigned version)
+{
+	/* Make sure that the version is re-read after the data. */
+	rmb();
+	return version != src->version;
+}
+
+/*
+ * Scale a 64-bit delta by scaling and multiplying by a 32-bit fraction,
+ * yielding a 64-bit result.
+ */
+static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
+{
+	u64 product;
+#ifdef __i386__
+	u32 tmp1, tmp2;
+#else
+	unsigned long tmp;
+#endif
+
+	if (shift < 0)
+		delta >>= -shift;
+	else
+		delta <<= shift;
+
+#ifdef __i386__
+	__asm__ (
+		"mul  %5       ; "
+		"mov  %4,%%eax ; "
+		"mov  %%edx,%4 ; "
+		"mul  %5       ; "
+		"xor  %5,%5    ; "
+		"add  %4,%%eax ; "
+		"adc  %5,%%edx ; "
+		: "=A" (product), "=r" (tmp1), "=r" (tmp2)
+		: "a" ((u32)delta), "1" ((u32)(delta >> 32)), "2" (mul_frac) );
+#elif defined(__x86_64__)
+	__asm__ (
+		"mulq %[mul_frac] ; shrd $32, %[hi], %[lo]"
+		: [lo]"=a"(product),
+		  [hi]"=d"(tmp)
+		: "0"(delta),
+		  [mul_frac]"rm"((u64)mul_frac));
+#else
+#error implement me!
+#endif
+
+	return product;
+}
+
+static __always_inline
+u64 __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src, u64 tsc)
+{
+	u64 delta = tsc - src->tsc_timestamp;
+	u64 offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
+					     src->tsc_shift);
+	return src->system_time + offset;
+}
+
+struct pvclock_vsyscall_time_info {
+	struct pvclock_vcpu_time_info pvti;
+} __attribute__((__aligned__(64)));
+
+#define PVTI_SIZE sizeof(struct pvclock_vsyscall_time_info)
+
+#ifdef CONFIG_PARAVIRT_CLOCK
+void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti);
+struct pvclock_vsyscall_time_info *pvclock_get_pvti_cpu0_va(void);
+#else
+static inline struct pvclock_vsyscall_time_info *pvclock_get_pvti_cpu0_va(void)
+{
+	return NULL;
+}
+#endif
+
+#endif /* _ASM_X86_PVCLOCK_H */
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 07/21] selftests: KVM: Add test for KVM_{GET,SET}_CLOCK
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (5 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 06/21] tools: arch: x86: pull in pvclock headers Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 08/21] selftests: KVM: Fix kvm device helper ioctl assertions Oliver Upton
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Add a selftest for the newly introduced KVM clock UAPI. Ensure that the
KVM clock is consistent between userspace and the guest, and that a
difference in realtime can only ever cause the KVM clock to advance
forward.

Cc: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/include/kvm_util.h  |   2 +
 .../selftests/kvm/x86_64/kvm_clock_test.c     | 204 ++++++++++++++++++
 4 files changed, 208 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/kvm_clock_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 36896d251977..958a809c8de4 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -11,6 +11,7 @@
 /x86_64/emulator_error_test
 /x86_64/get_cpuid_test
 /x86_64/get_msr_index_features
+/x86_64/kvm_clock_test
 /x86_64/kvm_pv_test
 /x86_64/hyperv_clock
 /x86_64/hyperv_cpuid
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index c103873531e0..0f94b18b33ce 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -46,6 +46,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/get_cpuid_test
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_clock
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
 TEST_GEN_PROGS_x86_64 += x86_64/hyperv_features
+TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
 TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
 TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
 TEST_GEN_PROGS_x86_64 += x86_64/mmu_role_test
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 010b59b13917..a8ac5d52e17b 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -19,6 +19,8 @@
 #define KVM_DEV_PATH "/dev/kvm"
 #define KVM_MAX_VCPUS 512
 
+#define NSEC_PER_SEC 1000000000L
+
 /*
  * Callers of kvm_util only have an incomplete/opaque description of the
  * structure kvm_util is using to maintain the state of a VM.
diff --git a/tools/testing/selftests/kvm/x86_64/kvm_clock_test.c b/tools/testing/selftests/kvm/x86_64/kvm_clock_test.c
new file mode 100644
index 000000000000..e0dcc27ae9f1
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/kvm_clock_test.c
@@ -0,0 +1,204 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2021, Google LLC.
+ *
+ * Tests for adjusting the KVM clock from userspace
+ */
+#include <asm/kvm_para.h>
+#include <asm/pvclock.h>
+#include <asm/pvclock-abi.h>
+#include <stdint.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <time.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID 0
+
+struct test_case {
+	uint64_t kvmclock_base;
+	int64_t realtime_offset;
+};
+
+static struct test_case test_cases[] = {
+	{ .kvmclock_base = 0 },
+	{ .kvmclock_base = 180 * NSEC_PER_SEC },
+	{ .kvmclock_base = 0, .realtime_offset = -180 * NSEC_PER_SEC },
+	{ .kvmclock_base = 0, .realtime_offset = 180 * NSEC_PER_SEC },
+};
+
+#define GUEST_SYNC_CLOCK(__stage, __val)			\
+		GUEST_SYNC_ARGS(__stage, __val, 0, 0, 0)
+
+static void guest_main(vm_paddr_t pvti_pa, struct pvclock_vcpu_time_info *pvti)
+{
+	int i;
+
+	wrmsr(MSR_KVM_SYSTEM_TIME_NEW, pvti_pa | KVM_MSR_ENABLED);
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++)
+		GUEST_SYNC_CLOCK(i, __pvclock_read_cycles(pvti, rdtsc()));
+}
+
+#define EXPECTED_FLAGS (KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
+
+static inline void assert_flags(struct kvm_clock_data *data)
+{
+	TEST_ASSERT((data->flags & EXPECTED_FLAGS) == EXPECTED_FLAGS,
+		    "unexpected clock data flags: %x (want set: %x)",
+		    data->flags, EXPECTED_FLAGS);
+}
+
+static void handle_sync(struct ucall *uc, struct kvm_clock_data *start,
+			struct kvm_clock_data *end)
+{
+	uint64_t obs, exp_lo, exp_hi;
+
+	obs = uc->args[2];
+	exp_lo = start->clock;
+	exp_hi = end->clock;
+
+	assert_flags(start);
+	assert_flags(end);
+
+	TEST_ASSERT(exp_lo <= obs && obs <= exp_hi,
+		    "unexpected kvm-clock value: %"PRIu64" expected range: [%"PRIu64", %"PRIu64"]",
+		    obs, exp_lo, exp_hi);
+
+	pr_info("kvm-clock value: %"PRIu64" expected range [%"PRIu64", %"PRIu64"]\n",
+		obs, exp_lo, exp_hi);
+}
+
+static void handle_abort(struct ucall *uc)
+{
+	TEST_FAIL("%s at %s:%ld", (const char *)uc->args[0],
+		  __FILE__, uc->args[1]);
+}
+
+static void setup_clock(struct kvm_vm *vm, struct test_case *test_case)
+{
+	struct kvm_clock_data data;
+
+	memset(&data, 0, sizeof(data));
+
+	data.clock = test_case->kvmclock_base;
+	if (test_case->realtime_offset) {
+		struct timespec ts;
+		int r;
+
+		data.flags |= KVM_CLOCK_REALTIME;
+		do {
+			r = clock_gettime(CLOCK_REALTIME, &ts);
+			if (!r)
+				break;
+		} while (errno == EINTR);
+
+		TEST_ASSERT(!r, "clock_gettime() failed: %d\n", r);
+
+		data.realtime = ts.tv_sec * NSEC_PER_SEC;
+		data.realtime += ts.tv_nsec;
+		data.realtime += test_case->realtime_offset;
+	}
+
+	vm_ioctl(vm, KVM_SET_CLOCK, &data);
+}
+
+static void enter_guest(struct kvm_vm *vm)
+{
+	struct kvm_clock_data start, end;
+	struct kvm_run *run;
+	struct ucall uc;
+	int i, r;
+
+	run = vcpu_state(vm, VCPU_ID);
+
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+		setup_clock(vm, &test_cases[i]);
+
+		vm_ioctl(vm, KVM_GET_CLOCK, &start);
+
+		r = _vcpu_run(vm, VCPU_ID);
+		vm_ioctl(vm, KVM_GET_CLOCK, &end);
+
+		TEST_ASSERT(!r, "vcpu_run failed: %d\n", r);
+		TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+			    "unexpected exit reason: %u (%s)",
+			    run->exit_reason, exit_reason_str(run->exit_reason));
+
+		switch (get_ucall(vm, VCPU_ID, &uc)) {
+		case UCALL_SYNC:
+			handle_sync(&uc, &start, &end);
+			break;
+		case UCALL_ABORT:
+			handle_abort(&uc);
+			return;
+		default:
+			TEST_ASSERT(0, "unhandled ucall: %ld\n",
+				    get_ucall(vm, VCPU_ID, &uc));
+		}
+	}
+}
+
+#define CLOCKSOURCE_PATH "/sys/devices/system/clocksource/clocksource0/current_clocksource"
+
+static void check_clocksource(void)
+{
+	char *clk_name;
+	struct stat st;
+	FILE *fp;
+
+	fp = fopen(CLOCKSOURCE_PATH, "r");
+	if (!fp) {
+		pr_info("failed to open clocksource file: %d; assuming TSC.\n",
+			errno);
+		return;
+	}
+
+	if (fstat(fileno(fp), &st)) {
+		pr_info("failed to stat clocksource file: %d; assuming TSC.\n",
+			errno);
+		goto out;
+	}
+
+	clk_name = malloc(st.st_size);
+	TEST_ASSERT(clk_name, "failed to allocate buffer to read file\n");
+
+	if (!fgets(clk_name, st.st_size, fp)) {
+		pr_info("failed to read clocksource file: %d; assuming TSC.\n",
+			ferror(fp));
+		goto out;
+	}
+
+	TEST_ASSERT(!strncmp(clk_name, "tsc\n", st.st_size),
+		    "clocksource not supported: %s", clk_name);
+out:
+	fclose(fp);
+}
+
+int main(void)
+{
+	vm_vaddr_t pvti_gva;
+	vm_paddr_t pvti_gpa;
+	struct kvm_vm *vm;
+	int flags;
+
+	flags = kvm_check_cap(KVM_CAP_ADJUST_CLOCK);
+	if (!(flags & KVM_CLOCK_REALTIME)) {
+		print_skip("KVM_CLOCK_REALTIME not supported; flags: %x",
+			   flags);
+		exit(KSFT_SKIP);
+	}
+
+	check_clocksource();
+
+	vm = vm_create_default(VCPU_ID, 0, guest_main);
+
+	pvti_gva = vm_vaddr_alloc(vm, getpagesize(), 0x10000);
+	pvti_gpa = addr_gva2gpa(vm, pvti_gva);
+	vcpu_args_set(vm, VCPU_ID, 2, pvti_gpa, pvti_gva);
+
+	enter_guest(vm);
+	kvm_vm_free(vm);
+}
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 08/21] selftests: KVM: Fix kvm device helper ioctl assertions
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (6 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 07/21] selftests: KVM: Add test for KVM_{GET,SET}_CLOCK Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 09/21] selftests: KVM: Add helpers for vCPU device attributes Oliver Upton
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

The KVM_CREATE_DEVICE and KVM_{GET,SET}_DEVICE_ATTR ioctls are defined
to return a value of zero on success. As such, tighten the assertions in
the helper functions to only pass if the return code is zero.

Suggested-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 10a8ed691c66..0ffc2d39c80d 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1984,7 +1984,7 @@ int kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr)
 {
 	int ret = _kvm_device_check_attr(dev_fd, group, attr);
 
-	TEST_ASSERT(ret >= 0, "KVM_HAS_DEVICE_ATTR failed, rc: %i errno: %i", ret, errno);
+	TEST_ASSERT(!ret, "KVM_HAS_DEVICE_ATTR failed, rc: %i errno: %i", ret, errno);
 	return ret;
 }
 
@@ -2008,7 +2008,7 @@ int kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test)
 	ret = _kvm_create_device(vm, type, test, &fd);
 
 	if (!test) {
-		TEST_ASSERT(ret >= 0,
+		TEST_ASSERT(!ret,
 			    "KVM_CREATE_DEVICE IOCTL failed, rc: %i errno: %i", ret, errno);
 		return fd;
 	}
@@ -2036,7 +2036,7 @@ int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
 {
 	int ret = _kvm_device_access(dev_fd, group, attr, val, write);
 
-	TEST_ASSERT(ret >= 0, "KVM_SET|GET_DEVICE_ATTR IOCTL failed, rc: %i errno: %i", ret, errno);
+	TEST_ASSERT(!ret, "KVM_SET|GET_DEVICE_ATTR IOCTL failed, rc: %i errno: %i", ret, errno);
 	return ret;
 }
 
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 09/21] selftests: KVM: Add helpers for vCPU device attributes
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (7 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 08/21] selftests: KVM: Fix kvm device helper ioctl assertions Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 10/21] selftests: KVM: Introduce system counter offset test Oliver Upton
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

vCPU file descriptors are abstracted away from test code in KVM
selftests, meaning that tests cannot directly access a vCPU's device
attributes. Add helpers that tests can use to get at vCPU device
attributes.
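
As a usage sketch (mirroring the TSC offset test added later in this
series), a test can probe for an attribute and then access it:

	uint64_t offset = 180 * NSEC_PER_SEC;

	/* Probe with the non-asserting variant and skip if unsupported. */
	if (_vcpu_has_device_attr(vm, VCPU_ID, KVM_VCPU_TSC_CTRL,
				  KVM_VCPU_TSC_OFFSET))
		exit(KSFT_SKIP);

	/* Write the attribute; this variant asserts the ioctl succeeds. */
	vcpu_access_device_attr(vm, VCPU_ID, KVM_VCPU_TSC_CTRL,
				KVM_VCPU_TSC_OFFSET, &offset, true);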

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  |  9 +++++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 38 +++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index a8ac5d52e17b..1b3ef5757819 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -240,6 +240,15 @@ int _kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
 int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
 		      void *val, bool write);
 
+int _vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			  uint64_t attr);
+int vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			 uint64_t attr);
+int _vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			  uint64_t attr, void *val, bool write);
+int vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			 uint64_t attr, void *val, bool write);
+
 const char *exit_reason_str(unsigned int exit_reason);
 
 void virt_pgd_alloc(struct kvm_vm *vm);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 0ffc2d39c80d..0fe66ca6139a 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2040,6 +2040,44 @@ int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
 	return ret;
 }
 
+int _vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			  uint64_t attr)
+{
+	struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+
+	TEST_ASSERT(vcpu, "nonexistent vcpu id: %d", vcpuid);
+
+	return _kvm_device_check_attr(vcpu->fd, group, attr);
+}
+
+int vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+				 uint64_t attr)
+{
+	int ret = _vcpu_has_device_attr(vm, vcpuid, group, attr);
+
+	TEST_ASSERT(!ret, "KVM_HAS_DEVICE_ATTR IOCTL failed, rc: %i errno: %i", ret, errno);
+	return ret;
+}
+
+int _vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			     uint64_t attr, void *val, bool write)
+{
+	struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+
+	TEST_ASSERT(vcpu, "nonexistent vcpu id: %d", vcpuid);
+
+	return _kvm_device_access(vcpu->fd, group, attr, val, write);
+}
+
+int vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			    uint64_t attr, void *val, bool write)
+{
+	int ret = _vcpu_access_device_attr(vm, vcpuid, group, attr, val, write);
+
+	TEST_ASSERT(!ret, "KVM_SET|GET_DEVICE_ATTR IOCTL failed, rc: %i errno: %i", ret, errno);
+	return ret;
+}
+
 /*
  * VM Dump
  *
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 10/21] selftests: KVM: Introduce system counter offset test
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (8 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 09/21] selftests: KVM: Add helpers for vCPU device attributes Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 11/21] KVM: arm64: Refactor update_vtimer_cntvoff() Oliver Upton
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Introduce a KVM selftest to verify that userspace manipulation of the
TSC (via the new vCPU attribute) results in the correct behavior within
the guest. For each tested offset, the guest's counter read is bracketed
between two host-side computations of the expected guest counter, and
the test asserts that the observed value falls within that window.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/system_counter_offset_test.c          | 132 ++++++++++++++++++
 3 files changed, 134 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/system_counter_offset_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 958a809c8de4..3d2585f0bffc 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -52,3 +52,4 @@
 /set_memory_region_test
 /steal_time
 /kvm_binary_stats_test
+/system_counter_offset_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 0f94b18b33ce..9f7060c02668 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -85,6 +85,7 @@ TEST_GEN_PROGS_x86_64 += memslot_perf_test
 TEST_GEN_PROGS_x86_64 += set_memory_region_test
 TEST_GEN_PROGS_x86_64 += steal_time
 TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
+TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
diff --git a/tools/testing/selftests/kvm/system_counter_offset_test.c b/tools/testing/selftests/kvm/system_counter_offset_test.c
new file mode 100644
index 000000000000..b337bbbfa41f
--- /dev/null
+++ b/tools/testing/selftests/kvm/system_counter_offset_test.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2021, Google LLC.
+ *
+ * Tests for adjusting the system counter from userspace
+ */
+#include <asm/kvm_para.h>
+#include <stdint.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <time.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID 0
+
+#ifdef __x86_64__
+
+struct test_case {
+	uint64_t tsc_offset;
+};
+
+static struct test_case test_cases[] = {
+	{ 0 },
+	{ 180 * NSEC_PER_SEC },
+	{ -180 * NSEC_PER_SEC },
+};
+
+static void check_preconditions(struct kvm_vm *vm)
+{
+	if (!_vcpu_has_device_attr(vm, VCPU_ID, KVM_VCPU_TSC_CTRL, KVM_VCPU_TSC_OFFSET))
+		return;
+
+	print_skip("KVM_VCPU_TSC_OFFSET not supported; skipping test");
+	exit(KSFT_SKIP);
+}
+
+static void setup_system_counter(struct kvm_vm *vm, struct test_case *test)
+{
+	vcpu_access_device_attr(vm, VCPU_ID, KVM_VCPU_TSC_CTRL,
+				KVM_VCPU_TSC_OFFSET, &test->tsc_offset, true);
+}
+
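+/*
+ * No adjustment in the guest: the TSC offset written via the vCPU
+ * attribute is applied by hardware to the guest's RDTSC.
+ */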
+static uint64_t guest_read_system_counter(struct test_case *test)
+{
+	return rdtsc();
+}
+
+static uint64_t host_read_guest_system_counter(struct test_case *test)
+{
+	return rdtsc() + test->tsc_offset;
+}
+
+#else /* __x86_64__ */
+
+#error test not implemented for this architecture!
+
+#endif
+
+#define GUEST_SYNC_CLOCK(__stage, __val)			\
+		GUEST_SYNC_ARGS(__stage, __val, 0, 0, 0)
+
+static void guest_main(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+		struct test_case *test = &test_cases[i];
+
+		GUEST_SYNC_CLOCK(i, guest_read_system_counter(test));
+	}
+}
+
+static void handle_sync(struct ucall *uc, uint64_t start, uint64_t end)
+{
+	uint64_t obs = uc->args[2];
+
+	TEST_ASSERT(start <= obs && obs <= end,
+		    "unexpected system counter value: %"PRIu64" expected range: [%"PRIu64", %"PRIu64"]",
+		    obs, start, end);
+
+	pr_info("system counter value: %"PRIu64" expected range [%"PRIu64", %"PRIu64"]\n",
+		obs, start, end);
+}
+
+static void handle_abort(struct ucall *uc)
+{
+	TEST_FAIL("%s at %s:%ld", (const char *)uc->args[0],
+		  __FILE__, uc->args[1]);
+}
+
+static void enter_guest(struct kvm_vm *vm)
+{
+	uint64_t start, end;
+	struct ucall uc;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+		struct test_case *test = &test_cases[i];
+
+		setup_system_counter(vm, test);
+		start = host_read_guest_system_counter(test);
+		vcpu_run(vm, VCPU_ID);
+		end = host_read_guest_system_counter(test);
+
+		switch (get_ucall(vm, VCPU_ID, &uc)) {
+		case UCALL_SYNC:
+			handle_sync(&uc, start, end);
+			break;
+		case UCALL_ABORT:
+			handle_abort(&uc);
+			return;
+		default:
+			TEST_ASSERT(0, "unhandled ucall %ld\n",
+				    get_ucall(vm, VCPU_ID, &uc));
+		}
+	}
+}
+
+int main(void)
+{
+	struct kvm_vm *vm;
+
+	vm = vm_create_default(VCPU_ID, 0, guest_main);
+	check_preconditions(vm);
+	ucall_init(vm, NULL);
+
+	enter_guest(vm);
+	kvm_vm_free(vm);
+}
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 11/21] KVM: arm64: Refactor update_vtimer_cntvoff()
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (9 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 10/21] selftests: KVM: Introduce system counter offset test Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  9:23   ` Andrew Jones
  2021-08-04  8:58 ` [PATCH v6 12/21] KVM: arm64: Separate guest/host counter offset values Oliver Upton
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Make the implementation of update_vtimer_cntvoff() generic w.r.t. the
guest timer context and spin the loop off into a new helper for later
use. Require callers of this new helper to hold the kvm lock.

No functional change intended.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/kvm/arch_timer.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 3df67c127489..c0101db75ad4 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -747,22 +747,32 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-/* Make the updates of cntvoff for all vtimer contexts atomic */
-static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
+/* Make offset updates for all timer contexts atomic */
+static void update_timer_offset(struct kvm_vcpu *vcpu,
+				enum kvm_arch_timers timer, u64 offset)
 {
 	int i;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_vcpu *tmp;
 
-	mutex_lock(&kvm->lock);
+	lockdep_assert_held(&kvm->lock);
+
 	kvm_for_each_vcpu(i, tmp, kvm)
-		timer_set_offset(vcpu_vtimer(tmp), cntvoff);
+		timer_set_offset(vcpu_get_timer(tmp, timer), offset);
 
 	/*
 	 * When called from the vcpu create path, the CPU being created is not
 	 * included in the loop above, so we just set it here as well.
 	 */
-	timer_set_offset(vcpu_vtimer(vcpu), cntvoff);
+	timer_set_offset(vcpu_get_timer(vcpu, timer), offset);
+}
+
+static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
+{
+	struct kvm *kvm = vcpu->kvm;
+
+	mutex_lock(&kvm->lock);
+	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff);
 	mutex_unlock(&kvm->lock);
 }
 
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 12/21] KVM: arm64: Separate guest/host counter offset values
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (10 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 11/21] KVM: arm64: Refactor update_vtimer_cntvoff() Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 10:19   ` Andrew Jones
  2021-08-04  8:58 ` [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset Oliver Upton
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

In some instances, a VMM may want to update the guest's counter-timer
offset transparently, meaning that changes to the hardware value are
not reflected in the synthetic register presented to the guest or to
the VMM through the guest's architectural state. Lay the groundwork to
separate guest offset register writes from the hardware values used by
KVM.
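
Concretely, the hardware offset preserves any transparent adjustment
across guest-visible writes: when CNTVOFF_EL2 is rewritten, only the
delta is applied to the value KVM programs into hardware, i.e.

	host_offset' = host_offset + (new_cntvoff - old_cntvoff)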

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/kvm/arch_timer.c  | 48 ++++++++++++++++++++++++++++++++----
 include/kvm/arm_arch_timer.h |  3 +++
 2 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index c0101db75ad4..4c2b763a8849 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -87,6 +87,18 @@ static u64 timer_get_offset(struct arch_timer_context *ctxt)
 	struct kvm_vcpu *vcpu = ctxt->vcpu;
 
 	switch(arch_timer_ctx_index(ctxt)) {
+	case TIMER_VTIMER:
+		return ctxt->host_offset;
+	default:
+		return 0;
+	}
+}
+
+static u64 timer_get_guest_offset(struct arch_timer_context *ctxt)
+{
+	struct kvm_vcpu *vcpu = ctxt->vcpu;
+
+	switch (arch_timer_ctx_index(ctxt)) {
 	case TIMER_VTIMER:
 		return __vcpu_sys_reg(vcpu, CNTVOFF_EL2);
 	default:
@@ -132,13 +144,31 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
 
 	switch(arch_timer_ctx_index(ctxt)) {
 	case TIMER_VTIMER:
-		__vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
+		ctxt->host_offset = offset;
 		break;
 	default:
 		WARN(offset, "timer %ld\n", arch_timer_ctx_index(ctxt));
 	}
 }
 
+static void timer_set_guest_offset(struct arch_timer_context *ctxt, u64 offset)
+{
+	struct kvm_vcpu *vcpu = ctxt->vcpu;
+
+	switch (arch_timer_ctx_index(ctxt)) {
+	case TIMER_VTIMER: {
+		u64 host_offset = timer_get_offset(ctxt);
+
+		host_offset += offset - __vcpu_sys_reg(vcpu, CNTVOFF_EL2);
+		__vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
+		timer_set_offset(ctxt, host_offset);
+		break;
+	}
+	default:
+		WARN_ONCE(offset, "timer %ld\n", arch_timer_ctx_index(ctxt));
+	}
+}
+
 u64 kvm_phys_timer_read(void)
 {
 	return timecounter->cc->read(timecounter->cc);
@@ -749,7 +779,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
 
 /* Make offset updates for all timer contexts atomic */
 static void update_timer_offset(struct kvm_vcpu *vcpu,
-				enum kvm_arch_timers timer, u64 offset)
+				enum kvm_arch_timers timer, u64 offset,
+				bool guest_visible)
 {
 	int i;
 	struct kvm *kvm = vcpu->kvm;
@@ -758,13 +789,20 @@ static void update_timer_offset(struct kvm_vcpu *vcpu,
 	lockdep_assert_held(&kvm->lock);
 
 	kvm_for_each_vcpu(i, tmp, kvm)
-		timer_set_offset(vcpu_get_timer(tmp, timer), offset);
+		if (guest_visible)
+			timer_set_guest_offset(vcpu_get_timer(tmp, timer),
+					       offset);
+		else
+			timer_set_offset(vcpu_get_timer(tmp, timer), offset);
 
 	/*
 	 * When called from the vcpu create path, the CPU being created is not
 	 * included in the loop above, so we just set it here as well.
 	 */
-	timer_set_offset(vcpu_get_timer(vcpu, timer), offset);
+	if (guest_visible)
+		timer_set_guest_offset(vcpu_get_timer(vcpu, timer), offset);
+	else
+		timer_set_offset(vcpu_get_timer(vcpu, timer), offset);
 }
 
 static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
@@ -772,7 +810,7 @@ static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
 	struct kvm *kvm = vcpu->kvm;
 
 	mutex_lock(&kvm->lock);
-	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff);
+	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff, true);
 	mutex_unlock(&kvm->lock);
 }
 
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 51c19381108c..9d65d4a29f81 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -42,6 +42,9 @@ struct arch_timer_context {
 	/* Duplicated state from arch_timer.c for convenience */
 	u32				host_timer_irq;
 	u32				host_timer_irq_flags;
+
+	/* offset relative to the host's physical counter-timer */
+	u64				host_offset;
 };
 
 struct timer_map {
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (11 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 12/21] KVM: arm64: Separate guest/host counter offset values Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 10:20   ` Andrew Jones
  2021-08-10  9:35   ` Marc Zyngier
  2021-08-04  8:58 ` [PATCH v6 14/21] selftests: KVM: Add helper to check for register presence Oliver Upton
                   ` (9 subsequent siblings)
  22 siblings, 2 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Allow userspace to access the guest's virtual counter-timer offset
through the ONE_REG interface. The value read or written is defined to
be an offset from the guest's physical counter-timer. Add some
documentation to clarify how a VMM should use this and the existing
CNTVCT_EL0.
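
As a sketch, a VMM saves and restores the offset with the usual ONE_REG
calls (vcpu_fd being the vCPU file descriptor; error handling omitted):

	__u64 offset;
	struct kvm_one_reg reg = {
		.id   = KVM_REG_ARM_TIMER_OFFSET,
		.addr = (__u64)&offset,
	};

	ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);	/* save on the source */
	/* ... adjust offset across the blackout window if desired ... */
	ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);	/* restore on the target */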

Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/api.rst    | 10 ++++++++++
 arch/arm64/include/uapi/asm/kvm.h |  1 +
 arch/arm64/kvm/arch_timer.c       | 11 +++++++++++
 arch/arm64/kvm/guest.c            |  6 +++++-
 include/kvm/arm_arch_timer.h      |  1 +
 5 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 8d4a3471ad9e..28a65dc89985 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2487,6 +2487,16 @@ arm64 system registers have the following id bit patterns::
      derived from the register encoding for CNTV_CVAL_EL0.  As this is
      API, it must remain this way.
 
+.. warning::
+
+     The value of KVM_REG_ARM_TIMER_OFFSET is defined as an offset from
+     the guest's view of the physical counter-timer.
+
+     Userspace should use either KVM_REG_ARM_TIMER_OFFSET or
+     KVM_REG_ARM_TIMER_CVAL to pause and resume a guest's virtual
+     counter-timer. Mixed use of these registers could result in an
+     unpredictable guest counter value.
+
 arm64 firmware pseudo-registers have the following bit pattern::
 
   0x6030 0000 0014 <regno:16>
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..949a31bc10f0 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -255,6 +255,7 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_REG_ARM_TIMER_CTL		ARM64_SYS_REG(3, 3, 14, 3, 1)
 #define KVM_REG_ARM_TIMER_CVAL		ARM64_SYS_REG(3, 3, 14, 0, 2)
 #define KVM_REG_ARM_TIMER_CNT		ARM64_SYS_REG(3, 3, 14, 3, 2)
+#define KVM_REG_ARM_TIMER_OFFSET	ARM64_SYS_REG(3, 4, 14, 0, 3)
 
 /* KVM-as-firmware specific pseudo-registers */
 #define KVM_REG_ARM_FW			(0x0014 << KVM_REG_ARM_COPROC_SHIFT)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 4c2b763a8849..a8815b09da3e 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -868,6 +868,10 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
 		timer = vcpu_vtimer(vcpu);
 		kvm_arm_timer_write(vcpu, timer, TIMER_REG_CVAL, value);
 		break;
+	case KVM_REG_ARM_TIMER_OFFSET:
+		timer = vcpu_vtimer(vcpu);
+		update_vtimer_cntvoff(vcpu, value);
+		break;
 	case KVM_REG_ARM_PTIMER_CTL:
 		timer = vcpu_ptimer(vcpu);
 		kvm_arm_timer_write(vcpu, timer, TIMER_REG_CTL, value);
@@ -912,6 +916,9 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 	case KVM_REG_ARM_TIMER_CVAL:
 		return kvm_arm_timer_read(vcpu,
 					  vcpu_vtimer(vcpu), TIMER_REG_CVAL);
+	case KVM_REG_ARM_TIMER_OFFSET:
+		return kvm_arm_timer_read(vcpu,
+					  vcpu_vtimer(vcpu), TIMER_REG_OFFSET);
 	case KVM_REG_ARM_PTIMER_CTL:
 		return kvm_arm_timer_read(vcpu,
 					  vcpu_ptimer(vcpu), TIMER_REG_CTL);
@@ -949,6 +956,10 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
 		val = kvm_phys_timer_read() - timer_get_offset(timer);
 		break;
 
+	case TIMER_REG_OFFSET:
+		val = timer_get_offset(timer);
+		break;
+
 	default:
 		BUG();
 	}
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 1dfb83578277..17fc06e2b422 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -591,7 +591,7 @@ static unsigned long num_core_regs(const struct kvm_vcpu *vcpu)
  * ARM64 versions of the TIMER registers, always available on arm64
  */
 
-#define NUM_TIMER_REGS 3
+#define NUM_TIMER_REGS 4
 
 static bool is_timer_reg(u64 index)
 {
@@ -599,6 +599,7 @@ static bool is_timer_reg(u64 index)
 	case KVM_REG_ARM_TIMER_CTL:
 	case KVM_REG_ARM_TIMER_CNT:
 	case KVM_REG_ARM_TIMER_CVAL:
+	case KVM_REG_ARM_TIMER_OFFSET:
 		return true;
 	}
 	return false;
@@ -614,6 +615,9 @@ static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	uindices++;
 	if (put_user(KVM_REG_ARM_TIMER_CVAL, uindices))
 		return -EFAULT;
+	uindices++;
+	if (put_user(KVM_REG_ARM_TIMER_OFFSET, uindices))
+		return -EFAULT;
 
 	return 0;
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 9d65d4a29f81..615f9314f6a5 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -21,6 +21,7 @@ enum kvm_arch_timer_regs {
 	TIMER_REG_CVAL,
 	TIMER_REG_TVAL,
 	TIMER_REG_CTL,
+	TIMER_REG_OFFSET,
 };
 
 struct arch_timer_context {
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 14/21] selftests: KVM: Add helper to check for register presence
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (12 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  9:14   ` Andrew Jones
  2021-08-04  8:58 ` [PATCH v6 15/21] selftests: KVM: Add support for aarch64 to system_counter_offset_test Oliver Upton
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

The KVM_GET_REG_LIST vCPU ioctl returns a list of supported registers
for a given vCPU. Add a helper to check if a register exists in the list
of supported registers.
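
The following patch uses this to probe for the new virtual counter-timer
offset register before running the test, e.g.:

	if (!vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET))
		exit(KSFT_SKIP);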

Signed-off-by: Oliver Upton <oupton@google.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  |  2 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 1b3ef5757819..077082dd2ca7 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -215,6 +215,8 @@ void vcpu_fpu_get(struct kvm_vm *vm, uint32_t vcpuid,
 		  struct kvm_fpu *fpu);
 void vcpu_fpu_set(struct kvm_vm *vm, uint32_t vcpuid,
 		  struct kvm_fpu *fpu);
+
+bool vcpu_has_reg(struct kvm_vm *vm, uint32_t vcpuid, uint64_t reg_id);
 void vcpu_get_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg);
 void vcpu_set_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg);
 #ifdef __KVM_HAVE_VCPU_EVENTS
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 0fe66ca6139a..a5801d4ed37d 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1823,6 +1823,25 @@ void vcpu_fpu_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_fpu *fpu)
 		    ret, errno, strerror(errno));
 }
 
+bool vcpu_has_reg(struct kvm_vm *vm, uint32_t vcpuid, uint64_t reg_id)
+{
+	struct kvm_reg_list *list;
+	bool ret = false;
+	uint64_t i;
+
+	list = vcpu_get_reg_list(vm, vcpuid);
+
+	for (i = 0; i < list->n; i++) {
+		if (list->reg[i] == reg_id) {
+			ret = true;
+			break;
+		}
+	}
+
+	free(list);
+	return ret;
+}
+
 void vcpu_get_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg)
 {
 	int ret;
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 15/21] selftests: KVM: Add support for aarch64 to system_counter_offset_test
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (13 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 14/21] selftests: KVM: Add helper to check for register presence Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04  8:58 ` [PATCH v6 16/21] arm64: cpufeature: Enumerate support for Enhanced Counter Virtualization Oliver Upton
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

KVM/arm64 now allows userspace to adjust the guest virtual counter-timer
via a vCPU register. Test that changes to the virtual counter-timer
offset result in the correct view being presented to the guest.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../selftests/kvm/include/aarch64/processor.h | 12 ++++
 .../kvm/system_counter_offset_test.c          | 56 ++++++++++++++++++-
 3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 9f7060c02668..fab42e7c23ee 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -98,6 +98,7 @@ TEST_GEN_PROGS_aarch64 += kvm_page_table_test
 TEST_GEN_PROGS_aarch64 += set_memory_region_test
 TEST_GEN_PROGS_aarch64 += steal_time
 TEST_GEN_PROGS_aarch64 += kvm_binary_stats_test
+TEST_GEN_PROGS_aarch64 += system_counter_offset_test
 
 TEST_GEN_PROGS_s390x = s390x/memop
 TEST_GEN_PROGS_s390x += s390x/resets
diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 27dc5c2e56b9..3168cdbae6ee 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -129,4 +129,16 @@ void vm_install_sync_handler(struct kvm_vm *vm,
 
 #define isb()	asm volatile("isb" : : : "memory")
 
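+/*
+ * Read CNTVCT_EL0 with an ISB on either side so the counter read cannot
+ * be reordered relative to the surrounding measurements.
+ */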
+static inline uint64_t read_cntvct_ordered(void)
+{
+	uint64_t r;
+
+	__asm__ __volatile__("isb\n\t"
+			     "mrs %0, cntvct_el0\n\t"
+			     "isb\n\t"
+			     : "=r"(r));
+
+	return r;
+}
+
 #endif /* SELFTEST_KVM_PROCESSOR_H */
diff --git a/tools/testing/selftests/kvm/system_counter_offset_test.c b/tools/testing/selftests/kvm/system_counter_offset_test.c
index b337bbbfa41f..ac933db83d03 100644
--- a/tools/testing/selftests/kvm/system_counter_offset_test.c
+++ b/tools/testing/selftests/kvm/system_counter_offset_test.c
@@ -53,7 +53,61 @@ static uint64_t host_read_guest_system_counter(struct test_case *test)
 	return rdtsc() + test->tsc_offset;
 }
 
-#else /* __x86_64__ */
+#elif __aarch64__ /* __x86_64__ */
+
+enum arch_counter {
+	VIRTUAL,
+};
+
+struct test_case {
+	enum arch_counter counter;
+	uint64_t offset;
+};
+
+static struct test_case test_cases[] = {
+	{ .counter = VIRTUAL, .offset = 0 },
+	{ .counter = VIRTUAL, .offset = 180 * NSEC_PER_SEC },
+	{ .counter = VIRTUAL, .offset = -180 * NSEC_PER_SEC },
+};
+
+static void check_preconditions(struct kvm_vm *vm)
+{
+	if (vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET))
+		return;
+
+	print_skip("KVM_REG_ARM_TIMER_OFFSET not supported; skipping test");
+	exit(KSFT_SKIP);
+}
+
+static void setup_system_counter(struct kvm_vm *vm, struct test_case *test)
+{
+	struct kvm_one_reg reg = {
+		.id = KVM_REG_ARM_TIMER_OFFSET,
+		.addr = (__u64)&test->offset,
+	};
+
+	vcpu_set_reg(vm, VCPU_ID, &reg);
+}
+
+static uint64_t guest_read_system_counter(struct test_case *test)
+{
+	switch (test->counter) {
+	case VIRTUAL:
+		return read_cntvct_ordered();
+	default:
+		GUEST_ASSERT(0);
+	}
+
+	/* unreachable */
+	return 0;
+}
+
+static uint64_t host_read_guest_system_counter(struct test_case *test)
+{
+	return read_cntvct_ordered() - test->offset;
+}
+
+#else /* __aarch64__ */
 
 #error test not implemented for this architecture!
 
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 16/21] arm64: cpufeature: Enumerate support for Enhanced Counter Virtualization
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (14 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 15/21] selftests: KVM: Add support for aarch64 to system_counter_offset_test Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-10  9:38   ` Marc Zyngier
  2021-08-04  8:58 ` [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset Oliver Upton
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Introduce a new cpucap to indicate whether the system supports full
enhanced counter virtualization (i.e. ID_AA64MMFR0_EL1.ECV==0x2), which
provides the CNTPOFF_EL2 physical counter offset register.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/include/asm/sysreg.h |  2 ++
 arch/arm64/kernel/cpufeature.c  | 10 ++++++++++
 arch/arm64/tools/cpucaps        |  1 +
 3 files changed, 13 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7b9c3acba684..4dfc44066dfb 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -847,6 +847,8 @@
 #define ID_AA64MMFR0_ASID_SHIFT		4
 #define ID_AA64MMFR0_PARANGE_SHIFT	0
 
+#define ID_AA64MMFR0_ECV_VIRT		0x1
+#define ID_AA64MMFR0_ECV_PHYS		0x2
 #define ID_AA64MMFR0_TGRAN4_NI		0xf
 #define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
 #define ID_AA64MMFR0_TGRAN64_NI		0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0ead8bfedf20..94c349e179d3 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2301,6 +2301,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		.min_field_value = 1,
 	},
+	{
+		.desc = "Enhanced Counter Virtualization (Physical)",
+		.capability = ARM64_ECV,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.sys_reg = SYS_ID_AA64MMFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR0_ECV_SHIFT,
+		.matches = has_cpuid_feature,
+		.min_field_value = ID_AA64MMFR0_ECV_PHYS,
+	},
 	{},
 };
 
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 49305c2e6dfd..d819ea614da5 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -3,6 +3,7 @@
 # Internal CPU capabilities constants, keep this list sorted
 
 BTI
+ECV
 # Unreliable: use system_supports_32bit_el0() instead.
 HAS_32BIT_EL0_DO_NOT_USE
 HAS_32BIT_EL1
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (15 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 16/21] arm64: cpufeature: Enumerate support for Enhanced Counter Virtualization Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 10:17   ` Andrew Jones
  2021-08-10 10:56   ` Marc Zyngier
  2021-08-04  8:58 ` [PATCH v6 18/21] KVM: arm64: Configure timer traps in vcpu_load() for VHE Oliver Upton
                   ` (5 subsequent siblings)
  22 siblings, 2 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Presently, KVM provides no facilities for correctly migrating a guest
that depends on the physical counter-timer. While most guests (barring
NV, of course) should not depend on the physical counter-timer, an
operator may wish to provide a consistent view of the physical
counter-timer across migrations.

Provide userspace with a new vCPU attribute to modify the guest
counter-timer offset. Unlike KVM_REG_ARM_TIMER_OFFSET, this attribute is
hidden from the guest's architectural state. The value offsets *both*
the virtual and physical counter-timer views for the guest. Only support
this attribute on ECV systems as ECV is required for hardware offsetting
of the physical counter-timer.
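
From the VMM's perspective the attribute is driven through the existing
vCPU device-attr ioctls; a minimal sketch (vcpu_fd being the vCPU file
descriptor, error handling omitted):

	__u64 offset = 180UL * NSEC_PER_SEC;	/* example value */
	struct kvm_device_attr attr = {
		.group = KVM_ARM_VCPU_TIMER_CTRL,
		.attr  = KVM_ARM_VCPU_TIMER_OFFSET,
		.addr  = (__u64)&offset,
	};

	if (!ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &attr))
		ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);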

Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/devices/vcpu.rst |  28 ++++++
 arch/arm64/include/asm/kvm_asm.h        |   2 +
 arch/arm64/include/asm/sysreg.h         |   2 +
 arch/arm64/include/uapi/asm/kvm.h       |   1 +
 arch/arm64/kvm/arch_timer.c             | 122 +++++++++++++++++++++++-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |   6 ++
 arch/arm64/kvm/hyp/nvhe/timer-sr.c      |   5 +
 arch/arm64/kvm/hyp/vhe/timer-sr.c       |   5 +
 include/clocksource/arm_arch_timer.h    |   1 +
 9 files changed, 169 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 3b399d727c11..3ba35b9d9d03 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -139,6 +139,34 @@ configured values on other VCPUs.  Userspace should configure the interrupt
 numbers on at least one VCPU after creating all VCPUs and before running any
 VCPUs.
 
+2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_OFFSET
+-----------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address for the timer offset is a
+             pointer to a __u64
+
+Returns:
+
+	 ======= ==================================
+	 -EFAULT Error reading/writing the provided
+		 parameter address
+	 -ENXIO  Timer offsetting not implemented
+	 ======= ==================================
+
+Specifies the guest's counter-timer offset from the host's virtual counter.
+The guest's physical counter value is then derived by the following
+equation:
+
+  guest_cntpct = host_cntvct - KVM_ARM_VCPU_TIMER_OFFSET
+
+The guest's virtual counter value is derived by the following equation:
+
+  guest_cntvct = host_cntvct - KVM_REG_ARM_TIMER_OFFSET
+			- KVM_ARM_VCPU_TIMER_OFFSET
+
+KVM does not allow the use of varying offset values for different vCPUs;
+the last written offset value will be broadcast to all vCPUs in a VM.
+
 3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
 ==================================
 
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 9f0bf2109be7..ab1c8fdb0177 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -65,6 +65,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize		19
 #define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp			20
 #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc			21
+#define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntpoff		22
 
 #ifndef __ASSEMBLY__
 
@@ -200,6 +201,7 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
 extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu);
 
 extern void __kvm_timer_set_cntvoff(u64 cntvoff);
+extern void __kvm_timer_set_cntpoff(u64 cntpoff);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 4dfc44066dfb..c34672aa65b9 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -586,6 +586,8 @@
 #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
 #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
 
+#define SYS_CNTPOFF_EL2			sys_reg(3, 4, 14, 0, 6)
+
 /* VHE encodings for architectural EL0/1 system registers */
 #define SYS_SCTLR_EL12			sys_reg(3, 5, 1, 0, 0)
 #define SYS_CPACR_EL12			sys_reg(3, 5, 1, 0, 2)
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 949a31bc10f0..15150f8224a1 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -366,6 +366,7 @@ struct kvm_arm_copy_mte_tags {
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
+#define   KVM_ARM_VCPU_TIMER_OFFSET		2
 #define KVM_ARM_VCPU_PVTIME_CTRL	2
 #define   KVM_ARM_VCPU_PVTIME_IPA	0
 
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index a8815b09da3e..f15058612994 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -85,11 +85,15 @@ u64 timer_get_cval(struct arch_timer_context *ctxt)
 static u64 timer_get_offset(struct arch_timer_context *ctxt)
 {
 	struct kvm_vcpu *vcpu = ctxt->vcpu;
+	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
 
 	switch(arch_timer_ctx_index(ctxt)) {
 	case TIMER_VTIMER:
+	case TIMER_PTIMER:
 		return ctxt->host_offset;
 	default:
+		WARN_ONCE(1, "unrecognized timer %ld\n",
+			  arch_timer_ctx_index(ctxt));
 		return 0;
 	}
 }
@@ -144,6 +148,7 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
 
 	switch(arch_timer_ctx_index(ctxt)) {
 	case TIMER_VTIMER:
+	case TIMER_PTIMER:
 		ctxt->host_offset = offset;
 		break;
 	default:
@@ -572,6 +577,11 @@ static void set_cntvoff(u64 cntvoff)
 	kvm_call_hyp(__kvm_timer_set_cntvoff, cntvoff);
 }
 
+static void set_cntpoff(u64 cntpoff)
+{
+	kvm_call_hyp(__kvm_timer_set_cntpoff, cntpoff);
+}
+
 static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
 {
 	int r;
@@ -647,6 +657,8 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
 	}
 
 	set_cntvoff(timer_get_offset(map.direct_vtimer));
+	if (cpus_have_const_cap(ARM64_ECV))
+		set_cntpoff(timer_get_offset(vcpu_ptimer(vcpu)));
 
 	kvm_timer_unblocking(vcpu);
 
@@ -814,6 +826,22 @@ static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
 	mutex_unlock(&kvm->lock);
 }
 
+static void update_ptimer_cntpoff(struct kvm_vcpu *vcpu, u64 cntpoff)
+{
+	struct kvm *kvm = vcpu->kvm;
+	u64 cntvoff;
+
+	mutex_lock(&kvm->lock);
+
+	/* adjustments to the physical offset also affect vtimer */
+	cntvoff = timer_get_offset(vcpu_vtimer(vcpu));
+	cntvoff += cntpoff - timer_get_offset(vcpu_ptimer(vcpu));
+
+	update_timer_offset(vcpu, TIMER_PTIMER, cntpoff, false);
+	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff, false);
+	mutex_unlock(&kvm->lock);
+}
+
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
@@ -932,6 +960,29 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
 	return (u64)-1;
 }
 
+/**
+ * kvm_arm_timer_read_offset - returns the guest value of CNTVOFF_EL2.
+ * @vcpu: the vcpu pointer
+ *
+ * Computes the guest value of CNTVOFF_EL2 by subtracting the physical
+ * counter offset. Note that KVM defines CNTVOFF_EL2 as the offset from the
+ * guest's physical counter-timer, not the host's.
+ *
+ * Returns: the guest value for CNTVOFF_EL2
+ */
+static u64 kvm_arm_timer_read_offset(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+	u64 offset;
+
+	mutex_lock(&kvm->lock);
+	offset = timer_get_offset(vcpu_vtimer(vcpu)) -
+			timer_get_offset(vcpu_ptimer(vcpu));
+	mutex_unlock(&kvm->lock);
+
+	return offset;
+}
+
 static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
 			      struct arch_timer_context *timer,
 			      enum kvm_arch_timer_regs treg)
@@ -957,7 +1008,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
 		break;
 
 	case TIMER_REG_OFFSET:
-		val = timer_get_offset(timer);
+		val = kvm_arm_timer_read_offset(vcpu);
 		break;
 
 	default:
@@ -1350,6 +1401,9 @@ void kvm_timer_init_vhe(void)
 	val = read_sysreg(cnthctl_el2);
 	val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
 	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
+
+	if (cpus_have_const_cap(ARM64_ECV))
+		val |= CNTHCTL_ECV;
 	write_sysreg(val, cnthctl_el2);
 }
 
@@ -1364,7 +1418,8 @@ static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
 	}
 }
 
-int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
+static int kvm_arm_timer_set_attr_irq(struct kvm_vcpu *vcpu,
+				      struct kvm_device_attr *attr)
 {
 	int __user *uaddr = (int __user *)(long)attr->addr;
 	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
@@ -1397,7 +1452,37 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 	return 0;
 }
 
-int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
+static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
+					 struct kvm_device_attr *attr)
+{
+	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
+	u64 offset;
+
+	if (!cpus_have_const_cap(ARM64_ECV))
+		return -ENXIO;
+
+	if (get_user(offset, uaddr))
+		return -EFAULT;
+
+	update_ptimer_cntpoff(vcpu, offset);
+	return 0;
+}
+
+int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
+{
+	switch (attr->attr) {
+	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
+	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
+		return kvm_arm_timer_set_attr_irq(vcpu, attr);
+	case KVM_ARM_VCPU_TIMER_OFFSET:
+		return kvm_arm_timer_set_attr_offset(vcpu, attr);
+	default:
+		return -ENXIO;
+	}
+}
+
+static int kvm_arm_timer_get_attr_irq(struct kvm_vcpu *vcpu,
+				      struct kvm_device_attr *attr)
 {
 	int __user *uaddr = (int __user *)(long)attr->addr;
 	struct arch_timer_context *timer;
@@ -1418,12 +1503,43 @@ int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 	return put_user(irq, uaddr);
 }
 
+static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
+					 struct kvm_device_attr *attr)
+{
+	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
+	u64 offset;
+
+	if (!cpus_have_const_cap(ARM64_ECV))
+		return -ENXIO;
+
+	offset = timer_get_offset(vcpu_ptimer(vcpu));
+	return put_user(offset, uaddr);
+}
+
+int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu,
+			   struct kvm_device_attr *attr)
+{
+	switch (attr->attr) {
+	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
+	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
+		return kvm_arm_timer_get_attr_irq(vcpu, attr);
+	case KVM_ARM_VCPU_TIMER_OFFSET:
+		return kvm_arm_timer_get_attr_offset(vcpu, attr);
+	default:
+		return -ENXIO;
+	}
+}
+
 int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 {
 	switch (attr->attr) {
 	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
 	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
 		return 0;
+	case KVM_ARM_VCPU_TIMER_OFFSET:
+		if (cpus_have_const_cap(ARM64_ECV))
+			return 0;
+		break;
 	}
 
 	return -ENXIO;
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 1632f001f4ed..cfa923df3af6 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -68,6 +68,11 @@ static void handle___kvm_timer_set_cntvoff(struct kvm_cpu_context *host_ctxt)
 	__kvm_timer_set_cntvoff(cpu_reg(host_ctxt, 1));
 }
 
+static void handle___kvm_timer_set_cntpoff(struct kvm_cpu_context *host_ctxt)
+{
+	__kvm_timer_set_cntpoff(cpu_reg(host_ctxt, 1));
+}
+
 static void handle___kvm_enable_ssbs(struct kvm_cpu_context *host_ctxt)
 {
 	u64 tmp;
@@ -197,6 +202,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_create_private_mapping),
 	HANDLE_FUNC(__pkvm_prot_finalize),
 	HANDLE_FUNC(__pkvm_mark_hyp),
+	HANDLE_FUNC(__kvm_timer_set_cntpoff),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
index 9072e71693ba..5b8b4cd02506 100644
--- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
@@ -15,6 +15,11 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
 	write_sysreg(cntvoff, cntvoff_el2);
 }
 
+void __kvm_timer_set_cntpoff(u64 cntpoff)
+{
+	write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
+}
+
 /*
  * Should only be called on non-VHE systems.
  * VHE systems use EL2 timers and configure EL1 timers in kvm_timer_init_vhe().
diff --git a/arch/arm64/kvm/hyp/vhe/timer-sr.c b/arch/arm64/kvm/hyp/vhe/timer-sr.c
index 4cda674a8be6..231e16a071a5 100644
--- a/arch/arm64/kvm/hyp/vhe/timer-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/timer-sr.c
@@ -10,3 +10,8 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
 {
 	write_sysreg(cntvoff, cntvoff_el2);
 }
+
+void __kvm_timer_set_cntpoff(u64 cntpoff)
+{
+	write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
+}
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 73c7139c866f..7252ffa3d675 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -21,6 +21,7 @@
 #define CNTHCTL_EVNTEN			(1 << 2)
 #define CNTHCTL_EVNTDIR			(1 << 3)
 #define CNTHCTL_EVNTI			(0xF << 4)
+#define CNTHCTL_ECV			(1 << 12)
 
 enum arch_timer_reg {
 	ARCH_TIMER_REG_CTRL,
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 18/21] KVM: arm64: Configure timer traps in vcpu_load() for VHE
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (16 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 10:25   ` Andrew Jones
  2021-08-04  8:58 ` [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems Oliver Upton
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

In preparation for emulated physical counter-timer offsetting, configure
traps on every vcpu_load() for VHE systems. As before, these trap
settings do not affect host userspace, and are only active for the
guest.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/kvm/arch_timer.c  | 10 +++++++---
 arch/arm64/kvm/arm.c         |  4 +---
 include/kvm/arm_arch_timer.h |  2 --
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index f15058612994..9ead94aa867d 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -51,6 +51,7 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
 static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
 			      struct arch_timer_context *timer,
 			      enum kvm_arch_timer_regs treg);
+static void kvm_timer_enable_traps_vhe(void);
 
 u32 timer_get_ctl(struct arch_timer_context *ctxt)
 {
@@ -668,6 +669,9 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
 
 	if (map.emul_ptimer)
 		timer_emulate(map.emul_ptimer);
+
+	if (has_vhe())
+		kvm_timer_enable_traps_vhe();
 }
 
 bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
@@ -1383,12 +1387,12 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 }
 
 /*
- * On VHE system, we only need to configure the EL2 timer trap register once,
- * not for every world switch.
+ * On a VHE system, we only need to configure the EL2 timer trap register on
+ * vcpu_load(), not on every world switch into the guest.
  * The host kernel runs at EL2 with HCR_EL2.TGE == 1,
  * and this makes those bits have no effect for the host kernel execution.
  */
-void kvm_timer_init_vhe(void)
+static void kvm_timer_enable_traps_vhe(void)
 {
 	/* When HCR_EL2.E2H ==1, EL1PCEN and EL1PCTEN are shifted by 10 */
 	u32 cnthctl_shift = 10;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e9a2b8f27792..47ea1e1ba80b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1558,9 +1558,7 @@ static void cpu_hyp_reinit(void)
 
 	cpu_hyp_reset();
 
-	if (is_kernel_in_hyp_mode())
-		kvm_timer_init_vhe();
-	else
+	if (!is_kernel_in_hyp_mode())
 		cpu_init_hyp_mode();
 
 	cpu_set_hyp_vector();
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 615f9314f6a5..254653b42da0 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -87,8 +87,6 @@ u64 kvm_phys_timer_read(void);
 void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu);
 void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
 
-void kvm_timer_init_vhe(void);
-
 bool kvm_arch_timer_get_input_level(int vintid);
 
 #define vcpu_timer(v)	(&(v)->arch.timer_cpu)
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (17 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 18/21] KVM: arm64: Configure timer traps in vcpu_load() for VHE Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 11:05   ` Andrew Jones
  2021-08-10 11:27   ` Marc Zyngier
  2021-08-04  8:58 ` [PATCH v6 20/21] selftests: KVM: Test physical counter offsetting Oliver Upton
                   ` (3 subsequent siblings)
  22 siblings, 2 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Unfortunately, ECV hasn't yet arrived in any tangible hardware. At the
same time, controlling the guest view of the physical counter-timer is
useful. Support guest counter-timer offsetting on non-ECV systems by
trapping guest accesses to the physical counter-timer. Emulate reads of
the physical counter in the fast exit path.
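
As a minimal sketch (mirroring the handler added below), a trapped guest
read of CNTPCT_EL0 is satisfied by subtracting the configured offset
from the hardware counter:

	/* guest view of the physical counter under emulation */
	guest_cntpct = __arch_counter_get_cntpct() - vcpu_ptimer(vcpu)->host_offset;

On ECV hardware the CPU applies the same subtraction itself via
CNTPOFF_EL2, so no trap is required.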

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/arm64/include/asm/sysreg.h         |  1 +
 arch/arm64/kvm/arch_timer.c             | 53 +++++++++++++++----------
 arch/arm64/kvm/hyp/include/hyp/switch.h | 29 ++++++++++++++
 arch/arm64/kvm/hyp/nvhe/timer-sr.c      | 11 ++++-
 4 files changed, 70 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index c34672aa65b9..e49790ae5da4 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -505,6 +505,7 @@
 #define SYS_AMEVCNTR0_MEM_STALL		SYS_AMEVCNTR0_EL0(3)
 
 #define SYS_CNTFRQ_EL0			sys_reg(3, 3, 14, 0, 0)
+#define SYS_CNTPCT_EL0			sys_reg(3, 3, 14, 0, 1)
 
 #define SYS_CNTP_TVAL_EL0		sys_reg(3, 3, 14, 2, 0)
 #define SYS_CNTP_CTL_EL0		sys_reg(3, 3, 14, 2, 1)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 9ead94aa867d..b7cb63acf2a0 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -51,7 +51,7 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
 static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
 			      struct arch_timer_context *timer,
 			      enum kvm_arch_timer_regs treg);
-static void kvm_timer_enable_traps_vhe(void);
+static void kvm_timer_enable_traps_vhe(struct kvm_vcpu *vcpu);
 
 u32 timer_get_ctl(struct arch_timer_context *ctxt)
 {
@@ -175,6 +175,12 @@ static void timer_set_guest_offset(struct arch_timer_context *ctxt, u64 offset)
 	}
 }
 
+static bool ptimer_emulation_required(struct kvm_vcpu *vcpu)
+{
+	return timer_get_offset(vcpu_ptimer(vcpu)) &&
+			!cpus_have_const_cap(ARM64_ECV);
+}
+
 u64 kvm_phys_timer_read(void)
 {
 	return timecounter->cc->read(timecounter->cc);
@@ -184,8 +190,13 @@ static void get_timer_map(struct kvm_vcpu *vcpu, struct timer_map *map)
 {
 	if (has_vhe()) {
 		map->direct_vtimer = vcpu_vtimer(vcpu);
-		map->direct_ptimer = vcpu_ptimer(vcpu);
-		map->emul_ptimer = NULL;
+		if (!ptimer_emulation_required(vcpu)) {
+			map->direct_ptimer = vcpu_ptimer(vcpu);
+			map->emul_ptimer = NULL;
+		} else {
+			map->direct_ptimer = NULL;
+			map->emul_ptimer = vcpu_ptimer(vcpu);
+		}
 	} else {
 		map->direct_vtimer = vcpu_vtimer(vcpu);
 		map->direct_ptimer = NULL;
@@ -671,7 +682,7 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
 		timer_emulate(map.emul_ptimer);
 
 	if (has_vhe())
-		kvm_timer_enable_traps_vhe();
+		kvm_timer_enable_traps_vhe(vcpu);
 }
 
 bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
@@ -1392,22 +1403,29 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
  * The host kernel runs at EL2 with HCR_EL2.TGE == 1,
  * and this makes those bits have no effect for the host kernel execution.
  */
-static void kvm_timer_enable_traps_vhe(void)
+static void kvm_timer_enable_traps_vhe(struct kvm_vcpu *vcpu)
 {
 	/* When HCR_EL2.E2H ==1, EL1PCEN and EL1PCTEN are shifted by 10 */
 	u32 cnthctl_shift = 10;
-	u64 val;
+	u64 val, mask;
+
+	mask = CNTHCTL_EL1PCEN << cnthctl_shift;
+	mask |= CNTHCTL_EL1PCTEN << cnthctl_shift;
 
-	/*
-	 * VHE systems allow the guest direct access to the EL1 physical
-	 * timer/counter.
-	 */
 	val = read_sysreg(cnthctl_el2);
-	val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
-	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
 
 	if (cpus_have_const_cap(ARM64_ECV))
 		val |= CNTHCTL_ECV;
+
+	/*
+	 * VHE systems allow the guest direct access to the EL1 physical
+	 * timer/counter if offsetting isn't requested on a non-ECV system.
+	 */
+	if (ptimer_emulation_required(vcpu))
+		val &= ~mask;
+	else
+		val |= mask;
+
 	write_sysreg(val, cnthctl_el2);
 }
 
@@ -1462,9 +1480,6 @@ static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
 	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
 	u64 offset;
 
-	if (!cpus_have_const_cap(ARM64_ECV))
-		return -ENXIO;
-
 	if (get_user(offset, uaddr))
 		return -EFAULT;
 
@@ -1513,9 +1528,6 @@ static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
 	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
 	u64 offset;
 
-	if (!cpus_have_const_cap(ARM64_ECV))
-		return -ENXIO;
-
 	offset = timer_get_offset(vcpu_ptimer(vcpu));
 	return put_user(offset, uaddr);
 }
@@ -1539,11 +1551,8 @@ int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 	switch (attr->attr) {
 	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
 	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
-		return 0;
 	case KVM_ARM_VCPU_TIMER_OFFSET:
-		if (cpus_have_const_cap(ARM64_ECV))
-			return 0;
-		break;
+		return 0;
 	}
 
 	return -ENXIO;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index e4a2f295a394..abd3813a709e 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -15,6 +15,7 @@
 #include <linux/jump_label.h>
 #include <uapi/linux/psci.h>
 
+#include <kvm/arm_arch_timer.h>
 #include <kvm/arm_psci.h>
 
 #include <asm/barrier.h>
@@ -405,6 +406,31 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static inline u64 __timer_read_cntpct(struct kvm_vcpu *vcpu)
+{
+	return __arch_counter_get_cntpct() - vcpu_ptimer(vcpu)->host_offset;
+}
+
+static inline bool __hyp_handle_counter(struct kvm_vcpu *vcpu)
+{
+	u32 sysreg;
+	int rt;
+	u64 rv;
+
+	if (kvm_vcpu_trap_get_class(vcpu) != ESR_ELx_EC_SYS64)
+		return false;
+
+	sysreg = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu));
+	if (sysreg != SYS_CNTPCT_EL0)
+		return false;
+
+	rt = kvm_vcpu_sys_get_rt(vcpu);
+	rv = __timer_read_cntpct(vcpu);
+	vcpu_set_reg(vcpu, rt, rv);
+	__kvm_skip_instr(vcpu);
+	return true;
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -439,6 +465,9 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	if (*exit_code != ARM_EXCEPTION_TRAP)
 		goto exit;
 
+	if (__hyp_handle_counter(vcpu))
+		goto guest;
+
 	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
 	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
 	    handle_tx2_tvm(vcpu))
diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
index 5b8b4cd02506..67236c2e0ba7 100644
--- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
@@ -44,10 +44,17 @@ void __timer_enable_traps(struct kvm_vcpu *vcpu)
 
 	/*
 	 * Disallow physical timer access for the guest
-	 * Physical counter access is allowed
 	 */
 	val = read_sysreg(cnthctl_el2);
 	val &= ~CNTHCTL_EL1PCEN;
-	val |= CNTHCTL_EL1PCTEN;
+
+	/*
+	 * Disallow physical counter access for the guest if offsetting is
+	 * requested on a non-ECV system.
+	 */
+	if (vcpu_ptimer(vcpu)->host_offset && !cpus_have_const_cap(ARM64_ECV))
+		val &= ~CNTHCTL_EL1PCTEN;
+	else
+		val |= CNTHCTL_EL1PCTEN;
 	write_sysreg(val, cnthctl_el2);
 }
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 20/21] selftests: KVM: Test physical counter offsetting
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (18 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 11:03   ` Andrew Jones
  2021-08-04  8:58 ` [PATCH v6 21/21] selftests: KVM: Add counter emulation benchmark Oliver Upton
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Test that userspace adjustment of the guest physical counter-timer
results in the correct view within the guest.
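
For reference, a sketch of the VMM-side sequence this test exercises,
written against the raw vCPU device attribute ioctls; the selftest uses
the vcpu_access_device_attr() helper instead, and vcpu_fd here is
assumed to be an open KVM vCPU file descriptor:

	#include <linux/kvm.h>
	#include <sys/ioctl.h>
	#include <stdint.h>
	#include <err.h>

	#define NSEC_PER_SEC	1000000000ULL

	uint64_t offset = 180 * NSEC_PER_SEC;	/* ticks subtracted from the guest's view */
	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_TIMER_CTRL,
		.attr	= KVM_ARM_VCPU_TIMER_OFFSET,
		.addr	= (uint64_t)&offset,
	};

	if (ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &attr))
		err(1, "KVM_ARM_VCPU_TIMER_OFFSET unsupported");
	if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr))
		err(1, "KVM_SET_DEVICE_ATTR");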

Cc: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 .../selftests/kvm/include/aarch64/processor.h | 12 +++++++
 .../kvm/system_counter_offset_test.c          | 31 +++++++++++++++++--
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index 3168cdbae6ee..7f53d90e9512 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -141,4 +141,16 @@ static inline uint64_t read_cntvct_ordered(void)
 	return r;
 }
 
+static inline uint64_t read_cntpct_ordered(void)
+{
+	uint64_t r;
+
+	__asm__ __volatile__("isb\n\t"
+			     "mrs %0, cntpct_el0\n\t"
+			     "isb\n\t"
+			     : "=r"(r));
+
+	return r;
+}
+
 #endif /* SELFTEST_KVM_PROCESSOR_H */
diff --git a/tools/testing/selftests/kvm/system_counter_offset_test.c b/tools/testing/selftests/kvm/system_counter_offset_test.c
index ac933db83d03..82d26a45cc48 100644
--- a/tools/testing/selftests/kvm/system_counter_offset_test.c
+++ b/tools/testing/selftests/kvm/system_counter_offset_test.c
@@ -57,6 +57,9 @@ static uint64_t host_read_guest_system_counter(struct test_case *test)
 
 enum arch_counter {
 	VIRTUAL,
+	PHYSICAL,
+	/* offset physical, read virtual */
+	PHYSICAL_READ_VIRTUAL,
 };
 
 struct test_case {
@@ -68,32 +71,54 @@ static struct test_case test_cases[] = {
 	{ .counter = VIRTUAL, .offset = 0 },
 	{ .counter = VIRTUAL, .offset = 180 * NSEC_PER_SEC },
 	{ .counter = VIRTUAL, .offset = -180 * NSEC_PER_SEC },
+	{ .counter = PHYSICAL, .offset = 0 },
+	{ .counter = PHYSICAL, .offset = 180 * NSEC_PER_SEC },
+	{ .counter = PHYSICAL, .offset = -180 * NSEC_PER_SEC },
+	{ .counter = PHYSICAL_READ_VIRTUAL, .offset = 0 },
+	{ .counter = PHYSICAL_READ_VIRTUAL, .offset = 180 * NSEC_PER_SEC },
+	{ .counter = PHYSICAL_READ_VIRTUAL, .offset = -180 * NSEC_PER_SEC },
 };
 
 static void check_preconditions(struct kvm_vm *vm)
 {
-	if (vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET))
+	if (vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET) &&
+	    !_vcpu_has_device_attr(vm, VCPU_ID, KVM_ARM_VCPU_TIMER_CTRL,
+				   KVM_ARM_VCPU_TIMER_OFFSET))
 		return;
 
-	print_skip("KVM_REG_ARM_TIMER_OFFSET not supported; skipping test");
+	print_skip("KVM_REG_ARM_TIMER_OFFSET|KVM_ARM_VCPU_TIMER_OFFSET not supported; skipping test");
 	exit(KSFT_SKIP);
 }
 
 static void setup_system_counter(struct kvm_vm *vm, struct test_case *test)
 {
+	uint64_t cntvoff, cntpoff;
 	struct kvm_one_reg reg = {
 		.id = KVM_REG_ARM_TIMER_OFFSET,
-		.addr = (__u64)&test->offset,
+		.addr = (__u64)&cntvoff,
 	};
 
+	if (test->counter == VIRTUAL) {
+		cntvoff = test->offset;
+		cntpoff = 0;
+	} else {
+		cntvoff = 0;
+		cntpoff = test->offset;
+	}
+
 	vcpu_set_reg(vm, VCPU_ID, &reg);
+	vcpu_access_device_attr(vm, VCPU_ID, KVM_ARM_VCPU_TIMER_CTRL,
+				KVM_ARM_VCPU_TIMER_OFFSET, &cntpoff, true);
 }
 
 static uint64_t guest_read_system_counter(struct test_case *test)
 {
 	switch (test->counter) {
 	case VIRTUAL:
+	case PHYSICAL_READ_VIRTUAL:
 		return read_cntvct_ordered();
+	case PHYSICAL:
+		return read_cntpct_ordered();
 	default:
 		GUEST_ASSERT(0);
 	}
-- 
2.32.0.605.g8dce9f2422-goog



* [PATCH v6 21/21] selftests: KVM: Add counter emulation benchmark
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (19 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 20/21] selftests: KVM: Test physical counter offsetting Oliver Upton
@ 2021-08-04  8:58 ` Oliver Upton
  2021-08-04 11:05 ` [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
  2021-08-11 13:05 ` Paolo Bonzini
  22 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04  8:58 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas, Oliver Upton

Add a test case for counter emulation on arm64. A side effect of how KVM
handles physical counter offsetting on non-ECV systems is that the
virtual counter always hits hardware while the physical counter may be
emulated. Force emulation by writing a nonzero offset to the physical
counter and compare the elapsed cycles to a direct read of the hardware
register.
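
Example invocation, per the usage text in the test below; each CSV row
carries the columns trapped,freq_mhz,cntvct_start,cntpct,cntvct_end:

	$ ./counter_emulation_benchmark -n 10000 -o samples.csv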

Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/aarch64/counter_emulation_benchmark.c | 207 ++++++++++++++++++
 3 files changed, 209 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/counter_emulation_benchmark.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 3d2585f0bffc..a5953f92f6b1 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
+/aarch64/counter_emulation_benchmark
 /aarch64/debug-exceptions
 /aarch64/get-reg-list
 /aarch64/vgic_init
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index fab42e7c23ee..d24f7a914992 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -87,6 +87,7 @@ TEST_GEN_PROGS_x86_64 += steal_time
 TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
 TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 
+TEST_GEN_PROGS_aarch64 += aarch64/counter_emulation_benchmark
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
 TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
diff --git a/tools/testing/selftests/kvm/aarch64/counter_emulation_benchmark.c b/tools/testing/selftests/kvm/aarch64/counter_emulation_benchmark.c
new file mode 100644
index 000000000000..09ff79ab3d6f
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/counter_emulation_benchmark.c
@@ -0,0 +1,207 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * counter_emulation_benchmark.c -- test to measure the effects of counter
+ * emulation on guest reads of the physical counter.
+ *
+ * Copyright (c) 2021, Google LLC.
+ */
+
+#define _GNU_SOURCE
+#include <asm/kvm.h>
+#include <linux/kvm.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include "kvm_util.h"
+#include "processor.h"
+#include "test_util.h"
+
+#define VCPU_ID 0
+
+static struct counter_values {
+	uint64_t cntvct_start;
+	uint64_t cntpct;
+	uint64_t cntvct_end;
+} counter_values;
+
+static uint64_t nr_iterations = 1000;
+
+static void do_test(void)
+{
+	/*
+	 * Open-coded approach instead of using helper methods to keep a tight
+	 * interval around the physical counter read.
+	 */
+	asm volatile("isb\n\t"
+		     "mrs %[cntvct_start], cntvct_el0\n\t"
+		     "isb\n\t"
+		     "mrs %[cntpct], cntpct_el0\n\t"
+		     "isb\n\t"
+		     "mrs %[cntvct_end], cntvct_el0\n\t"
+		     "isb\n\t"
+		     : [cntvct_start] "=r"(counter_values.cntvct_start),
+		     [cntpct] "=r"(counter_values.cntpct),
+		     [cntvct_end] "=r"(counter_values.cntvct_end));
+}
+
+static void guest_main(void)
+{
+	int i;
+
+	for (i = 0; i < nr_iterations; i++) {
+		do_test();
+		GUEST_SYNC(i);
+	}
+
+	for (i = 0; i < nr_iterations; i++) {
+		do_test();
+		GUEST_SYNC(i);
+	}
+}
+
+static void enter_guest(struct kvm_vm *vm)
+{
+	struct ucall uc;
+
+	vcpu_ioctl(vm, VCPU_ID, KVM_RUN, NULL);
+
+	switch (get_ucall(vm, VCPU_ID, &uc)) {
+	case UCALL_SYNC:
+		break;
+	case UCALL_ABORT:
+		TEST_ASSERT(false, "%s at %s:%ld", (const char *)uc.args[0],
+			    __FILE__, uc.args[1]);
+		break;
+	default:
+		TEST_ASSERT(false, "unexpected exit: %s",
+			    exit_reason_str(vcpu_state(vm, VCPU_ID)->exit_reason));
+		break;
+	}
+}
+
+static double counter_frequency(void)
+{
+	uint32_t freq;
+
+	asm volatile("mrs %0, cntfrq_el0"
+		     : "=r" (freq));
+
+	return freq / 1000000.0;
+}
+
+static void log_csv(FILE *csv, bool trapped)
+{
+	double freq = counter_frequency();
+
+	fprintf(csv, "%s,%.02f,%lu,%lu,%lu\n",
+		trapped ? "true" : "false", freq,
+		counter_values.cntvct_start,
+		counter_values.cntpct,
+		counter_values.cntvct_end);
+}
+
+static double run_loop(struct kvm_vm *vm, FILE *csv, bool trapped)
+{
+	double avg = 0;
+	int i;
+
+	for (i = 0; i < nr_iterations; i++) {
+		uint64_t delta;
+
+		enter_guest(vm);
+		sync_global_from_guest(vm, counter_values);
+
+		if (csv)
+			log_csv(csv, trapped);
+
+		delta = counter_values.cntvct_end - counter_values.cntvct_start;
+		avg = ((avg * i) + delta) / (i + 1);
+	}
+
+	return avg;
+}
+
+static void setup_counter(struct kvm_vm *vm, uint64_t offset)
+{
+	vcpu_access_device_attr(vm, VCPU_ID, KVM_ARM_VCPU_TIMER_CTRL,
+				KVM_ARM_VCPU_TIMER_OFFSET, &offset,
+				true);
+}
+
+static void run_tests(struct kvm_vm *vm, FILE *csv)
+{
+	double avg_trapped, avg_native, freq;
+
+	freq = counter_frequency();
+
+	if (csv)
+		fputs("trapped,freq_mhz,cntvct_start,cntpct,cntvct_end\n", csv);
+
+	/* no physical offsetting; kvm allows reads of cntpct_el0 */
+	setup_counter(vm, 0);
+	avg_native = run_loop(vm, csv, false);
+
+	/* force emulation of the physical counter */
+	setup_counter(vm, 1);
+	avg_trapped = run_loop(vm, csv, true);
+
+	pr_info("%lu iterations: average cycles (@%.02fMHz) native: %.02f, trapped: %.02f\n",
+		nr_iterations, freq, avg_native, avg_trapped);
+}
+
+static void usage(const char *program_name)
+{
+	fprintf(stderr,
+		"Usage: %s [-h] [-o csv_file] [-n iterations]\n"
+		"  -h prints this message\n"
+		"  -n number of test iterations (default: %lu)\n"
+		"  -o csv file to write data\n",
+		program_name, nr_iterations);
+}
+
+int main(int argc, char **argv)
+{
+	struct kvm_vm *vm;
+	FILE *csv = NULL;
+	int opt;
+
+	while ((opt = getopt(argc, argv, "hn:o:")) != -1) {
+		switch (opt) {
+		case 'o':
+			csv = fopen(optarg, "w");
+			if (!csv) {
+				fprintf(stderr, "failed to open file '%s': %d\n",
+					optarg, errno);
+				exit(1);
+			}
+			break;
+		case 'n':
+			nr_iterations = strtoul(optarg, NULL, 0);
+			break;
+		default:
+			fprintf(stderr, "unrecognized option: '-%c'\n", opt);
+			/* fallthrough */
+		case 'h':
+			usage(argv[0]);
+			exit(1);
+		}
+	}
+
+	vm = vm_create_default(VCPU_ID, 0, guest_main);
+	sync_global_to_guest(vm, nr_iterations);
+	ucall_init(vm, NULL);
+
+	if (_vcpu_has_device_attr(vm, VCPU_ID, KVM_ARM_VCPU_TIMER_CTRL,
+				  KVM_ARM_VCPU_TIMER_OFFSET)) {
+		print_skip("KVM_ARM_VCPU_TIMER_OFFSET not supported.");
+		exit(KSFT_SKIP);
+	}
+
+	run_tests(vm, csv);
+	kvm_vm_free(vm);
+
+	if (csv)
+		fclose(csv);
+}
-- 
2.32.0.605.g8dce9f2422-goog



* Re: [PATCH v6 14/21] selftests: KVM: Add helper to check for register presence
  2021-08-04  8:58 ` [PATCH v6 14/21] selftests: KVM: Add helper to check for register presence Oliver Upton
@ 2021-08-04  9:14   ` Andrew Jones
  0 siblings, 0 replies; 51+ messages in thread
From: Andrew Jones @ 2021-08-04  9:14 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:12AM +0000, Oliver Upton wrote:
> The KVM_GET_REG_LIST vCPU ioctl returns a list of supported registers
> for a given vCPU. Add a helper to check if a register exists in the list
> of supported registers.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  .../testing/selftests/kvm/include/kvm_util.h  |  2 ++
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 19 +++++++++++++++++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
> index 1b3ef5757819..077082dd2ca7 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util.h
> @@ -215,6 +215,8 @@ void vcpu_fpu_get(struct kvm_vm *vm, uint32_t vcpuid,
>  		  struct kvm_fpu *fpu);
>  void vcpu_fpu_set(struct kvm_vm *vm, uint32_t vcpuid,
>  		  struct kvm_fpu *fpu);
> +
> +bool vcpu_has_reg(struct kvm_vm *vm, uint32_t vcpuid, uint64_t reg_id);
>  void vcpu_get_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg);
>  void vcpu_set_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg);
>  #ifdef __KVM_HAVE_VCPU_EVENTS
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 0fe66ca6139a..a5801d4ed37d 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -1823,6 +1823,25 @@ void vcpu_fpu_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_fpu *fpu)
>  		    ret, errno, strerror(errno));
>  }
>  
> +bool vcpu_has_reg(struct kvm_vm *vm, uint32_t vcpuid, uint64_t reg_id)
> +{
> +	struct kvm_reg_list *list;
> +	bool ret = false;
> +	uint64_t i;
> +
> +	list = vcpu_get_reg_list(vm, vcpuid);
> +
> +	for (i = 0; i < list->n; i++) {
> +		if (list->reg[i] == reg_id) {
> +			ret = true;
> +			break;
> +		}
> +	}
> +
> +	free(list);
> +	return ret;
> +}
> +
>  void vcpu_get_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg)
>  {
>  	int ret;
> -- 
> 2.32.0.605.g8dce9f2422-goog
>
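
A typical use of the helper, as a sketch matching the precondition
checks added later in this series:

	if (!vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET)) {
		print_skip("KVM_REG_ARM_TIMER_OFFSET not supported; skipping test");
		exit(KSFT_SKIP);
	}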

Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 11/21] KVM: arm64: Refactor update_vtimer_cntvoff()
  2021-08-04  8:58 ` [PATCH v6 11/21] KVM: arm64: Refactor update_vtimer_cntvoff() Oliver Upton
@ 2021-08-04  9:23   ` Andrew Jones
  0 siblings, 0 replies; 51+ messages in thread
From: Andrew Jones @ 2021-08-04  9:23 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:09AM +0000, Oliver Upton wrote:
> Make the implementation of update_vtimer_cntvoff() generic w.r.t. guest
> timer context and spin off into a new helper method for later use.
> Require callers of this new helper method to grab the kvm lock
> beforehand.
> 
> No functional change intended.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/kvm/arch_timer.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 3df67c127489..c0101db75ad4 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -747,22 +747,32 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> -/* Make the updates of cntvoff for all vtimer contexts atomic */
> -static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
> +/* Make offset updates for all timer contexts atomic */
> +static void update_timer_offset(struct kvm_vcpu *vcpu,
> +				enum kvm_arch_timers timer, u64 offset)
>  {
>  	int i;
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_vcpu *tmp;
>  
> -	mutex_lock(&kvm->lock);
> +	lockdep_assert_held(&kvm->lock);
> +
>  	kvm_for_each_vcpu(i, tmp, kvm)
> -		timer_set_offset(vcpu_vtimer(tmp), cntvoff);
> +		timer_set_offset(vcpu_get_timer(tmp, timer), offset);
>  
>  	/*
>  	 * When called from the vcpu create path, the CPU being created is not
>  	 * included in the loop above, so we just set it here as well.
>  	 */
> -	timer_set_offset(vcpu_vtimer(vcpu), cntvoff);
> +	timer_set_offset(vcpu_get_timer(vcpu, timer), offset);
> +}
> +
> +static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +
> +	mutex_lock(&kvm->lock);
> +	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff);
>  	mutex_unlock(&kvm->lock);
>  }
>  
> -- 
> 2.32.0.605.g8dce9f2422-goog
>

Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
  2021-08-04  8:58 ` [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset Oliver Upton
@ 2021-08-04 10:17   ` Andrew Jones
  2021-08-04 10:22     ` Oliver Upton
  2021-08-10 10:56   ` Marc Zyngier
  1 sibling, 1 reply; 51+ messages in thread
From: Andrew Jones @ 2021-08-04 10:17 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:15AM +0000, Oliver Upton wrote:
> Presently, KVM provides no facilities for correctly migrating a guest
> that depends on the physical counter-timer. While most guests (barring
> NV, of course) should not depend on the physical counter-timer, an
> operator may wish to provide a consistent view of the physical
> counter-timer across migrations.
> 
> Provide userspace with a new vCPU attribute to modify the guest
> counter-timer offset. Unlike KVM_REG_ARM_TIMER_OFFSET, this attribute is
> hidden from the guest's architectural state. The value offsets *both*
> the virtual and physical counter-timer views for the guest. Only support
> this attribute on ECV systems as ECV is required for hardware offsetting
> of the physical counter-timer.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  Documentation/virt/kvm/devices/vcpu.rst |  28 ++++++
>  arch/arm64/include/asm/kvm_asm.h        |   2 +
>  arch/arm64/include/asm/sysreg.h         |   2 +
>  arch/arm64/include/uapi/asm/kvm.h       |   1 +
>  arch/arm64/kvm/arch_timer.c             | 122 +++++++++++++++++++++++-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      |   6 ++
>  arch/arm64/kvm/hyp/nvhe/timer-sr.c      |   5 +
>  arch/arm64/kvm/hyp/vhe/timer-sr.c       |   5 +
>  include/clocksource/arm_arch_timer.h    |   1 +
>  9 files changed, 169 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index 3b399d727c11..3ba35b9d9d03 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -139,6 +139,34 @@ configured values on other VCPUs.  Userspace should configure the interrupt
>  numbers on at least one VCPU after creating all VCPUs and before running any
>  VCPUs.
>  
> +2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_OFFSET
> +-----------------------------------------
> +
> +:Parameters: in kvm_device_attr.addr the address for the timer offset is a
> +             pointer to a __u64
> +
> +Returns:
> +
> +	 ======= ==================================
> +	 -EFAULT Error reading/writing the provided
> +		 parameter address
> +	 -ENXIO  Timer offsetting not implemented
> +	 ======= ==================================
> +
> +Specifies the guest's counter-timer offset from the host's virtual counter.
> +The guest's physical counter value is then derived by the following
> +equation:
> +
> +  guest_cntpct = host_cntvct - KVM_ARM_VCPU_TIMER_OFFSET
> +
> +The guest's virtual counter value is derived by the following equation:
> +
> +  guest_cntvct = host_cntvct - KVM_REG_ARM_TIMER_OFFSET
> +			- KVM_ARM_VCPU_TIMER_OFFSET
> +
> +KVM does not allow the use of varying offset values for different vCPUs;
> +the last written offset value will be broadcast to all vCPUs in a VM.
> +
>  3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
>  ==================================
>  
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 9f0bf2109be7..ab1c8fdb0177 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -65,6 +65,7 @@
>  #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize		19
>  #define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp			20
>  #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc			21
> +#define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntpoff		22
>  
>  #ifndef __ASSEMBLY__
>  
> @@ -200,6 +201,7 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
>  extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu);
>  
>  extern void __kvm_timer_set_cntvoff(u64 cntvoff);
> +extern void __kvm_timer_set_cntpoff(u64 cntpoff);
>  
>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>  
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 4dfc44066dfb..c34672aa65b9 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -586,6 +586,8 @@
>  #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
>  #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
>  
> +#define SYS_CNTPOFF_EL2			sys_reg(3, 4, 14, 0, 6)
> +
>  /* VHE encodings for architectural EL0/1 system registers */
>  #define SYS_SCTLR_EL12			sys_reg(3, 5, 1, 0, 0)
>  #define SYS_CPACR_EL12			sys_reg(3, 5, 1, 0, 2)
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 949a31bc10f0..15150f8224a1 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -366,6 +366,7 @@ struct kvm_arm_copy_mte_tags {
>  #define KVM_ARM_VCPU_TIMER_CTRL		1
>  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> +#define   KVM_ARM_VCPU_TIMER_OFFSET		2
>  #define KVM_ARM_VCPU_PVTIME_CTRL	2
>  #define   KVM_ARM_VCPU_PVTIME_IPA	0
>  
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index a8815b09da3e..f15058612994 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -85,11 +85,15 @@ u64 timer_get_cval(struct arch_timer_context *ctxt)
>  static u64 timer_get_offset(struct arch_timer_context *ctxt)
>  {
>  	struct kvm_vcpu *vcpu = ctxt->vcpu;
> +	struct arch_timer_cpu *timer = vcpu_timer(vcpu);

This new timer variable doesn't appear to get used.

>  
>  	switch(arch_timer_ctx_index(ctxt)) {
>  	case TIMER_VTIMER:
> +	case TIMER_PTIMER:
>  		return ctxt->host_offset;
>  	default:
> +		WARN_ONCE(1, "unrecognized timer %ld\n",
> +			  arch_timer_ctx_index(ctxt));
>  		return 0;
>  	}
>  }
> @@ -144,6 +148,7 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
>  
>  	switch(arch_timer_ctx_index(ctxt)) {
>  	case TIMER_VTIMER:
> +	case TIMER_PTIMER:
>  		ctxt->host_offset = offset;
>  		break;
>  	default:
> @@ -572,6 +577,11 @@ static void set_cntvoff(u64 cntvoff)
>  	kvm_call_hyp(__kvm_timer_set_cntvoff, cntvoff);
>  }
>  
> +static void set_cntpoff(u64 cntpoff)
> +{
> +	kvm_call_hyp(__kvm_timer_set_cntpoff, cntpoff);
> +}
> +
>  static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
>  {
>  	int r;
> @@ -647,6 +657,8 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>  	}
>  
>  	set_cntvoff(timer_get_offset(map.direct_vtimer));
> +	if (cpus_have_const_cap(ARM64_ECV))
> +		set_cntpoff(timer_get_offset(vcpu_ptimer(vcpu)));
>  
>  	kvm_timer_unblocking(vcpu);
>  
> @@ -814,6 +826,22 @@ static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
>  	mutex_unlock(&kvm->lock);
>  }
>  
> +static void update_ptimer_cntpoff(struct kvm_vcpu *vcpu, u64 cntpoff)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	u64 cntvoff;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	/* adjustments to the physical offset also affect vtimer */
> +	cntvoff = timer_get_offset(vcpu_vtimer(vcpu));
> +	cntvoff += cntpoff - timer_get_offset(vcpu_ptimer(vcpu));
> +
> +	update_timer_offset(vcpu, TIMER_PTIMER, cntpoff, false);
> +	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff, false);
> +	mutex_unlock(&kvm->lock);
> +}
> +
>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>  {
>  	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
> @@ -932,6 +960,29 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
>  	return (u64)-1;
>  }
>  
> +/**
> + * kvm_arm_timer_read_offset - returns the guest value of CNTVOFF_EL2.
> + * @vcpu: the vcpu pointer
> + *
> + * Computes the guest value of CNTVOFF_EL2 by subtracting the physical
> + * counter offset. Note that KVM defines CNTVOFF_EL2 as the offset from the
> + * guest's physical counter-timer, not the host's.
> + *
> + * Returns: the guest value for CNTVOFF_EL2
> + */
> +static u64 kvm_arm_timer_read_offset(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	u64 offset;
> +
> +	mutex_lock(&kvm->lock);
> +	offset = timer_get_offset(vcpu_vtimer(vcpu)) -
> +			timer_get_offset(vcpu_ptimer(vcpu));
> +	mutex_unlock(&kvm->lock);
> +
> +	return offset;
> +}
> +
>  static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  			      struct arch_timer_context *timer,
>  			      enum kvm_arch_timer_regs treg)
> @@ -957,7 +1008,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  		break;
>  
>  	case TIMER_REG_OFFSET:
> -		val = timer_get_offset(timer);
> +		val = kvm_arm_timer_read_offset(vcpu);
>  		break;
>  
>  	default:
> @@ -1350,6 +1401,9 @@ void kvm_timer_init_vhe(void)
>  	val = read_sysreg(cnthctl_el2);
>  	val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
>  	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
> +
> +	if (cpus_have_const_cap(ARM64_ECV))
> +		val |= CNTHCTL_ECV;
>  	write_sysreg(val, cnthctl_el2);
>  }
>  
> @@ -1364,7 +1418,8 @@ static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
>  	}
>  }
>  
> -int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +static int kvm_arm_timer_set_attr_irq(struct kvm_vcpu *vcpu,
> +				      struct kvm_device_attr *attr)
>  {
>  	int __user *uaddr = (int __user *)(long)attr->addr;
>  	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> @@ -1397,7 +1452,37 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  	return 0;
>  }
>  
> -int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
> +					 struct kvm_device_attr *attr)
> +{
> +	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> +	u64 offset;
> +
> +	if (!cpus_have_const_cap(ARM64_ECV))
> +		return -ENXIO;
> +
> +	if (get_user(offset, uaddr))
> +		return -EFAULT;
> +
> +	update_ptimer_cntpoff(vcpu, offset);
> +	return 0;
> +}
> +
> +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +{
> +	switch (attr->attr) {
> +	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> +	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> +		return kvm_arm_timer_set_attr_irq(vcpu, attr);
> +	case KVM_ARM_VCPU_TIMER_OFFSET:
> +		return kvm_arm_timer_set_attr_offset(vcpu, attr);
> +	default:
> +		return -ENXIO;
> +	}
> +}
> +
> +static int kvm_arm_timer_get_attr_irq(struct kvm_vcpu *vcpu,
> +				      struct kvm_device_attr *attr)
>  {
>  	int __user *uaddr = (int __user *)(long)attr->addr;
>  	struct arch_timer_context *timer;
> @@ -1418,12 +1503,43 @@ int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  	return put_user(irq, uaddr);
>  }
>  
> +static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
> +					 struct kvm_device_attr *attr)
> +{
> +	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> +	u64 offset;
> +
> +	if (!cpus_have_const_cap(ARM64_ECV))
> +		return -ENXIO;
> +
> +	offset = timer_get_offset(vcpu_ptimer(vcpu));
> +	return put_user(offset, uaddr);
> +}
> +
> +int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu,
> +			   struct kvm_device_attr *attr)
> +{
> +	switch (attr->attr) {
> +	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> +	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> +		return kvm_arm_timer_get_attr_irq(vcpu, attr);
> +	case KVM_ARM_VCPU_TIMER_OFFSET:
> +		return kvm_arm_timer_get_attr_offset(vcpu, attr);
> +	default:
> +		return -ENXIO;
> +	}
> +}
> +
>  int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  {
>  	switch (attr->attr) {
>  	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
>  	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
>  		return 0;
> +	case KVM_ARM_VCPU_TIMER_OFFSET:
> +		if (cpus_have_const_cap(ARM64_ECV))
> +			return 0;
> +		break;
>  	}
>  
>  	return -ENXIO;
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 1632f001f4ed..cfa923df3af6 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -68,6 +68,11 @@ static void handle___kvm_timer_set_cntvoff(struct kvm_cpu_context *host_ctxt)
>  	__kvm_timer_set_cntvoff(cpu_reg(host_ctxt, 1));
>  }
>  
> +static void handle___kvm_timer_set_cntpoff(struct kvm_cpu_context *host_ctxt)
> +{
> +	__kvm_timer_set_cntpoff(cpu_reg(host_ctxt, 1));
> +}
> +
>  static void handle___kvm_enable_ssbs(struct kvm_cpu_context *host_ctxt)
>  {
>  	u64 tmp;
> @@ -197,6 +202,7 @@ static const hcall_t host_hcall[] = {
>  	HANDLE_FUNC(__pkvm_create_private_mapping),
>  	HANDLE_FUNC(__pkvm_prot_finalize),
>  	HANDLE_FUNC(__pkvm_mark_hyp),
> +	HANDLE_FUNC(__kvm_timer_set_cntpoff),
>  };
>  
>  static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
> diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> index 9072e71693ba..5b8b4cd02506 100644
> --- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> @@ -15,6 +15,11 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
>  	write_sysreg(cntvoff, cntvoff_el2);
>  }
>  
> +void __kvm_timer_set_cntpoff(u64 cntpoff)
> +{
> +	write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
> +}
> +
>  /*
>   * Should only be called on non-VHE systems.
>   * VHE systems use EL2 timers and configure EL1 timers in kvm_timer_init_vhe().
> diff --git a/arch/arm64/kvm/hyp/vhe/timer-sr.c b/arch/arm64/kvm/hyp/vhe/timer-sr.c
> index 4cda674a8be6..231e16a071a5 100644
> --- a/arch/arm64/kvm/hyp/vhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/vhe/timer-sr.c
> @@ -10,3 +10,8 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
>  {
>  	write_sysreg(cntvoff, cntvoff_el2);
>  }
> +
> +void __kvm_timer_set_cntpoff(u64 cntpoff)
> +{
> +	write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
> +}
> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 73c7139c866f..7252ffa3d675 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -21,6 +21,7 @@
>  #define CNTHCTL_EVNTEN			(1 << 2)
>  #define CNTHCTL_EVNTDIR			(1 << 3)
>  #define CNTHCTL_EVNTI			(0xF << 4)
> +#define CNTHCTL_ECV			(1 << 12)
>  
>  enum arch_timer_reg {
>  	ARCH_TIMER_REG_CTRL,
> -- 
> 2.32.0.605.g8dce9f2422-goog
>
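
For reference, the documented equations reduce to the following sketch,
assuming all values are raw 64-bit counter ticks:

	/* guest counter views derived from the host virtual counter */
	guest_cntpct = host_cntvct - vcpu_timer_offset;	/* KVM_ARM_VCPU_TIMER_OFFSET */
	guest_cntvct = guest_cntpct - reg_timer_offset;	/* minus KVM_REG_ARM_TIMER_OFFSET too */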

Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 12/21] KVM: arm64: Separate guest/host counter offset values
  2021-08-04  8:58 ` [PATCH v6 12/21] KVM: arm64: Separate guest/host counter offset values Oliver Upton
@ 2021-08-04 10:19   ` Andrew Jones
  0 siblings, 0 replies; 51+ messages in thread
From: Andrew Jones @ 2021-08-04 10:19 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:10AM +0000, Oliver Upton wrote:
> In some instances, a VMM may want to update the guest's counter-timer
> offset in a transparent manner, meaning that changes to the hardware
> value do not affect the synthetic register presented to the guest or the
> VMM through said guest's architectural state. Lay the groundwork to
> separate guest offset register writes from the hardware values utilized
> by KVM.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/kvm/arch_timer.c  | 48 ++++++++++++++++++++++++++++++++----
>  include/kvm/arm_arch_timer.h |  3 +++
>  2 files changed, 46 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index c0101db75ad4..4c2b763a8849 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -87,6 +87,18 @@ static u64 timer_get_offset(struct arch_timer_context *ctxt)
>  	struct kvm_vcpu *vcpu = ctxt->vcpu;
>  
>  	switch(arch_timer_ctx_index(ctxt)) {
> +	case TIMER_VTIMER:
> +		return ctxt->host_offset;
> +	default:
> +		return 0;
> +	}
> +}
> +
> +static u64 timer_get_guest_offset(struct arch_timer_context *ctxt)
> +{
> +	struct kvm_vcpu *vcpu = ctxt->vcpu;
> +
> +	switch (arch_timer_ctx_index(ctxt)) {
>  	case TIMER_VTIMER:
>  		return __vcpu_sys_reg(vcpu, CNTVOFF_EL2);
>  	default:
> @@ -132,13 +144,31 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
>  
>  	switch(arch_timer_ctx_index(ctxt)) {
>  	case TIMER_VTIMER:
> -		__vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
> +		ctxt->host_offset = offset;
>  		break;
>  	default:
>  		WARN(offset, "timer %ld\n", arch_timer_ctx_index(ctxt));
>  	}
>  }
>  
> +static void timer_set_guest_offset(struct arch_timer_context *ctxt, u64 offset)
> +{
> +	struct kvm_vcpu *vcpu = ctxt->vcpu;
> +
> +	switch (arch_timer_ctx_index(ctxt)) {
> +	case TIMER_VTIMER: {
> +		u64 host_offset = timer_get_offset(ctxt);
> +
> +		host_offset += offset - __vcpu_sys_reg(vcpu, CNTVOFF_EL2);
> +		__vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
> +		timer_set_offset(ctxt, host_offset);
> +		break;
> +	}
> +	default:
> +		WARN_ONCE(offset, "timer %ld\n", arch_timer_ctx_index(ctxt));
> +	}
> +}
> +
>  u64 kvm_phys_timer_read(void)
>  {
>  	return timecounter->cc->read(timecounter->cc);
> @@ -749,7 +779,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu)
>  
>  /* Make offset updates for all timer contexts atomic */
>  static void update_timer_offset(struct kvm_vcpu *vcpu,
> -				enum kvm_arch_timers timer, u64 offset)
> +				enum kvm_arch_timers timer, u64 offset,
> +				bool guest_visible)
>  {
>  	int i;
>  	struct kvm *kvm = vcpu->kvm;
> @@ -758,13 +789,20 @@ static void update_timer_offset(struct kvm_vcpu *vcpu,
>  	lockdep_assert_held(&kvm->lock);
>  
>  	kvm_for_each_vcpu(i, tmp, kvm)
> -		timer_set_offset(vcpu_get_timer(tmp, timer), offset);
> +		if (guest_visible)
> +			timer_set_guest_offset(vcpu_get_timer(tmp, timer),
> +					       offset);
> +		else
> +			timer_set_offset(vcpu_get_timer(tmp, timer), offset);
>  
>  	/*
>  	 * When called from the vcpu create path, the CPU being created is not
>  	 * included in the loop above, so we just set it here as well.
>  	 */
> -	timer_set_offset(vcpu_get_timer(vcpu, timer), offset);
> +	if (guest_visible)
> +		timer_set_guest_offset(vcpu_get_timer(vcpu, timer), offset);
> +	else
> +		timer_set_offset(vcpu_get_timer(vcpu, timer), offset);
>  }
>  
>  static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
> @@ -772,7 +810,7 @@ static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
>  	struct kvm *kvm = vcpu->kvm;
>  
>  	mutex_lock(&kvm->lock);
> -	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff);
> +	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff, true);
>  	mutex_unlock(&kvm->lock);
>  }
>  
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index 51c19381108c..9d65d4a29f81 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -42,6 +42,9 @@ struct arch_timer_context {
>  	/* Duplicated state from arch_timer.c for convenience */
>  	u32				host_timer_irq;
>  	u32				host_timer_irq_flags;
> +
> +	/* offset relative to the host's physical counter-timer */
> +	u64				host_offset;
>  };
>  
>  struct timer_map {
> -- 
> 2.32.0.605.g8dce9f2422-goog
>

 
Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset
  2021-08-04  8:58 ` [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset Oliver Upton
@ 2021-08-04 10:20   ` Andrew Jones
  2021-08-10  9:35   ` Marc Zyngier
  1 sibling, 0 replies; 51+ messages in thread
From: Andrew Jones @ 2021-08-04 10:20 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:11AM +0000, Oliver Upton wrote:
> Allow userspace to access the guest's virtual counter-timer offset
> through the ONE_REG interface. The value read or written is defined to
> be an offset from the guest's physical counter-timer. Add some
> documentation to clarify how a VMM should use this and the existing
> CNTVCT_EL0.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  Documentation/virt/kvm/api.rst    | 10 ++++++++++
>  arch/arm64/include/uapi/asm/kvm.h |  1 +
>  arch/arm64/kvm/arch_timer.c       | 11 +++++++++++
>  arch/arm64/kvm/guest.c            |  6 +++++-
>  include/kvm/arm_arch_timer.h      |  1 +
>  5 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 8d4a3471ad9e..28a65dc89985 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -2487,6 +2487,16 @@ arm64 system registers have the following id bit patterns::
>       derived from the register encoding for CNTV_CVAL_EL0.  As this is
>       API, it must remain this way.
>  
> +.. warning::
> +
> +     The value of KVM_REG_ARM_TIMER_OFFSET is defined as an offset from
> +     the guest's view of the physical counter-timer.
> +
> +     Userspace should use either KVM_REG_ARM_TIMER_OFFSET or
> +     KVM_REG_ARM_TIMER_CVAL to pause and resume a guest's virtual
> +     counter-timer. Mixed use of these registers could result in an
> +     unpredictable guest counter value.
> +
>  arm64 firmware pseudo-registers have the following bit pattern::
>  
>    0x6030 0000 0014 <regno:16>
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..949a31bc10f0 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -255,6 +255,7 @@ struct kvm_arm_copy_mte_tags {
>  #define KVM_REG_ARM_TIMER_CTL		ARM64_SYS_REG(3, 3, 14, 3, 1)
>  #define KVM_REG_ARM_TIMER_CVAL		ARM64_SYS_REG(3, 3, 14, 0, 2)
>  #define KVM_REG_ARM_TIMER_CNT		ARM64_SYS_REG(3, 3, 14, 3, 2)
> +#define KVM_REG_ARM_TIMER_OFFSET	ARM64_SYS_REG(3, 4, 14, 0, 3)
>  
>  /* KVM-as-firmware specific pseudo-registers */
>  #define KVM_REG_ARM_FW			(0x0014 << KVM_REG_ARM_COPROC_SHIFT)
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 4c2b763a8849..a8815b09da3e 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -868,6 +868,10 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
>  		timer = vcpu_vtimer(vcpu);
>  		kvm_arm_timer_write(vcpu, timer, TIMER_REG_CVAL, value);
>  		break;
> +	case KVM_REG_ARM_TIMER_OFFSET:
> +		timer = vcpu_vtimer(vcpu);
> +		update_vtimer_cntvoff(vcpu, value);
> +		break;
>  	case KVM_REG_ARM_PTIMER_CTL:
>  		timer = vcpu_ptimer(vcpu);
>  		kvm_arm_timer_write(vcpu, timer, TIMER_REG_CTL, value);
> @@ -912,6 +916,9 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
>  	case KVM_REG_ARM_TIMER_CVAL:
>  		return kvm_arm_timer_read(vcpu,
>  					  vcpu_vtimer(vcpu), TIMER_REG_CVAL);
> +	case KVM_REG_ARM_TIMER_OFFSET:
> +		return kvm_arm_timer_read(vcpu,
> +					  vcpu_vtimer(vcpu), TIMER_REG_OFFSET);
>  	case KVM_REG_ARM_PTIMER_CTL:
>  		return kvm_arm_timer_read(vcpu,
>  					  vcpu_ptimer(vcpu), TIMER_REG_CTL);
> @@ -949,6 +956,10 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  		val = kvm_phys_timer_read() - timer_get_offset(timer);
>  		break;
>  
> +	case TIMER_REG_OFFSET:
> +		val = timer_get_offset(timer);
> +		break;
> +
>  	default:
>  		BUG();
>  	}
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 1dfb83578277..17fc06e2b422 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -591,7 +591,7 @@ static unsigned long num_core_regs(const struct kvm_vcpu *vcpu)
>   * ARM64 versions of the TIMER registers, always available on arm64
>   */
>  
> -#define NUM_TIMER_REGS 3
> +#define NUM_TIMER_REGS 4
>  
>  static bool is_timer_reg(u64 index)
>  {
> @@ -599,6 +599,7 @@ static bool is_timer_reg(u64 index)
>  	case KVM_REG_ARM_TIMER_CTL:
>  	case KVM_REG_ARM_TIMER_CNT:
>  	case KVM_REG_ARM_TIMER_CVAL:
> +	case KVM_REG_ARM_TIMER_OFFSET:
>  		return true;
>  	}
>  	return false;
> @@ -614,6 +615,9 @@ static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  	uindices++;
>  	if (put_user(KVM_REG_ARM_TIMER_CVAL, uindices))
>  		return -EFAULT;
> +	uindices++;
> +	if (put_user(KVM_REG_ARM_TIMER_OFFSET, uindices))
> +		return -EFAULT;
>  
>  	return 0;
>  }
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index 9d65d4a29f81..615f9314f6a5 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -21,6 +21,7 @@ enum kvm_arch_timer_regs {
>  	TIMER_REG_CVAL,
>  	TIMER_REG_TVAL,
>  	TIMER_REG_CTL,
> +	TIMER_REG_OFFSET,
>  };
>  
>  struct arch_timer_context {
> -- 
> 2.32.0.605.g8dce9f2422-goog
>

 
Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
  2021-08-04 10:17   ` Andrew Jones
@ 2021-08-04 10:22     ` Oliver Upton
  0 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-04 10:22 UTC (permalink / raw)
  To: Andrew Jones
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 4, 2021 at 3:17 AM Andrew Jones <drjones@redhat.com> wrote:
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index a8815b09da3e..f15058612994 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -85,11 +85,15 @@ u64 timer_get_cval(struct arch_timer_context *ctxt)
> >  static u64 timer_get_offset(struct arch_timer_context *ctxt)
> >  {
> >       struct kvm_vcpu *vcpu = ctxt->vcpu;
> > +     struct arch_timer_cpu *timer = vcpu_timer(vcpu);
>
> This new timer variable doesn't appear to get used.

Ooops, this is stale. Thanks for catching that.

>
> Reviewed-by: Andrew Jones <drjones@redhat.com>
>

Thanks for the quick review!

--
Oliver


* Re: [PATCH v6 18/21] KVM: arm64: Configure timer traps in vcpu_load() for VHE
  2021-08-04  8:58 ` [PATCH v6 18/21] KVM: arm64: Configure timer traps in vcpu_load() for VHE Oliver Upton
@ 2021-08-04 10:25   ` Andrew Jones
  0 siblings, 0 replies; 51+ messages in thread
From: Andrew Jones @ 2021-08-04 10:25 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:16AM +0000, Oliver Upton wrote:
> In preparation for emulated physical counter-timer offsetting, configure
> traps on every vcpu_load() for VHE systems. As before, these trap
> settings do not affect host userspace, and are only active for the
> guest.
> 
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/kvm/arch_timer.c  | 10 +++++++---
>  arch/arm64/kvm/arm.c         |  4 +---
>  include/kvm/arm_arch_timer.h |  2 --
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index f15058612994..9ead94aa867d 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -51,6 +51,7 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
>  static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  			      struct arch_timer_context *timer,
>  			      enum kvm_arch_timer_regs treg);
> +static void kvm_timer_enable_traps_vhe(void);
>  
>  u32 timer_get_ctl(struct arch_timer_context *ctxt)
>  {
> @@ -668,6 +669,9 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>  
>  	if (map.emul_ptimer)
>  		timer_emulate(map.emul_ptimer);
> +
> +	if (has_vhe())
> +		kvm_timer_enable_traps_vhe();
>  }
>  
>  bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
> @@ -1383,12 +1387,12 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  }
>  
>  /*
> - * On VHE system, we only need to configure the EL2 timer trap register once,
> - * not for every world switch.
> + * On VHE systems, we only need to configure the EL2 timer trap register on
> + * vcpu_load(), not on every world switch into the guest.
>   * The host kernel runs at EL2 with HCR_EL2.TGE == 1,
>   * and this makes those bits have no effect for the host kernel execution.
>   */
> -void kvm_timer_init_vhe(void)
> +static void kvm_timer_enable_traps_vhe(void)
>  {
>  	/* When HCR_EL2.E2H ==1, EL1PCEN and EL1PCTEN are shifted by 10 */
>  	u32 cnthctl_shift = 10;
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e9a2b8f27792..47ea1e1ba80b 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1558,9 +1558,7 @@ static void cpu_hyp_reinit(void)
>  
>  	cpu_hyp_reset();
>  
> -	if (is_kernel_in_hyp_mode())
> -		kvm_timer_init_vhe();
> -	else
> +	if (!is_kernel_in_hyp_mode())
>  		cpu_init_hyp_mode();
>  
>  	cpu_set_hyp_vector();
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index 615f9314f6a5..254653b42da0 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -87,8 +87,6 @@ u64 kvm_phys_timer_read(void);
>  void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu);
>  void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
>  
> -void kvm_timer_init_vhe(void);
> -
>  bool kvm_arch_timer_get_input_level(int vintid);
>  
>  #define vcpu_timer(v)	(&(v)->arch.timer_cpu)
> -- 
> 2.32.0.605.g8dce9f2422-goog
>

Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 20/21] selftests: KVM: Test physical counter offsetting
  2021-08-04  8:58 ` [PATCH v6 20/21] selftests: KVM: Test physical counter offsetting Oliver Upton
@ 2021-08-04 11:03   ` Andrew Jones
  0 siblings, 0 replies; 51+ messages in thread
From: Andrew Jones @ 2021-08-04 11:03 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:18AM +0000, Oliver Upton wrote:
> Test that userspace adjustment of the guest physical counter-timer
> results in the correct view within the guest.
> 
> Cc: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  .../selftests/kvm/include/aarch64/processor.h | 12 +++++++
>  .../kvm/system_counter_offset_test.c          | 31 +++++++++++++++++--
>  2 files changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index 3168cdbae6ee..7f53d90e9512 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -141,4 +141,16 @@ static inline uint64_t read_cntvct_ordered(void)
>  	return r;
>  }
>  
> +static inline uint64_t read_cntpct_ordered(void)
> +{
> +	uint64_t r;
> +
> +	__asm__ __volatile__("isb\n\t"
> +			     "mrs %0, cntpct_el0\n\t"
> +			     "isb\n\t"
> +			     : "=r"(r));
> +
> +	return r;
> +}
> +
>  #endif /* SELFTEST_KVM_PROCESSOR_H */
> diff --git a/tools/testing/selftests/kvm/system_counter_offset_test.c b/tools/testing/selftests/kvm/system_counter_offset_test.c
> index ac933db83d03..82d26a45cc48 100644
> --- a/tools/testing/selftests/kvm/system_counter_offset_test.c
> +++ b/tools/testing/selftests/kvm/system_counter_offset_test.c
> @@ -57,6 +57,9 @@ static uint64_t host_read_guest_system_counter(struct test_case *test)
>  
>  enum arch_counter {
>  	VIRTUAL,
> +	PHYSICAL,
> +	/* offset physical, read virtual */
> +	PHYSICAL_READ_VIRTUAL,
>  };
>  
>  struct test_case {
> @@ -68,32 +71,54 @@ static struct test_case test_cases[] = {
>  	{ .counter = VIRTUAL, .offset = 0 },
>  	{ .counter = VIRTUAL, .offset = 180 * NSEC_PER_SEC },
>  	{ .counter = VIRTUAL, .offset = -180 * NSEC_PER_SEC },
> +	{ .counter = PHYSICAL, .offset = 0 },
> +	{ .counter = PHYSICAL, .offset = 180 * NSEC_PER_SEC },
> +	{ .counter = PHYSICAL, .offset = -180 * NSEC_PER_SEC },
> +	{ .counter = PHYSICAL_READ_VIRTUAL, .offset = 0 },
> +	{ .counter = PHYSICAL_READ_VIRTUAL, .offset = 180 * NSEC_PER_SEC },
> +	{ .counter = PHYSICAL_READ_VIRTUAL, .offset = -180 * NSEC_PER_SEC },
>  };
>  
>  static void check_preconditions(struct kvm_vm *vm)
>  {
> -	if (vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET))
> +	if (vcpu_has_reg(vm, VCPU_ID, KVM_REG_ARM_TIMER_OFFSET) &&
> +	    !_vcpu_has_device_attr(vm, VCPU_ID, KVM_ARM_VCPU_TIMER_CTRL,
> +				   KVM_ARM_VCPU_TIMER_OFFSET))
>  		return;
>  
> -	print_skip("KVM_REG_ARM_TIMER_OFFSET not supported; skipping test");
> +	print_skip("KVM_REG_ARM_TIMER_OFFSET|KVM_ARM_VCPU_TIMER_OFFSET not supported; skipping test");
>  	exit(KSFT_SKIP);
>  }
>  
>  static void setup_system_counter(struct kvm_vm *vm, struct test_case *test)
>  {
> +	uint64_t cntvoff, cntpoff;
>  	struct kvm_one_reg reg = {
>  		.id = KVM_REG_ARM_TIMER_OFFSET,
> -		.addr = (__u64)&test->offset,
> +		.addr = (__u64)&cntvoff,
>  	};
>  
> +	if (test->counter == VIRTUAL) {
> +		cntvoff = test->offset;
> +		cntpoff = 0;
> +	} else {
> +		cntvoff = 0;
> +		cntpoff = test->offset;
> +	}
> +
>  	vcpu_set_reg(vm, VCPU_ID, &reg);
> +	vcpu_access_device_attr(vm, VCPU_ID, KVM_ARM_VCPU_TIMER_CTRL,
> +				KVM_ARM_VCPU_TIMER_OFFSET, &cntpoff, true);
>  }
>  
>  static uint64_t guest_read_system_counter(struct test_case *test)
>  {
>  	switch (test->counter) {
>  	case VIRTUAL:
> +	case PHYSICAL_READ_VIRTUAL:
>  		return read_cntvct_ordered();
> +	case PHYSICAL:
> +		return read_cntpct_ordered();
>  	default:
>  		GUEST_ASSERT(0);
>  	}
> -- 
> 2.32.0.605.g8dce9f2422-goog
>

Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (20 preceding siblings ...)
  2021-08-04  8:58 ` [PATCH v6 21/21] selftests: KVM: Add counter emulation benchmark Oliver Upton
@ 2021-08-04 11:05 ` Oliver Upton
  2021-08-04 22:03   ` Oliver Upton
  2021-08-11 13:05 ` Paolo Bonzini
  22 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04 11:05 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, Aug 4, 2021 at 1:58 AM Oliver Upton <oupton@google.com> wrote:
>
> [...]
>
> This series was tested on both an Ampere Mt. Jade and Haswell systems.
> Unfortunately, the ECV portions of this series are untested, as there is
> no ECV-capable hardware and the ARM fast models only partially implement
> ECV.

Small correction: I was only using the foundation model. Apparently
the AEM FVP provides full ECV support.

> [...]


* Re: [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems
  2021-08-04  8:58 ` [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems Oliver Upton
@ 2021-08-04 11:05   ` Andrew Jones
  2021-08-05  6:27     ` Oliver Upton
  2021-08-10 11:27   ` Marc Zyngier
  1 sibling, 1 reply; 51+ messages in thread
From: Andrew Jones @ 2021-08-04 11:05 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

On Wed, Aug 04, 2021 at 08:58:17AM +0000, Oliver Upton wrote:
> Unfortunately, ECV hasn't yet arrived in any tangible hardware. At the
> same time, controlling the guest view of the physical counter-timer is
> useful. Support guest counter-timer offsetting on non-ECV systems by
> trapping guest accesses to the physical counter-timer. Emulate reads of
> the physical counter in the fast exit path.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/include/asm/sysreg.h         |  1 +
>  arch/arm64/kvm/arch_timer.c             | 53 +++++++++++++++----------
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 29 ++++++++++++++
>  arch/arm64/kvm/hyp/nvhe/timer-sr.c      | 11 ++++-
>  4 files changed, 70 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index c34672aa65b9..e49790ae5da4 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -505,6 +505,7 @@
>  #define SYS_AMEVCNTR0_MEM_STALL		SYS_AMEVCNTR0_EL0(3)
>  
>  #define SYS_CNTFRQ_EL0			sys_reg(3, 3, 14, 0, 0)
> +#define SYS_CNTPCT_EL0			sys_reg(3, 3, 14, 0, 1)
>  
>  #define SYS_CNTP_TVAL_EL0		sys_reg(3, 3, 14, 2, 0)
>  #define SYS_CNTP_CTL_EL0		sys_reg(3, 3, 14, 2, 1)
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 9ead94aa867d..b7cb63acf2a0 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -51,7 +51,7 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
>  static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  			      struct arch_timer_context *timer,
>  			      enum kvm_arch_timer_regs treg);
> -static void kvm_timer_enable_traps_vhe(void);
> +static void kvm_timer_enable_traps_vhe(struct kvm_vcpu *vcpu);
>  
>  u32 timer_get_ctl(struct arch_timer_context *ctxt)
>  {
> @@ -175,6 +175,12 @@ static void timer_set_guest_offset(struct arch_timer_context *ctxt, u64 offset)
>  	}
>  }
>  
> +static bool ptimer_emulation_required(struct kvm_vcpu *vcpu)
> +{
> +	return timer_get_offset(vcpu_ptimer(vcpu)) &&
> +			!cpus_have_const_cap(ARM64_ECV);

Whenever I see a static branch check and something else in the same
condition, I always wonder if we could trim a few instructions in the
case where the static branch is false by testing the static branch first.
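
A minimal sketch of that reordering (illustrative only, not part of the
posted patch):

	static bool ptimer_emulation_required(struct kvm_vcpu *vcpu)
	{
		/*
		 * cpus_have_const_cap() is backed by a static branch, so
		 * on ECV hosts this folds to a fall-through and the offset
		 * is never loaded.
		 */
		if (cpus_have_const_cap(ARM64_ECV))
			return false;

		return timer_get_offset(vcpu_ptimer(vcpu));
	}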

> +}
> +
>  u64 kvm_phys_timer_read(void)
>  {
>  	return timecounter->cc->read(timecounter->cc);
> @@ -184,8 +190,13 @@ static void get_timer_map(struct kvm_vcpu *vcpu, struct timer_map *map)
>  {
>  	if (has_vhe()) {
>  		map->direct_vtimer = vcpu_vtimer(vcpu);
> -		map->direct_ptimer = vcpu_ptimer(vcpu);
> -		map->emul_ptimer = NULL;
> +		if (!ptimer_emulation_required(vcpu)) {
> +			map->direct_ptimer = vcpu_ptimer(vcpu);
> +			map->emul_ptimer = NULL;
> +		} else {
> +			map->direct_ptimer = NULL;
> +			map->emul_ptimer = vcpu_ptimer(vcpu);
> +		}
>  	} else {
>  		map->direct_vtimer = vcpu_vtimer(vcpu);
>  		map->direct_ptimer = NULL;
> @@ -671,7 +682,7 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>  		timer_emulate(map.emul_ptimer);
>  
>  	if (has_vhe())
> -		kvm_timer_enable_traps_vhe();
> +		kvm_timer_enable_traps_vhe(vcpu);
>  }
>  
>  bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
> @@ -1392,22 +1403,29 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>   * The host kernel runs at EL2 with HCR_EL2.TGE == 1,
>   * and this makes those bits have no effect for the host kernel execution.
>   */
> -static void kvm_timer_enable_traps_vhe(void)
> +static void kvm_timer_enable_traps_vhe(struct kvm_vcpu *vcpu)
>  {
>  	/* When HCR_EL2.E2H ==1, EL1PCEN and EL1PCTEN are shifted by 10 */
>  	u32 cnthctl_shift = 10;
> -	u64 val;
> +	u64 val, mask;
> +
> +	mask = CNTHCTL_EL1PCEN << cnthctl_shift;
> +	mask |= CNTHCTL_EL1PCTEN << cnthctl_shift;
>  
> -	/*
> -	 * VHE systems allow the guest direct access to the EL1 physical
> -	 * timer/counter.
> -	 */
>  	val = read_sysreg(cnthctl_el2);
> -	val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
> -	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
>  
>  	if (cpus_have_const_cap(ARM64_ECV))
>  		val |= CNTHCTL_ECV;
> +
> +	/*
> +	 * VHE systems allow the guest direct access to the EL1 physical
> +	 * timer/counter if offsetting isn't requested on a non-ECV system.
> +	 */
> +	if (ptimer_emulation_required(vcpu))
> +		val &= ~mask;
> +	else
> +		val |= mask;
> +
>  	write_sysreg(val, cnthctl_el2);
>  }
>  
> @@ -1462,9 +1480,6 @@ static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
>  	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
>  	u64 offset;
>  
> -	if (!cpus_have_const_cap(ARM64_ECV))
> -		return -ENXIO;
> -
>  	if (get_user(offset, uaddr))
>  		return -EFAULT;
>  
> @@ -1513,9 +1528,6 @@ static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
>  	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
>  	u64 offset;
>  
> -	if (!cpus_have_const_cap(ARM64_ECV))
> -		return -ENXIO;
> -
>  	offset = timer_get_offset(vcpu_ptimer(vcpu));
>  	return put_user(offset, uaddr);
>  }
> @@ -1539,11 +1551,8 @@ int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  	switch (attr->attr) {
>  	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
>  	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> -		return 0;
>  	case KVM_ARM_VCPU_TIMER_OFFSET:
> -		if (cpus_have_const_cap(ARM64_ECV))
> -			return 0;
> -		break;
> +		return 0;

So now, if userspace wants to know whether it's using an emulated
TIMER_OFFSET or ECV, it'll need to check the HWCAP. I guess that's
fair. We should update the selftest to report what it's testing when
the HWCAP is available.
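
For instance, something along these lines in the selftest, assuming an
ECV ELF HWCAP eventually exists (HWCAP2_ECV is a hypothetical name;
pr_info() comes from the selftest harness):

	#include <sys/auxv.h>

	static void report_counter_mechanism(void)
	{
		/* HWCAP2_ECV does not exist yet; assumed for the sketch */
		if (getauxval(AT_HWCAP2) & HWCAP2_ECV)
			pr_info("testing ECV counter offsetting\n");
		else
			pr_info("testing trap-and-emulate counter offsetting\n");
	}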

>  	}
>  
>  	return -ENXIO;
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index e4a2f295a394..abd3813a709e 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -15,6 +15,7 @@
>  #include <linux/jump_label.h>
>  #include <uapi/linux/psci.h>
>  
> +#include <kvm/arm_arch_timer.h>
>  #include <kvm/arm_psci.h>
>  
>  #include <asm/barrier.h>
> @@ -405,6 +406,31 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
>  	return true;
>  }
>  
> +static inline u64 __timer_read_cntpct(struct kvm_vcpu *vcpu)
> +{
> +	return __arch_counter_get_cntpct() - vcpu_ptimer(vcpu)->host_offset;
> +}
> +
> +static inline bool __hyp_handle_counter(struct kvm_vcpu *vcpu)
> +{
> +	u32 sysreg;
> +	int rt;
> +	u64 rv;
> +
> +	if (kvm_vcpu_trap_get_class(vcpu) != ESR_ELx_EC_SYS64)
> +		return false;
> +
> +	sysreg = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu));
> +	if (sysreg != SYS_CNTPCT_EL0)
> +		return false;
> +
> +	rt = kvm_vcpu_sys_get_rt(vcpu);
> +	rv = __timer_read_cntpct(vcpu);
> +	vcpu_set_reg(vcpu, rt, rv);
> +	__kvm_skip_instr(vcpu);
> +	return true;
> +}
> +
>  /*
>   * Return true when we were able to fixup the guest exit and should return to
>   * the guest, false when we should restore the host state and return to the
> @@ -439,6 +465,9 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  	if (*exit_code != ARM_EXCEPTION_TRAP)
>  		goto exit;
>  
> +	if (__hyp_handle_counter(vcpu))
> +		goto guest;
> +
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
>  	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
>  	    handle_tx2_tvm(vcpu))
> diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> index 5b8b4cd02506..67236c2e0ba7 100644
> --- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> @@ -44,10 +44,17 @@ void __timer_enable_traps(struct kvm_vcpu *vcpu)
>  
>  	/*
>  	 * Disallow physical timer access for the guest
> -	 * Physical counter access is allowed
>  	 */
>  	val = read_sysreg(cnthctl_el2);
>  	val &= ~CNTHCTL_EL1PCEN;
> -	val |= CNTHCTL_EL1PCTEN;
> +
> +	/*
> +	 * Disallow physical counter access for the guest if offsetting is
> +	 * requested on a non-ECV system.
> +	 */
> +	if (vcpu_ptimer(vcpu)->host_offset && !cpus_have_const_cap(ARM64_ECV))

Shouldn't we expose and reuse ptimer_emulation_required() here?
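
Something like the following, assuming ptimer_emulation_required() were
made visible to the nVHE hyp object (a sketch, not the posted change):

	if (ptimer_emulation_required(vcpu))
		val &= ~CNTHCTL_EL1PCTEN;
	else
		val |= CNTHCTL_EL1PCTEN;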

> +		val &= ~CNTHCTL_EL1PCTEN;
> +	else
> +		val |= CNTHCTL_EL1PCTEN;
>  	write_sysreg(val, cnthctl_el2);
>  }
> -- 
> 2.32.0.605.g8dce9f2422-goog
>

Otherwise,

Reviewed-by: Andrew Jones <drjones@redhat.com>



* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-04 11:05 ` [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
@ 2021-08-04 22:03   ` Oliver Upton
  2021-08-10  0:04     ` Oliver Upton
  0 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-04 22:03 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, Aug 4, 2021 at 4:05 AM Oliver Upton <oupton@google.com> wrote:
>
> On Wed, Aug 4, 2021 at 1:58 AM Oliver Upton <oupton@google.com> wrote:
> >
> > [...]
> >
> > This series was tested on both an Ampere Mt. Jade and Haswell systems.
> > Unfortunately, the ECV portions of this series are untested, as there is
> > no ECV-capable hardware and the ARM fast models only partially implement
> > ECV.
>
> Small correction: I was only using the foundation model. Apparently
> the AEM FVP provides full ECV support.

Ok. I've now tested this series on the FVP Base RevC fast model at
v8.6 with ECV=2. It passes on VHE but fails on nVHE.

I'll respin this series with the fix for nVHE+ECV soon.

--
Thanks,
Oliver


* Re: [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems
  2021-08-04 11:05   ` Andrew Jones
@ 2021-08-05  6:27     ` Oliver Upton
  0 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-05  6:27 UTC (permalink / raw)
  To: Andrew Jones
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Marc Zyngier,
	Peter Shier, Jim Mattson, David Matlack, Ricardo Koller,
	Jing Zhang, Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Will Deacon, Catalin Marinas

Hi Drew,

On Wed, Aug 4, 2021 at 4:05 AM Andrew Jones <drjones@redhat.com> wrote:
> > +static bool ptimer_emulation_required(struct kvm_vcpu *vcpu)
> > +{
> > +     return timer_get_offset(vcpu_ptimer(vcpu)) &&
> > +                     !cpus_have_const_cap(ARM64_ECV);
>
> Whenever I see a static branch check and something else in the same
> condition, I always wonder if we could trim a few instructions in the
> case where the static branch is false by testing the static branch first.

Good point, I'll reclaim those few cycles in the next spin ;-)

> > @@ -1539,11 +1551,8 @@ int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >       switch (attr->attr) {
> >       case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> >       case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> > -             return 0;
> >       case KVM_ARM_VCPU_TIMER_OFFSET:
> > -             if (cpus_have_const_cap(ARM64_ECV))
> > -                     return 0;
> > -             break;
> > +             return 0;
>
> So now, if userspace wants to know whether it's using an emulated
> TIMER_OFFSET or ECV, it'll need to check the HWCAP. I guess that's
> fair. We should update the selftest to report what it's testing when
> the HWCAP is available.
>

Hmm...

I hadn't yet wired up the ECV cpufeature bits to an ELF HWCAP, but
this point is a bit interesting. I can see the argument being made
that we shouldn't have two ELF HWCAP bits for ECV (depending on
partial or full ECV support). ECV=0x1 is most certainly of interest to
userspace, since self-synchronized views of the counter are then
available. However, ECV=0x2 is purely of interest to EL2.

What if we had only one ELF HWCAP bit for ECV >= 0x1? We could
let userspace read ID_AA64MMFR0_EL1.ECV if it really needs to know
about ECV = 0x2.
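
For illustration, userspace can already sample the sanitised register
through the kernel's MRS emulation when HWCAP_CPUID is set (a sketch;
the [63:60] field position is ID_AA64MMFR0_EL1.ECV):

	#include <stdint.h>
	#include <sys/auxv.h>
	#include <asm/hwcap.h>

	static unsigned int id_aa64mmfr0_ecv(void)
	{
		uint64_t mmfr0;

		/* EL0 reads trap and are emulated with sanitised values */
		if (!(getauxval(AT_HWCAP) & HWCAP_CPUID))
			return 0;

		asm volatile("mrs %0, ID_AA64MMFR0_EL1" : "=r"(mmfr0));

		return (mmfr0 >> 60) & 0xf;
	}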

> > +     if (vcpu_ptimer(vcpu)->host_offset && !cpus_have_const_cap(ARM64_ECV))
>
> Shouldn't we expose and reuse ptimer_emulation_required() here?
>

Agreed, makes it much cleaner.

> > +             val &= ~CNTHCTL_EL1PCTEN;
> > +     else
> > +             val |= CNTHCTL_EL1PCTEN;
> >       write_sysreg(val, cnthctl_el2);
> >  }
> > --
> > 2.32.0.605.g8dce9f2422-goog
> >
>
> Otherwise,
>
> Reviewed-by: Andrew Jones <drjones@redhat.com>
>

Thanks!

--
Oliver


* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-04 22:03   ` Oliver Upton
@ 2021-08-10  0:04     ` Oliver Upton
  2021-08-10 12:30       ` Marc Zyngier
  0 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-10  0:04 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, Aug 4, 2021 at 3:03 PM Oliver Upton <oupton@google.com> wrote:
>
> On Wed, Aug 4, 2021 at 4:05 AM Oliver Upton <oupton@google.com> wrote:
> >
> > On Wed, Aug 4, 2021 at 1:58 AM Oliver Upton <oupton@google.com> wrote:
> > >
> > > [...]

Paolo,

Is there anything else you're waiting to see for the x86 portion of
this series now that Sean's comments have been addressed? There's some work
remaining on the arm64 side, though I believe the two architectures
are now disjoint for this series.

--
Thanks,
Oliver


* Re: [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset
  2021-08-04  8:58 ` [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset Oliver Upton
  2021-08-04 10:20   ` Andrew Jones
@ 2021-08-10  9:35   ` Marc Zyngier
  2021-08-10  9:44     ` Oliver Upton
  1 sibling, 1 reply; 51+ messages in thread
From: Marc Zyngier @ 2021-08-10  9:35 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, 04 Aug 2021 09:58:11 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> Allow userspace to access the guest's virtual counter-timer offset
> through the ONE_REG interface. The value read or written is defined to
> be an offset from the guest's physical counter-timer. Add some
> documentation to clarify how a VMM should use this and the existing
> CNTVCT_EL0.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  Documentation/virt/kvm/api.rst    | 10 ++++++++++
>  arch/arm64/include/uapi/asm/kvm.h |  1 +
>  arch/arm64/kvm/arch_timer.c       | 11 +++++++++++
>  arch/arm64/kvm/guest.c            |  6 +++++-
>  include/kvm/arm_arch_timer.h      |  1 +
>  5 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 8d4a3471ad9e..28a65dc89985 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -2487,6 +2487,16 @@ arm64 system registers have the following id bit patterns::
>       derived from the register encoding for CNTV_CVAL_EL0.  As this is
>       API, it must remain this way.
>  
> +.. warning::
> +
> +     The value of KVM_REG_ARM_TIMER_OFFSET is defined as an offset from
> +     the guest's view of the physical counter-timer.
> +
> +     Userspace should use either KVM_REG_ARM_TIMER_OFFSET or
> +     KVM_REG_ARM_TIMER_CVAL to pause and resume a guest's virtual

You probably mean KVM_REG_ARM_TIMER_CNT here, despite the broken
encoding.

> +     counter-timer. Mixed use of these registers could result in an
> +     unpredictable guest counter value.
> +
>  arm64 firmware pseudo-registers have the following bit pattern::
>  
>    0x6030 0000 0014 <regno:16>
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..949a31bc10f0 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -255,6 +255,7 @@ struct kvm_arm_copy_mte_tags {
>  #define KVM_REG_ARM_TIMER_CTL		ARM64_SYS_REG(3, 3, 14, 3, 1)
>  #define KVM_REG_ARM_TIMER_CVAL		ARM64_SYS_REG(3, 3, 14, 0, 2)
>  #define KVM_REG_ARM_TIMER_CNT		ARM64_SYS_REG(3, 3, 14, 3, 2)
> +#define KVM_REG_ARM_TIMER_OFFSET	ARM64_SYS_REG(3, 4, 14, 0, 3)

I don't think we can use the encoding for CNTVOFF_EL2 here, as it will
eventually clash with a NV guest using the same feature for its own
purpose. We don't want this offset to overlap with any of the existing
features.

I actually liked your previous proposal of controlling the physical
offset via a device property, as it clearly indicated that you were
dealing with non-architectural state.
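
For reference, the vCPU attribute from patch 17 would be driven from a
VMM roughly as below (KVM_ARM_VCPU_TIMER_OFFSET is the attribute this
series introduces; error handling elided):

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	static int set_counter_offset(int vcpu_fd, uint64_t offset)
	{
		struct kvm_device_attr attr = {
			.group	= KVM_ARM_VCPU_TIMER_CTRL,
			.attr	= KVM_ARM_VCPU_TIMER_OFFSET,
			.addr	= (uint64_t)&offset,
		};

		/* The offset is subtracted from the host counter-timer */
		return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
	}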

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v6 16/21] arm64: cpufeature: Enumerate support for Enhanced Counter Virtualization
  2021-08-04  8:58 ` [PATCH v6 16/21] arm64: cpufeature: Enumerate support for Enhanced Counter Virtualization Oliver Upton
@ 2021-08-10  9:38   ` Marc Zyngier
  0 siblings, 0 replies; 51+ messages in thread
From: Marc Zyngier @ 2021-08-10  9:38 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, 04 Aug 2021 09:58:14 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> Introduce a new cpucap to indicate if the system supports full enhanced
> counter virtualization (i.e. ID_AA64MMFR0_EL1.ECV==0x2).
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/include/asm/sysreg.h |  2 ++
>  arch/arm64/kernel/cpufeature.c  | 10 ++++++++++
>  arch/arm64/tools/cpucaps        |  1 +
>  3 files changed, 13 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 7b9c3acba684..4dfc44066dfb 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -847,6 +847,8 @@
>  #define ID_AA64MMFR0_ASID_SHIFT		4
>  #define ID_AA64MMFR0_PARANGE_SHIFT	0
>  
> +#define ID_AA64MMFR0_ECV_VIRT		0x1
> +#define ID_AA64MMFR0_ECV_PHYS		0x2
>  #define ID_AA64MMFR0_TGRAN4_NI		0xf
>  #define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
>  #define ID_AA64MMFR0_TGRAN64_NI		0xf
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 0ead8bfedf20..94c349e179d3 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2301,6 +2301,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = has_cpuid_feature,
>  		.min_field_value = 1,
>  	},
> +	{
> +		.desc = "Enhanced Counter Virtualization (Physical)",
> +		.capability = ARM64_ECV,
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.sys_reg = SYS_ID_AA64MMFR0_EL1,
> +		.sign = FTR_UNSIGNED,
> +		.field_pos = ID_AA64MMFR0_ECV_SHIFT,
> +		.matches = has_cpuid_feature,
> +		.min_field_value = ID_AA64MMFR0_ECV_PHYS,
> +	},
>  	{},
>  };
>  
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 49305c2e6dfd..d819ea614da5 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -3,6 +3,7 @@
>  # Internal CPU capabilities constants, keep this list sorted
>  
>  BTI
> +ECV
>  # Unreliable: use system_supports_32bit_el0() instead.
>  HAS_32BIT_EL0_DO_NOT_USE
>  HAS_32BIT_EL1

As discussed in another context, we probably want both ECV and ECV2 to
distinguish the two feature sets that ECV has so far.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
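
[Editor's note: a sketch of what the capability entry above reduces to.
has_cpuid_feature() performs an unsigned comparison of the named ID
register field against min_field_value; the ECV field of ID_AA64MMFR0_EL1
is assumed here to sit at bits [63:60], so ARM64_ECV is only set when the
field reports at least ID_AA64MMFR0_ECV_PHYS (0x2), i.e. when
CNTPOFF_EL2 is implemented.]

#include <stdbool.h>
#include <stdint.h>

#define ID_AA64MMFR0_ECV_SHIFT	60	/* assumed field position */
#define ID_AA64MMFR0_ECV_PHYS	0x2

static bool mmfr0_has_ecv_phys(uint64_t id_aa64mmfr0)
{
	/* 4-bit unsigned field, per the FTR_UNSIGNED annotation above */
	uint64_t ecv = (id_aa64mmfr0 >> ID_AA64MMFR0_ECV_SHIFT) & 0xf;

	return ecv >= ID_AA64MMFR0_ECV_PHYS;
}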


* Re: [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset
  2021-08-10  9:35   ` Marc Zyngier
@ 2021-08-10  9:44     ` Oliver Upton
  2021-08-11 15:22       ` Marc Zyngier
  0 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-10  9:44 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Tue, Aug 10, 2021 at 2:35 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 04 Aug 2021 09:58:11 +0100,
> Oliver Upton <oupton@google.com> wrote:
> >
> > Allow userspace to access the guest's virtual counter-timer offset
> > through the ONE_REG interface. The value read or written is defined to
> > be an offset from the guest's physical counter-timer. Add some
> > documentation to clarify how a VMM should use this and the existing
> > CNTVCT_EL0.
> >
> > Signed-off-by: Oliver Upton <oupton@google.com>
> > ---
> >  Documentation/virt/kvm/api.rst    | 10 ++++++++++
> >  arch/arm64/include/uapi/asm/kvm.h |  1 +
> >  arch/arm64/kvm/arch_timer.c       | 11 +++++++++++
> >  arch/arm64/kvm/guest.c            |  6 +++++-
> >  include/kvm/arm_arch_timer.h      |  1 +
> >  5 files changed, 28 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 8d4a3471ad9e..28a65dc89985 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -2487,6 +2487,16 @@ arm64 system registers have the following id bit patterns::
> >       derived from the register encoding for CNTV_CVAL_EL0.  As this is
> >       API, it must remain this way.
> >
> > +.. warning::
> > +
> > +     The value of KVM_REG_ARM_TIMER_OFFSET is defined as an offset from
> > +     the guest's view of the physical counter-timer.
> > +
> > +     Userspace should use either KVM_REG_ARM_TIMER_OFFSET or
> > +     KVM_REG_ARM_TIMER_CVAL to pause and resume a guest's virtual
>
> You probably mean KVM_REG_ARM_TIMER_CNT here, despite the broken
> encoding.

Indeed I do!

>
> > +     counter-timer. Mixed use of these registers could result in an
> > +     unpredictable guest counter value.
> > +
> >  arm64 firmware pseudo-registers have the following bit pattern::
> >
> >    0x6030 0000 0014 <regno:16>
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index b3edde68bc3e..949a31bc10f0 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -255,6 +255,7 @@ struct kvm_arm_copy_mte_tags {
> >  #define KVM_REG_ARM_TIMER_CTL                ARM64_SYS_REG(3, 3, 14, 3, 1)
> >  #define KVM_REG_ARM_TIMER_CVAL               ARM64_SYS_REG(3, 3, 14, 0, 2)
> >  #define KVM_REG_ARM_TIMER_CNT                ARM64_SYS_REG(3, 3, 14, 3, 2)
> > +#define KVM_REG_ARM_TIMER_OFFSET     ARM64_SYS_REG(3, 4, 14, 0, 3)
>
> I don't think we can use the encoding for CNTVOFF_EL2 here, as it will
> eventually clash with a NV guest using the same feature for its own
> purpose. We don't want this offset to overlap with any of the existing
> features.
>
> I actually liked your previous proposal of controlling the physical
> offset via a device property, as it clearly indicated that you were
> dealing with non-architectural state.

That's actually exactly what I did here :) That said, the macro name
is horribly obfuscated from CNTVOFF_EL2. I did this for the sake of
symmetry with other virtual counter-timer registers above, though this
may warrant special casing given the fact that we have a similarly
named device attribute to handle the physical offset.

--
Thanks,
Oliver
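
[Editor's note: for contrast with the ONE_REG route discussed here, a
hedged sketch of how a VMM would probe for the device-attribute control
from patch 17 before relying on it. KVM_HAS_DEVICE_ATTR returns 0 when
the attribute exists and fails with ENXIO otherwise.]

#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static bool vcpu_has_timer_offset_attr(int vcpu_fd)
{
	struct kvm_device_attr attr = {
		.group = KVM_ARM_VCPU_TIMER_CTRL,
		.attr  = KVM_ARM_VCPU_TIMER_OFFSET,
	};

	return ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &attr) == 0;
}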


* Re: [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
  2021-08-04  8:58 ` [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset Oliver Upton
  2021-08-04 10:17   ` Andrew Jones
@ 2021-08-10 10:56   ` Marc Zyngier
  2021-08-10 17:55     ` Oliver Upton
  1 sibling, 1 reply; 51+ messages in thread
From: Marc Zyngier @ 2021-08-10 10:56 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, 04 Aug 2021 09:58:15 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> Presently, KVM provides no facilities for correctly migrating a guest
> that depends on the physical counter-timer. Whie most guests (barring

nit: While

> NV, of course) should not depend on the physical counter-timer, an
> operator may wish to provide a consistent view of the physical
> counter-timer across migrations.
> 
> Provide userspace with a new vCPU attribute to modify the guest
> counter-timer offset. Unlike KVM_REG_ARM_TIMER_OFFSET, this attribute is
> hidden from the guest's architectural state. The value offsets *both*
> the virtual and physical counter-timer views for the guest. Only support
> this attribute on ECV systems as ECV is required for hardware offsetting
> of the physical counter-timer.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  Documentation/virt/kvm/devices/vcpu.rst |  28 ++++++
>  arch/arm64/include/asm/kvm_asm.h        |   2 +
>  arch/arm64/include/asm/sysreg.h         |   2 +
>  arch/arm64/include/uapi/asm/kvm.h       |   1 +
>  arch/arm64/kvm/arch_timer.c             | 122 +++++++++++++++++++++++-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      |   6 ++
>  arch/arm64/kvm/hyp/nvhe/timer-sr.c      |   5 +
>  arch/arm64/kvm/hyp/vhe/timer-sr.c       |   5 +
>  include/clocksource/arm_arch_timer.h    |   1 +
>  9 files changed, 169 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> index 3b399d727c11..3ba35b9d9d03 100644
> --- a/Documentation/virt/kvm/devices/vcpu.rst
> +++ b/Documentation/virt/kvm/devices/vcpu.rst
> @@ -139,6 +139,34 @@ configured values on other VCPUs.  Userspace should configure the interrupt
>  numbers on at least one VCPU after creating all VCPUs and before running any
>  VCPUs.
>  
> +2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_OFFSET
> +-----------------------------------------
> +
> +:Parameters: in kvm_device_attr.addr the address for the timer offset is a
> +             pointer to a __u64
> +
> +Returns:
> +
> +	 ======= ==================================
> +	 -EFAULT Error reading/writing the provided
> +		 parameter address
> +	 -ENXIO  Timer offsetting not implemented
> +	 ======= ==================================
> +
> +Specifies the guest's counter-timer offset from the host's virtual counter.
> +The guest's physical counter value is then derived by the following
> +equation:
> +
> +  guest_cntpct = host_cntvct - KVM_ARM_VCPU_TIMER_OFFSET
> +
> +The guest's virtual counter value is derived by the following equation:
> +
> +  guest_cntvct = host_cntvct - KVM_REG_ARM_TIMER_OFFSET
> +			- KVM_ARM_VCPU_TIMER_OFFSET
> +
> +KVM does not allow the use of varying offset values for different vCPUs;
> +the last written offset value will be broadcast to all vCPUs in a VM.
> +
>  3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
>  ==================================
>  
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 9f0bf2109be7..ab1c8fdb0177 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -65,6 +65,7 @@
>  #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize		19
>  #define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp			20
>  #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc			21
> +#define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntpoff		22
>  
>  #ifndef __ASSEMBLY__
>  
> @@ -200,6 +201,7 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
>  extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu);
>  
>  extern void __kvm_timer_set_cntvoff(u64 cntvoff);
> +extern void __kvm_timer_set_cntpoff(u64 cntpoff);
>  
>  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
>  
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 4dfc44066dfb..c34672aa65b9 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -586,6 +586,8 @@
>  #define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
>  #define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)
>  
> +#define SYS_CNTPOFF_EL2			sys_reg(3, 4, 14, 0, 6)
> +
>  /* VHE encodings for architectural EL0/1 system registers */
>  #define SYS_SCTLR_EL12			sys_reg(3, 5, 1, 0, 0)
>  #define SYS_CPACR_EL12			sys_reg(3, 5, 1, 0, 2)
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 949a31bc10f0..15150f8224a1 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -366,6 +366,7 @@ struct kvm_arm_copy_mte_tags {
>  #define KVM_ARM_VCPU_TIMER_CTRL		1
>  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> +#define   KVM_ARM_VCPU_TIMER_OFFSET		2
>  #define KVM_ARM_VCPU_PVTIME_CTRL	2
>  #define   KVM_ARM_VCPU_PVTIME_IPA	0
>  
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index a8815b09da3e..f15058612994 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -85,11 +85,15 @@ u64 timer_get_cval(struct arch_timer_context *ctxt)
>  static u64 timer_get_offset(struct arch_timer_context *ctxt)
>  {
>  	struct kvm_vcpu *vcpu = ctxt->vcpu;
> +	struct arch_timer_cpu *timer = vcpu_timer(vcpu);

Unused variable?

>  
>  	switch(arch_timer_ctx_index(ctxt)) {
>  	case TIMER_VTIMER:
> +	case TIMER_PTIMER:
>  		return ctxt->host_offset;
>  	default:
> +		WARN_ONCE(1, "unrecognized timer %ld\n",
> +			  arch_timer_ctx_index(ctxt));
>  		return 0;
>  	}
>  }
> @@ -144,6 +148,7 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
>  
>  	switch(arch_timer_ctx_index(ctxt)) {
>  	case TIMER_VTIMER:
> +	case TIMER_PTIMER:
>  		ctxt->host_offset = offset;
>  		break;
>  	default:
> @@ -572,6 +577,11 @@ static void set_cntvoff(u64 cntvoff)
>  	kvm_call_hyp(__kvm_timer_set_cntvoff, cntvoff);
>  }
>  
> +static void set_cntpoff(u64 cntpoff)
> +{
> +	kvm_call_hyp(__kvm_timer_set_cntpoff, cntpoff);
> +}
> +
>  static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
>  {
>  	int r;
> @@ -647,6 +657,8 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>  	}
>  
>  	set_cntvoff(timer_get_offset(map.direct_vtimer));
> +	if (cpus_have_const_cap(ARM64_ECV))

This really should be a final cap instead (same for all the other use
cases).

> +		set_cntpoff(timer_get_offset(vcpu_ptimer(vcpu)));

However, tripping to EL2 for each offset on nVHE may prove to be an
unnecessary overhead. Not a problem for now anyway.

>
>  	kvm_timer_unblocking(vcpu);
>  
> @@ -814,6 +826,22 @@ static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
>  	mutex_unlock(&kvm->lock);
>  }
>  
> +static void update_ptimer_cntpoff(struct kvm_vcpu *vcpu, u64 cntpoff)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	u64 cntvoff;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	/* adjustments to the physical offset also affect vtimer */
> +	cntvoff = timer_get_offset(vcpu_vtimer(vcpu));
> +	cntvoff += cntpoff - timer_get_offset(vcpu_ptimer(vcpu));
> +
> +	update_timer_offset(vcpu, TIMER_PTIMER, cntpoff, false);
> +	update_timer_offset(vcpu, TIMER_VTIMER, cntvoff, false);
> +	mutex_unlock(&kvm->lock);
> +}
> +
>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>  {
>  	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
> @@ -932,6 +960,29 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
>  	return (u64)-1;
>  }
>  
> +/**
> + * kvm_arm_timer_read_offset - returns the guest value of CNTVOFF_EL2.
> + * @vcpu: the vcpu pointer
> + *
> + * Computes the guest value of CNTVOFF_EL2 by subtracting the physical
> + * counter offset. Note that KVM defines CNTVOFF_EL2 as the offset from the
> + * guest's physical counter-timer, not the host's.
> + *
> + * Returns: the guest value for CNTVOFF_EL2
> + */
> +static u64 kvm_arm_timer_read_offset(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	u64 offset;
> +
> +	mutex_lock(&kvm->lock);
> +	offset = timer_get_offset(vcpu_vtimer(vcpu)) -
> +			timer_get_offset(vcpu_ptimer(vcpu));

nit: please keep this on a single line.

> +	mutex_unlock(&kvm->lock);
> +
> +	return offset;
> +}
> +
>  static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  			      struct arch_timer_context *timer,
>  			      enum kvm_arch_timer_regs treg)
> @@ -957,7 +1008,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  		break;
>  
>  	case TIMER_REG_OFFSET:
> -		val = timer_get_offset(timer);
> +		val = kvm_arm_timer_read_offset(vcpu);
>  		break;
>  
>  	default:
> @@ -1350,6 +1401,9 @@ void kvm_timer_init_vhe(void)
>  	val = read_sysreg(cnthctl_el2);
>  	val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
>  	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
> +
> +	if (cpus_have_const_cap(ARM64_ECV))
> +		val |= CNTHCTL_ECV;

I cannot immediately see where you are doing the equivalent enablement
of ECV on the nVHE path. Obviously, it has to be done eagerly from
EL2, together with the rest of the EL1 timer setup. Something like:

diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
index 9072e71693ba..999931fe55d2 100644
--- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
@@ -26,6 +26,8 @@ void __timer_disable_traps(struct kvm_vcpu *vcpu)
 	/* Allow physical timer/counter access for the host */
 	val = read_sysreg(cnthctl_el2);
 	val |= CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN;
+	if (cpus_have_final_cap(ARM64_ECV))
+		val |= CNTHCTL_ECV;
 	write_sysreg(val, cnthctl_el2);
 }
 
@@ -42,6 +44,8 @@ void __timer_enable_traps(struct kvm_vcpu *vcpu)
 	 * Physical counter access is allowed
 	 */
 	val = read_sysreg(cnthctl_el2);
+	if (cpus_have_final_cap(ARM64_ECV))
+		val &= ~CNTHCTL_ECV;
 	val &= ~CNTHCTL_EL1PCEN;
 	val |= CNTHCTL_EL1PCTEN;
 	write_sysreg(val, cnthctl_el2);

This will ensure that only the guest sees the physical offset.

>  	write_sysreg(val, cnthctl_el2);
>  }
>  
> @@ -1364,7 +1418,8 @@ static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
>  	}
>  }
>  
> -int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +static int kvm_arm_timer_set_attr_irq(struct kvm_vcpu *vcpu,
> +				      struct kvm_device_attr *attr)
>  {
>  	int __user *uaddr = (int __user *)(long)attr->addr;
>  	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> @@ -1397,7 +1452,37 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  	return 0;
>  }
>  
> -int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
> +					 struct kvm_device_attr *attr)
> +{
> +	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> +	u64 offset;
> +
> +	if (!cpus_have_const_cap(ARM64_ECV))
> +		return -ENXIO;
> +
> +	if (get_user(offset, uaddr))
> +		return -EFAULT;
> +
> +	update_ptimer_cntpoff(vcpu, offset);
> +	return 0;
> +}
> +
> +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +{
> +	switch (attr->attr) {
> +	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> +	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> +		return kvm_arm_timer_set_attr_irq(vcpu, attr);
> +	case KVM_ARM_VCPU_TIMER_OFFSET:
> +		return kvm_arm_timer_set_attr_offset(vcpu, attr);
> +	default:
> +		return -ENXIO;
> +	}
> +}
> +
> +static int kvm_arm_timer_get_attr_irq(struct kvm_vcpu *vcpu,
> +				      struct kvm_device_attr *attr)
>  {
>  	int __user *uaddr = (int __user *)(long)attr->addr;
>  	struct arch_timer_context *timer;
> @@ -1418,12 +1503,43 @@ int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  	return put_user(irq, uaddr);
>  }
>  
> +static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
> +					 struct kvm_device_attr *attr)
> +{
> +	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> +	u64 offset;
> +
> +	if (!cpus_have_const_cap(ARM64_ECV))
> +		return -ENXIO;
> +
> +	offset = timer_get_offset(vcpu_ptimer(vcpu));
> +	return put_user(offset, uaddr);
> +}
> +
> +int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu,
> +			   struct kvm_device_attr *attr)
> +{
> +	switch (attr->attr) {
> +	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> +	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> +		return kvm_arm_timer_get_attr_irq(vcpu, attr);
> +	case KVM_ARM_VCPU_TIMER_OFFSET:
> +		return kvm_arm_timer_get_attr_offset(vcpu, attr);
> +	default:
> +		return -ENXIO;
> +	}
> +}
> +
>  int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  {
>  	switch (attr->attr) {
>  	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
>  	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
>  		return 0;
> +	case KVM_ARM_VCPU_TIMER_OFFSET:
> +		if (cpus_have_const_cap(ARM64_ECV))
> +			return 0;
> +		break;
>  	}
>  
>  	return -ENXIO;
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 1632f001f4ed..cfa923df3af6 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -68,6 +68,11 @@ static void handle___kvm_timer_set_cntvoff(struct kvm_cpu_context *host_ctxt)
>  	__kvm_timer_set_cntvoff(cpu_reg(host_ctxt, 1));
>  }
>  
> +static void handle___kvm_timer_set_cntpoff(struct kvm_cpu_context *host_ctxt)
> +{
> +	__kvm_timer_set_cntpoff(cpu_reg(host_ctxt, 1));
> +}
> +
>  static void handle___kvm_enable_ssbs(struct kvm_cpu_context *host_ctxt)
>  {
>  	u64 tmp;
> @@ -197,6 +202,7 @@ static const hcall_t host_hcall[] = {
>  	HANDLE_FUNC(__pkvm_create_private_mapping),
>  	HANDLE_FUNC(__pkvm_prot_finalize),
>  	HANDLE_FUNC(__pkvm_mark_hyp),
> +	HANDLE_FUNC(__kvm_timer_set_cntpoff),
>  };
>  
>  static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
> diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> index 9072e71693ba..5b8b4cd02506 100644
> --- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> @@ -15,6 +15,11 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
>  	write_sysreg(cntvoff, cntvoff_el2);
>  }
>  
> +void __kvm_timer_set_cntpoff(u64 cntpoff)
> +{
> +	write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
> +}
> +
>  /*
>   * Should only be called on non-VHE systems.
>   * VHE systems use EL2 timers and configure EL1 timers in kvm_timer_init_vhe().
> diff --git a/arch/arm64/kvm/hyp/vhe/timer-sr.c b/arch/arm64/kvm/hyp/vhe/timer-sr.c
> index 4cda674a8be6..231e16a071a5 100644
> --- a/arch/arm64/kvm/hyp/vhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/vhe/timer-sr.c
> @@ -10,3 +10,8 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
>  {
>  	write_sysreg(cntvoff, cntvoff_el2);
>  }
> +
> +void __kvm_timer_set_cntpoff(u64 cntpoff)
> +{
> +	write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
> +}
> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 73c7139c866f..7252ffa3d675 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -21,6 +21,7 @@
>  #define CNTHCTL_EVNTEN			(1 << 2)
>  #define CNTHCTL_EVNTDIR			(1 << 3)
>  #define CNTHCTL_EVNTI			(0xF << 4)
> +#define CNTHCTL_ECV			(1 << 12)
>  
>  enum arch_timer_reg {
>  	ARCH_TIMER_REG_CTRL,

You also want to document that SCR_EL3.ECVEn has to be set to 1 for
this to work (see Documentation/arm64/booting.txt). And if it isn't,
the firmware better handle the CNTPOFF_EL2 traps correctly...

What firmware did you use for this? I think we need to update the boot
wrapper, but that's something that can be done in parallel.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
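
[Editor's note: a minimal VMM-side sketch tying the review above
together. The helper is hypothetical and assumes the semantics from the
documentation hunk (guest_cntpct = host_cntvct - offset).]

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int set_timer_offset(int vcpu_fd, uint64_t offset)
{
	struct kvm_device_attr attr = {
		.group = KVM_ARM_VCPU_TIMER_CTRL,
		.attr  = KVM_ARM_VCPU_TIMER_OFFSET,
		.addr  = (uint64_t)&offset,
	};

	/* per the docs, the value is broadcast to every vCPU in the VM */
	return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
}

/*
 * Across a migration the VMM computes the new offset as
 *   offset = host_cntvct_now - desired_guest_cntpct
 * and, because the control is an offset rather than a counter value,
 * any delay between computing it and KVM applying it is harmless.
 */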


* Re: [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems
  2021-08-04  8:58 ` [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems Oliver Upton
  2021-08-04 11:05   ` Andrew Jones
@ 2021-08-10 11:27   ` Marc Zyngier
  1 sibling, 0 replies; 51+ messages in thread
From: Marc Zyngier @ 2021-08-10 11:27 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, 04 Aug 2021 09:58:17 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> Unfortunately, ECV hasn't yet arrived in any tangible hardware. At the
> same time, controlling the guest view of the physical counter-timer is
> useful. Support guest counter-timer offsetting on non-ECV systems by
> trapping guest accesses to the physical counter-timer. Emulate reads of
> the physical counter in the fast exit path.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/arm64/include/asm/sysreg.h         |  1 +
>  arch/arm64/kvm/arch_timer.c             | 53 +++++++++++++++----------
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 29 ++++++++++++++
>  arch/arm64/kvm/hyp/nvhe/timer-sr.c      | 11 ++++-
>  4 files changed, 70 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index c34672aa65b9..e49790ae5da4 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -505,6 +505,7 @@
>  #define SYS_AMEVCNTR0_MEM_STALL		SYS_AMEVCNTR0_EL0(3)
>  
>  #define SYS_CNTFRQ_EL0			sys_reg(3, 3, 14, 0, 0)
> +#define SYS_CNTPCT_EL0			sys_reg(3, 3, 14, 0, 1)
>  
>  #define SYS_CNTP_TVAL_EL0		sys_reg(3, 3, 14, 2, 0)
>  #define SYS_CNTP_CTL_EL0		sys_reg(3, 3, 14, 2, 1)
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 9ead94aa867d..b7cb63acf2a0 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -51,7 +51,7 @@ static void kvm_arm_timer_write(struct kvm_vcpu *vcpu,
>  static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
>  			      struct arch_timer_context *timer,
>  			      enum kvm_arch_timer_regs treg);
> -static void kvm_timer_enable_traps_vhe(void);
> +static void kvm_timer_enable_traps_vhe(struct kvm_vcpu *vcpu);
>  
>  u32 timer_get_ctl(struct arch_timer_context *ctxt)
>  {
> @@ -175,6 +175,12 @@ static void timer_set_guest_offset(struct arch_timer_context *ctxt, u64 offset)
>  	}
>  }
>  
> +static bool ptimer_emulation_required(struct kvm_vcpu *vcpu)
> +{
> +	return timer_get_offset(vcpu_ptimer(vcpu)) &&
> +			!cpus_have_const_cap(ARM64_ECV);

What Andrew said! :-)

> +}
> +
>  u64 kvm_phys_timer_read(void)
>  {
>  	return timecounter->cc->read(timecounter->cc);
> @@ -184,8 +190,13 @@ static void get_timer_map(struct kvm_vcpu *vcpu, struct timer_map *map)
>  {
>  	if (has_vhe()) {
>  		map->direct_vtimer = vcpu_vtimer(vcpu);
> -		map->direct_ptimer = vcpu_ptimer(vcpu);
> -		map->emul_ptimer = NULL;
> +		if (!ptimer_emulation_required(vcpu)) {
> +			map->direct_ptimer = vcpu_ptimer(vcpu);
> +			map->emul_ptimer = NULL;
> +		} else {
> +			map->direct_ptimer = NULL;
> +			map->emul_ptimer = vcpu_ptimer(vcpu);
> +		}
>  	} else {
>  		map->direct_vtimer = vcpu_vtimer(vcpu);
>  		map->direct_ptimer = NULL;
> @@ -671,7 +682,7 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>  		timer_emulate(map.emul_ptimer);
>  
>  	if (has_vhe())
> -		kvm_timer_enable_traps_vhe();
> +		kvm_timer_enable_traps_vhe(vcpu);
>  }
>  
>  bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
> @@ -1392,22 +1403,29 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>   * The host kernel runs at EL2 with HCR_EL2.TGE == 1,
>   * and this makes those bits have no effect for the host kernel execution.
>   */
> -static void kvm_timer_enable_traps_vhe(void)
> +static void kvm_timer_enable_traps_vhe(struct kvm_vcpu *vcpu)
>  {
>  	/* When HCR_EL2.E2H ==1, EL1PCEN and EL1PCTEN are shifted by 10 */
>  	u32 cnthctl_shift = 10;
> -	u64 val;
> +	u64 val, mask;
> +
> +	mask = CNTHCTL_EL1PCEN << cnthctl_shift;
> +	mask |= CNTHCTL_EL1PCTEN << cnthctl_shift;
>  
> -	/*
> -	 * VHE systems allow the guest direct access to the EL1 physical
> -	 * timer/counter.
> -	 */
>  	val = read_sysreg(cnthctl_el2);
> -	val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
> -	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
>  
>  	if (cpus_have_const_cap(ARM64_ECV))
>  		val |= CNTHCTL_ECV;
> +
> +	/*
> +	 * VHE systems allow the guest direct access to the EL1 physical
> +	 * timer/counter if offsetting isn't requested on a non-ECV system.
> +	 */
> +	if (ptimer_emulation_required(vcpu))
> +		val &= ~mask;
> +	else
> +		val |= mask;
> +
>  	write_sysreg(val, cnthctl_el2);
>  }
>  
> @@ -1462,9 +1480,6 @@ static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
>  	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
>  	u64 offset;
>  
> -	if (!cpus_have_const_cap(ARM64_ECV))
> -		return -ENXIO;
> -
>  	if (get_user(offset, uaddr))
>  		return -EFAULT;
>  
> @@ -1513,9 +1528,6 @@ static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
>  	u64 __user *uaddr = (u64 __user *)(long)attr->addr;
>  	u64 offset;
>  
> -	if (!cpus_have_const_cap(ARM64_ECV))
> -		return -ENXIO;
> -
>  	offset = timer_get_offset(vcpu_ptimer(vcpu));
>  	return put_user(offset, uaddr);
>  }
> @@ -1539,11 +1551,8 @@ int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  	switch (attr->attr) {
>  	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
>  	case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> -		return 0;
>  	case KVM_ARM_VCPU_TIMER_OFFSET:
> -		if (cpus_have_const_cap(ARM64_ECV))
> -			return 0;
> -		break;
> +		return 0;
>  	}
>  
>  	return -ENXIO;
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index e4a2f295a394..abd3813a709e 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -15,6 +15,7 @@
>  #include <linux/jump_label.h>
>  #include <uapi/linux/psci.h>
>  
> +#include <kvm/arm_arch_timer.h>
>  #include <kvm/arm_psci.h>
>  
>  #include <asm/barrier.h>
> @@ -405,6 +406,31 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
>  	return true;
>  }
>  
> +static inline u64 __timer_read_cntpct(struct kvm_vcpu *vcpu)
> +{
> +	return __arch_counter_get_cntpct() - vcpu_ptimer(vcpu)->host_offset;
> +}
> +
> +static inline bool __hyp_handle_counter(struct kvm_vcpu *vcpu)
> +{
> +	u32 sysreg;
> +	int rt;
> +	u64 rv;

You could start with a

	if (cpus_have_final_cap(ARM64_ECV))
		return false;

which will speed-up the exit on ECV-capable systems.

> +
> +	if (kvm_vcpu_trap_get_class(vcpu) != ESR_ELx_EC_SYS64)
> +		return false;
> +
> +	sysreg = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu));
> +	if (sysreg != SYS_CNTPCT_EL0)
> +		return false;

You also want to check for CNTPCTSS_EL0 which will also be caught by
this trap.

> +
> +	rt = kvm_vcpu_sys_get_rt(vcpu);
> +	rv = __timer_read_cntpct(vcpu);
> +	vcpu_set_reg(vcpu, rt, rv);
> +	__kvm_skip_instr(vcpu);
> +	return true;
> +}
> +
>  /*
>   * Return true when we were able to fixup the guest exit and should return to
>   * the guest, false when we should restore the host state and return to the
> @@ -439,6 +465,9 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>  	if (*exit_code != ARM_EXCEPTION_TRAP)
>  		goto exit;
>  
> +	if (__hyp_handle_counter(vcpu))
> +		goto guest;
> +
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
>  	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
>  	    handle_tx2_tvm(vcpu))
> diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> index 5b8b4cd02506..67236c2e0ba7 100644
> --- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> @@ -44,10 +44,17 @@ void __timer_enable_traps(struct kvm_vcpu *vcpu)
>  
>  	/*
>  	 * Disallow physical timer access for the guest
> -	 * Physical counter access is allowed
>  	 */
>  	val = read_sysreg(cnthctl_el2);
>  	val &= ~CNTHCTL_EL1PCEN;
> -	val |= CNTHCTL_EL1PCTEN;
> +
> +	/*
> +	 * Disallow physical counter access for the guest if offsetting is
> +	 * requested on a non-ECV system.
> +	 */
> +	if (vcpu_ptimer(vcpu)->host_offset && !cpus_have_const_cap(ARM64_ECV))
> +		val &= ~CNTHCTL_EL1PCTEN;
> +	else
> +		val |= CNTHCTL_EL1PCTEN;
>  	write_sysreg(val, cnthctl_el2);
>  }

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
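
[Editor's note: a sketch of __hyp_handle_counter() with both review
suggestions folded in -- the early bail-out on ECV hardware and the
self-synchronized counter. The SYS_CNTPCTSS_EL0 encoding, assumed here
to be sys_reg(3, 3, 14, 0, 5), was not part of the posted series.]

static inline bool __hyp_handle_counter(struct kvm_vcpu *vcpu)
{
	u32 sysreg;
	int rt;

	/* ECV hardware applies CNTPOFF_EL2 itself; nothing to emulate */
	if (cpus_have_final_cap(ARM64_ECV))
		return false;

	if (kvm_vcpu_trap_get_class(vcpu) != ESR_ELx_EC_SYS64)
		return false;

	sysreg = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu));
	if (sysreg != SYS_CNTPCT_EL0 && sysreg != SYS_CNTPCTSS_EL0)
		return false;

	rt = kvm_vcpu_sys_get_rt(vcpu);
	vcpu_set_reg(vcpu, rt, __timer_read_cntpct(vcpu));
	__kvm_skip_instr(vcpu);
	return true;
}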


* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-10  0:04     ` Oliver Upton
@ 2021-08-10 12:30       ` Marc Zyngier
  0 siblings, 0 replies; 51+ messages in thread
From: Marc Zyngier @ 2021-08-10 12:30 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Tue, 10 Aug 2021 01:04:38 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> On Wed, Aug 4, 2021 at 3:03 PM Oliver Upton <oupton@google.com> wrote:
> >
> > On Wed, Aug 4, 2021 at 4:05 AM Oliver Upton <oupton@google.com> wrote:
> > >
> > > On Wed, Aug 4, 2021 at 1:58 AM Oliver Upton <oupton@google.com> wrote:
> > > >
> > > > KVM's current means of saving/restoring system counters is plagued with
> > > > temporal issues. At least on ARM64 and x86, we migrate the guest's
> > > > system counter by-value through the respective guest system register
> > > > values (cntvct_el0, ia32_tsc). Restoring system counters by-value is
> > > > brittle as the state is not idempotent: the host system counter is still
> > > > oscillating between the attempted save and restore. Furthermore, VMMs
> > > > may wish to transparently live migrate guest VMs, meaning that they
> > > > include the elapsed time due to live migration blackout in the guest
> > > > system counter view. The VMM thread could be preempted for any number of
> > > > reasons (scheduler, L0 hypervisor under nested) between the time that
> > > > it calculates the desired guest counter value and when KVM actually sets
> > > > this counter state.
> > > >
> > > > Despite the value-based interface that we present to userspace, KVM
> > > > actually has idempotent guest controls by way of system counter offsets.
> > > > We can avoid all of the issues associated with a value-based interface
> > > > by abstracting these offset controls in new ioctls. This series
> > > > introduces new vCPU device attributes to provide userspace access to the
> > > > vCPU's system counter offset.
> > > >
> > > > Patch 1 addresses a possible race in KVM_GET_CLOCK where
> > > > use_master_clock is read outside of the pvclock_gtod_sync_lock.
> > > >
> > > > Patch 2 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK
> > > > ioctls to provide userspace with a (host_tsc, realtime) instant. This is
> > > > essential for a VMM to perform precise migration of the guest's system
> > > > counters.
> > > >
> > > > Patches 3-4 are some preparatory changes for exposing the TSC offset to
> > > > userspace. Patch 5 provides a vCPU attribute to provide userspace access
> > > > to the TSC offset.
> > > >
> > > > Patches 6-7 implement a test for the new additions to
> > > > KVM_{GET,SET}_CLOCK.
> > > >
> > > > Patch 8 fixes some assertions in the kvm device attribute helpers.
> > > >
> > > > Patches 9-10 implement a test for the tsc offset attribute introduced in
> > > > patch 5.
> 
> Paolo,
> 
> Is there anything else you're waiting to see for the x86 portion of
> this series after addressing Sean's comments? There's some work
> remaining on the arm64 side, though I believe the two architectures
> are now disjoint for this series.

I think at this stage it would make sense to split the series in two
and drive them independently.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
  2021-08-10 10:56   ` Marc Zyngier
@ 2021-08-10 17:55     ` Oliver Upton
  2021-08-11  9:01       ` Marc Zyngier
  0 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-10 17:55 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Tue, Aug 10, 2021 at 3:56 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 04 Aug 2021 09:58:15 +0100,
> Oliver Upton <oupton@google.com> wrote:
> >
> > Presently, KVM provides no facilities for correctly migrating a guest
> > that depends on the physical counter-timer. Whie most guests (barring
>
> nit: While

Ack.

>
> > NV, of course) should not depend on the physical counter-timer, an
> > operator may wish to provide a consistent view of the physical
> > counter-timer across migrations.
> >
> > Provide userspace with a new vCPU attribute to modify the guest
> > counter-timer offset. Unlike KVM_REG_ARM_TIMER_OFFSET, this attribute is
> > hidden from the guest's architectural state. The value offsets *both*
> > the virtual and physical counter-timer views for the guest. Only support
> > this attribute on ECV systems as ECV is required for hardware offsetting
> > of the physical counter-timer.
> >
> > Signed-off-by: Oliver Upton <oupton@google.com>
> > ---
> >  Documentation/virt/kvm/devices/vcpu.rst |  28 ++++++
> >  arch/arm64/include/asm/kvm_asm.h        |   2 +
> >  arch/arm64/include/asm/sysreg.h         |   2 +
> >  arch/arm64/include/uapi/asm/kvm.h       |   1 +
> >  arch/arm64/kvm/arch_timer.c             | 122 +++++++++++++++++++++++-
> >  arch/arm64/kvm/hyp/nvhe/hyp-main.c      |   6 ++
> >  arch/arm64/kvm/hyp/nvhe/timer-sr.c      |   5 +
> >  arch/arm64/kvm/hyp/vhe/timer-sr.c       |   5 +
> >  include/clocksource/arm_arch_timer.h    |   1 +
> >  9 files changed, 169 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
> > index 3b399d727c11..3ba35b9d9d03 100644
> > --- a/Documentation/virt/kvm/devices/vcpu.rst
> > +++ b/Documentation/virt/kvm/devices/vcpu.rst
> > @@ -139,6 +139,34 @@ configured values on other VCPUs.  Userspace should configure the interrupt
> >  numbers on at least one VCPU after creating all VCPUs and before running any
> >  VCPUs.
> >
> > +2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_OFFSET
> > +-----------------------------------------
> > +
> > +:Parameters: in kvm_device_attr.addr the address for the timer offset is a
> > +             pointer to a __u64
> > +
> > +Returns:
> > +
> > +      ======= ==================================
> > +      -EFAULT Error reading/writing the provided
> > +              parameter address
> > +      -ENXIO  Timer offsetting not implemented
> > +      ======= ==================================
> > +
> > +Specifies the guest's counter-timer offset from the host's virtual counter.
> > +The guest's physical counter value is then derived by the following
> > +equation:
> > +
> > +  guest_cntpct = host_cntvct - KVM_ARM_VCPU_TIMER_OFFSET
> > +
> > +The guest's virtual counter value is derived by the following equation:
> > +
> > +  guest_cntvct = host_cntvct - KVM_REG_ARM_TIMER_OFFSET
> > +                     - KVM_ARM_VCPU_TIMER_OFFSET
> > +
> > +KVM does not allow the use of varying offset values for different vCPUs;
> > +the last written offset value will be broadcasted to all vCPUs in a VM.
> > +
> >  3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
> >  ==================================
> >
> > diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> > index 9f0bf2109be7..ab1c8fdb0177 100644
> > --- a/arch/arm64/include/asm/kvm_asm.h
> > +++ b/arch/arm64/include/asm/kvm_asm.h
> > @@ -65,6 +65,7 @@
> >  #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize           19
> >  #define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp                        20
> >  #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc                        21
> > +#define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntpoff                22
> >
> >  #ifndef __ASSEMBLY__
> >
> > @@ -200,6 +201,7 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
> >  extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu);
> >
> >  extern void __kvm_timer_set_cntvoff(u64 cntvoff);
> > +extern void __kvm_timer_set_cntpoff(u64 cntpoff);
> >
> >  extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
> >
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index 4dfc44066dfb..c34672aa65b9 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -586,6 +586,8 @@
> >  #define SYS_ICH_LR14_EL2             __SYS__LR8_EL2(6)
> >  #define SYS_ICH_LR15_EL2             __SYS__LR8_EL2(7)
> >
> > +#define SYS_CNTPOFF_EL2                      sys_reg(3, 4, 14, 0, 6)
> > +
> >  /* VHE encodings for architectural EL0/1 system registers */
> >  #define SYS_SCTLR_EL12                       sys_reg(3, 5, 1, 0, 0)
> >  #define SYS_CPACR_EL12                       sys_reg(3, 5, 1, 0, 2)
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 949a31bc10f0..15150f8224a1 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -366,6 +366,7 @@ struct kvm_arm_copy_mte_tags {
> >  #define KVM_ARM_VCPU_TIMER_CTRL              1
> >  #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER              0
> >  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER              1
> > +#define   KVM_ARM_VCPU_TIMER_OFFSET          2
> >  #define KVM_ARM_VCPU_PVTIME_CTRL     2
> >  #define   KVM_ARM_VCPU_PVTIME_IPA    0
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index a8815b09da3e..f15058612994 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -85,11 +85,15 @@ u64 timer_get_cval(struct arch_timer_context *ctxt)
> >  static u64 timer_get_offset(struct arch_timer_context *ctxt)
> >  {
> >       struct kvm_vcpu *vcpu = ctxt->vcpu;
> > +     struct arch_timer_cpu *timer = vcpu_timer(vcpu);
>
> Unused variable?

Yep!

> >
> >       switch(arch_timer_ctx_index(ctxt)) {
> >       case TIMER_VTIMER:
> > +     case TIMER_PTIMER:
> >               return ctxt->host_offset;
> >       default:
> > +             WARN_ONCE(1, "unrecognized timer %ld\n",
> > +                       arch_timer_ctx_index(ctxt));
> >               return 0;
> >       }
> >  }
> > @@ -144,6 +148,7 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
> >
> >       switch(arch_timer_ctx_index(ctxt)) {
> >       case TIMER_VTIMER:
> > +     case TIMER_PTIMER:
> >               ctxt->host_offset = offset;
> >               break;
> >       default:
> > @@ -572,6 +577,11 @@ static void set_cntvoff(u64 cntvoff)
> >       kvm_call_hyp(__kvm_timer_set_cntvoff, cntvoff);
> >  }
> >
> > +static void set_cntpoff(u64 cntpoff)
> > +{
> > +     kvm_call_hyp(__kvm_timer_set_cntpoff, cntpoff);
> > +}
> > +
> >  static inline void set_timer_irq_phys_active(struct arch_timer_context *ctx, bool active)
> >  {
> >       int r;
> > @@ -647,6 +657,8 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
> >       }
> >
> >       set_cntvoff(timer_get_offset(map.direct_vtimer));
> > +     if (cpus_have_const_cap(ARM64_ECV))
>
> This really should be a final cap instead (same for all the other use
> cases).

Ack.

> > +             set_cntpoff(timer_get_offset(vcpu_ptimer(vcpu)));
>
> However, tripping to EL2 for each offset on nVHE may prove to be an
> unnecessary overhead. Not a problem for now anyway.
>
> >
> >       kvm_timer_unblocking(vcpu);
> >
> > @@ -814,6 +826,22 @@ static void update_vtimer_cntvoff(struct kvm_vcpu *vcpu, u64 cntvoff)
> >       mutex_unlock(&kvm->lock);
> >  }
> >
> > +static void update_ptimer_cntpoff(struct kvm_vcpu *vcpu, u64 cntpoff)
> > +{
> > +     struct kvm *kvm = vcpu->kvm;
> > +     u64 cntvoff;
> > +
> > +     mutex_lock(&kvm->lock);
> > +
> > +     /* adjustments to the physical offset also affect vtimer */
> > +     cntvoff = timer_get_offset(vcpu_vtimer(vcpu));
> > +     cntvoff += cntpoff - timer_get_offset(vcpu_ptimer(vcpu));
> > +
> > +     update_timer_offset(vcpu, TIMER_PTIMER, cntpoff, false);
> > +     update_timer_offset(vcpu, TIMER_VTIMER, cntvoff, false);
> > +     mutex_unlock(&kvm->lock);
> > +}
> > +
> >  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
> >  {
> >       struct arch_timer_cpu *timer = vcpu_timer(vcpu);
> > @@ -932,6 +960,29 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *vcpu, u64 regid)
> >       return (u64)-1;
> >  }
> >
> > +/**
> > + * kvm_arm_timer_read_offset - returns the guest value of CNTVOFF_EL2.
> > + * @vcpu: the vcpu pointer
> > + *
> > + * Computes the guest value of CNTVOFF_EL2 by subtracting the physical
> > + * counter offset. Note that KVM defines CNTVOFF_EL2 as the offset from the
> > + * guest's physical counter-timer, not the host's.
> > + *
> > + * Returns: the guest value for CNTVOFF_EL2
> > + */
> > +static u64 kvm_arm_timer_read_offset(struct kvm_vcpu *vcpu)
> > +{
> > +     struct kvm *kvm = vcpu->kvm;
> > +     u64 offset;
> > +
> > +     mutex_lock(&kvm->lock);
> > +     offset = timer_get_offset(vcpu_vtimer(vcpu)) -
> > +                     timer_get_offset(vcpu_ptimer(vcpu));
>
> nit: please keep this on a single line.
>
> > +     mutex_unlock(&kvm->lock);
> > +
> > +     return offset;
> > +}
> > +
> >  static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
> >                             struct arch_timer_context *timer,
> >                             enum kvm_arch_timer_regs treg)
> > @@ -957,7 +1008,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
> >               break;
> >
> >       case TIMER_REG_OFFSET:
> > -             val = timer_get_offset(timer);
> > +             val = kvm_arm_timer_read_offset(vcpu);
> >               break;
> >
> >       default:
> > @@ -1350,6 +1401,9 @@ void kvm_timer_init_vhe(void)
> >       val = read_sysreg(cnthctl_el2);
> >       val |= (CNTHCTL_EL1PCEN << cnthctl_shift);
> >       val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
> > +
> > +     if (cpus_have_const_cap(ARM64_ECV))
> > +             val |= CNTHCTL_ECV;
>
> I cannot immediately see where you are doing the equivalent enablement
> of ECV on the nVHE path. Obviously, it has to be done eagerly from
> EL2, together with the rest of the EL1 timer setup. Something like:

Yep, I had caught this when I was actually able to run the Base FVP. I
have a fix (same as yours, basically) but held back until you reviewed
to avoid storming your inbox :)

> diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> index 9072e71693ba..999931fe55d2 100644
> --- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> @@ -26,6 +26,8 @@ void __timer_disable_traps(struct kvm_vcpu *vcpu)
>         /* Allow physical timer/counter access for the host */
>         val = read_sysreg(cnthctl_el2);
>         val |= CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN;
> +       if (cpus_have_final_cap(ARM64_ECV))
> +               val |= CNTHCTL_ECV;
>         write_sysreg(val, cnthctl_el2);
>  }
>
> @@ -42,6 +44,8 @@ void __timer_enable_traps(struct kvm_vcpu *vcpu)
>          * Physical counter access is allowed
>          */
>         val = read_sysreg(cnthctl_el2);
> +       if (cpus_have_final_cap(ARM64_ECV))
> +               val &= ~CNTHCTL_ECV;
>         val &= ~CNTHCTL_EL1PCEN;
>         val |= CNTHCTL_EL1PCTEN;
>         write_sysreg(val, cnthctl_el2);
>
> This will ensure that only the guest sees the physical offset.
>
> >       write_sysreg(val, cnthctl_el2);
> >  }
> >
> > @@ -1364,7 +1418,8 @@ static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
> >       }
> >  }
> >
> > -int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > +static int kvm_arm_timer_set_attr_irq(struct kvm_vcpu *vcpu,
> > +                                   struct kvm_device_attr *attr)
> >  {
> >       int __user *uaddr = (int __user *)(long)attr->addr;
> >       struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> > @@ -1397,7 +1452,37 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >       return 0;
> >  }
> >
> > -int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > +static int kvm_arm_timer_set_attr_offset(struct kvm_vcpu *vcpu,
> > +                                      struct kvm_device_attr *attr)
> > +{
> > +     u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> > +     u64 offset;
> > +
> > +     if (!cpus_have_const_cap(ARM64_ECV))
> > +             return -ENXIO;
> > +
> > +     if (get_user(offset, uaddr))
> > +             return -EFAULT;
> > +
> > +     update_ptimer_cntpoff(vcpu, offset);
> > +     return 0;
> > +}
> > +
> > +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > +{
> > +     switch (attr->attr) {
> > +     case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> > +     case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> > +             return kvm_arm_timer_set_attr_irq(vcpu, attr);
> > +     case KVM_ARM_VCPU_TIMER_OFFSET:
> > +             return kvm_arm_timer_set_attr_offset(vcpu, attr);
> > +     default:
> > +             return -ENXIO;
> > +     }
> > +}
> > +
> > +static int kvm_arm_timer_get_attr_irq(struct kvm_vcpu *vcpu,
> > +                                   struct kvm_device_attr *attr)
> >  {
> >       int __user *uaddr = (int __user *)(long)attr->addr;
> >       struct arch_timer_context *timer;
> > @@ -1418,12 +1503,43 @@ int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >       return put_user(irq, uaddr);
> >  }
> >
> > +static int kvm_arm_timer_get_attr_offset(struct kvm_vcpu *vcpu,
> > +                                      struct kvm_device_attr *attr)
> > +{
> > +     u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> > +     u64 offset;
> > +
> > +     if (!cpus_have_const_cap(ARM64_ECV))
> > +             return -ENXIO;
> > +
> > +     offset = timer_get_offset(vcpu_ptimer(vcpu));
> > +     return put_user(offset, uaddr);
> > +}
> > +
> > +int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu,
> > +                        struct kvm_device_attr *attr)
> > +{
> > +     switch (attr->attr) {
> > +     case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> > +     case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> > +             return kvm_arm_timer_get_attr_irq(vcpu, attr);
> > +     case KVM_ARM_VCPU_TIMER_OFFSET:
> > +             return kvm_arm_timer_get_attr_offset(vcpu, attr);
> > +     default:
> > +             return -ENXIO;
> > +     }
> > +}
> > +
> >  int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >  {
> >       switch (attr->attr) {
> >       case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> >       case KVM_ARM_VCPU_TIMER_IRQ_PTIMER:
> >               return 0;
> > +     case KVM_ARM_VCPU_TIMER_OFFSET:
> > +             if (cpus_have_const_cap(ARM64_ECV))
> > +                     return 0;
> > +             break;
> >       }
> >
> >       return -ENXIO;
> > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > index 1632f001f4ed..cfa923df3af6 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > @@ -68,6 +68,11 @@ static void handle___kvm_timer_set_cntvoff(struct kvm_cpu_context *host_ctxt)
> >       __kvm_timer_set_cntvoff(cpu_reg(host_ctxt, 1));
> >  }
> >
> > +static void handle___kvm_timer_set_cntpoff(struct kvm_cpu_context *host_ctxt)
> > +{
> > +     __kvm_timer_set_cntpoff(cpu_reg(host_ctxt, 1));
> > +}
> > +
> >  static void handle___kvm_enable_ssbs(struct kvm_cpu_context *host_ctxt)
> >  {
> >       u64 tmp;
> > @@ -197,6 +202,7 @@ static const hcall_t host_hcall[] = {
> >       HANDLE_FUNC(__pkvm_create_private_mapping),
> >       HANDLE_FUNC(__pkvm_prot_finalize),
> >       HANDLE_FUNC(__pkvm_mark_hyp),
> > +     HANDLE_FUNC(__kvm_timer_set_cntpoff),
> >  };
> >
> >  static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> > index 9072e71693ba..5b8b4cd02506 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
> > @@ -15,6 +15,11 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
> >       write_sysreg(cntvoff, cntvoff_el2);
> >  }
> >
> > +void __kvm_timer_set_cntpoff(u64 cntpoff)
> > +{
> > +     write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
> > +}
> > +
> >  /*
> >   * Should only be called on non-VHE systems.
> >   * VHE systems use EL2 timers and configure EL1 timers in kvm_timer_init_vhe().
> > diff --git a/arch/arm64/kvm/hyp/vhe/timer-sr.c b/arch/arm64/kvm/hyp/vhe/timer-sr.c
> > index 4cda674a8be6..231e16a071a5 100644
> > --- a/arch/arm64/kvm/hyp/vhe/timer-sr.c
> > +++ b/arch/arm64/kvm/hyp/vhe/timer-sr.c
> > @@ -10,3 +10,8 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
> >  {
> >       write_sysreg(cntvoff, cntvoff_el2);
> >  }
> > +
> > +void __kvm_timer_set_cntpoff(u64 cntpoff)
> > +{
> > +     write_sysreg_s(cntpoff, SYS_CNTPOFF_EL2);
> > +}
> > diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> > index 73c7139c866f..7252ffa3d675 100644
> > --- a/include/clocksource/arm_arch_timer.h
> > +++ b/include/clocksource/arm_arch_timer.h
> > @@ -21,6 +21,7 @@
> >  #define CNTHCTL_EVNTEN                       (1 << 2)
> >  #define CNTHCTL_EVNTDIR                      (1 << 3)
> >  #define CNTHCTL_EVNTI                        (0xF << 4)
> > +#define CNTHCTL_ECV                  (1 << 12)
> >
> >  enum arch_timer_reg {
> >       ARCH_TIMER_REG_CTRL,
>
> You also want to document that SCR_EL3.ECVEn has to be set to 1 for
> this to work (see Documentation/arm64/booting.txt). And if it isn't,
> the firmware better handle the CNTPOFF_EL2 traps correctly...

I'll grab the popcorn now ;-) Adding docs for this, good idea.

> What firmware did you use for this? I think we need to update the boot
> wrapper, but that's something that can be done in parallel.

I had actually just done a direct boot from ARM-TF -> Linux, nothing
else in between.

--
Thanks,
Oliver
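
[Editor's note: a worked example (made-up numbers) of the
update_ptimer_cntpoff() arithmetic from this sub-thread, showing why a
physical-offset write must also move the vtimer offset:]

/*
 * Before: cntpoff = 100, cntvoff = 140
 *         guest CNTVOFF_EL2 view = cntvoff - cntpoff = 40
 *
 * Userspace writes KVM_ARM_VCPU_TIMER_OFFSET = 300:
 *         cntvoff += 300 - 100, so cntvoff = 340
 *
 * After:  guest CNTVOFF_EL2 view = 340 - 300 = 40, unchanged
 *
 * The host-side offsets move together, while the architectural
 * CNTVOFF_EL2 value that kvm_arm_timer_read_offset() reports is
 * preserved.
 */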


* Re: [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
  2021-08-10 17:55     ` Oliver Upton
@ 2021-08-11  9:01       ` Marc Zyngier
  0 siblings, 0 replies; 51+ messages in thread
From: Marc Zyngier @ 2021-08-11  9:01 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Tue, 10 Aug 2021 18:55:12 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> On Tue, Aug 10, 2021 at 3:56 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Wed, 04 Aug 2021 09:58:15 +0100,
> > Oliver Upton <oupton@google.com> wrote:

[...]

> > > diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> > > index 73c7139c866f..7252ffa3d675 100644
> > > --- a/include/clocksource/arm_arch_timer.h
> > > +++ b/include/clocksource/arm_arch_timer.h
> > > @@ -21,6 +21,7 @@
> > >  #define CNTHCTL_EVNTEN                       (1 << 2)
> > >  #define CNTHCTL_EVNTDIR                      (1 << 3)
> > >  #define CNTHCTL_EVNTI                        (0xF << 4)
> > > +#define CNTHCTL_ECV                  (1 << 12)
> > >
> > >  enum arch_timer_reg {
> > >       ARCH_TIMER_REG_CTRL,
> >
> > You also want to document that SCR_EL3.ECVEn has to be set to 1 for
> > this to work (see Documentation/arm64/booting.txt). And if it isn't,
> > the firmware better handle the CNTPOFF_EL2 traps correctly...
> 
> I'll grab the popcorn now ;-) Adding docs for this, good idea.
> 
> > What firmware did you use for this? I think we need to update the boot
> > wrapper, but that's something that can be done in parallel.
> 
> I had actually just done a direct boot from ARM-TF -> Linux, nothing
> else in between.

Ah, right. I tend to use the boot-wrapper[1] to build a single binary
that contains the 'boot loader', DT and kernel. Using ATF is probably
more representative of the final thing, but the boot-wrapper is dead
easy to hack on...

Thanks,

	M.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/boot-wrapper-aarch64.git

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK
  2021-08-04  8:57 ` [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK Oliver Upton
@ 2021-08-11 12:23   ` Paolo Bonzini
  2021-08-13 10:39     ` Oliver Upton
  0 siblings, 1 reply; 51+ messages in thread
From: Paolo Bonzini @ 2021-08-11 12:23 UTC (permalink / raw)
  To: Oliver Upton, kvm, kvmarm
  Cc: Sean Christopherson, Marc Zyngier, Peter Shier, Jim Mattson,
	David Matlack, Ricardo Koller, Jing Zhang, Raghavendra Rao Anata,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	linux-arm-kernel, Andrew Jones, Will Deacon, Catalin Marinas

On 04/08/21 10:57, Oliver Upton wrote:
> Sean noticed that KVM_GET_CLOCK was checking kvm_arch.use_master_clock
> outside of the pvclock sync lock. This is problematic, as the clock
> value written to the user may or may not actually correspond to a stable
> TSC.
> 
> Fix the race by populating the entire kvm_clock_data structure behind
> the pvclock_gtod_sync_lock.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>   arch/x86/kvm/x86.c | 39 ++++++++++++++++++++++++++++-----------
>   1 file changed, 28 insertions(+), 11 deletions(-)
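
(In other words, the fix samples both fields inside a single critical
section. A minimal sketch of the pattern, with a hypothetical helper name
and the clock computation reduced to the simple path; the real hunk in
x86.c differs in detail:)

static void kvm_get_clock_data(struct kvm *kvm, struct kvm_clock_data *data)
{
	unsigned long flags;

	spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags);
	/* ->clock and ->flags now describe the same instant */
	data->clock = get_kvmclock_base_ns() + kvm->arch.kvmclock_offset;
	data->flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0;
	spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags);
}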

I had a completely independent patch that fixed the same race.  It unifies
the read sides of tsc_write_lock and pvclock_gtod_sync_lock into a seqcount
(and replaces pvclock_gtod_sync_lock with kvm->lock on the write side).

I attach it now (based on https://lore.kernel.org/kvm/20210811102356.3406687-1-pbonzini@redhat.com/T/#t),
but the testing was extremely light so I'm not sure I will be able to include
it in 5.15.

Paolo

-------------- 8< -------------
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 8 Apr 2021 05:03:44 -0400
Subject: [PATCH] kvm: x86: protect masterclock with a seqcount

Protect the reference point for kvmclock with a seqcount, so that
kvmclock updates for all vCPUs can proceed in parallel.  Xen runstate
updates will also run in parallel and not bounce the kvmclock cacheline.

This also makes it possible to use KVM_REQ_CLOCK_UPDATE (which will
block on the seqcount) to prevent entering the guest until
pvclock_update_vm_gtod_copy is complete, and thus to get rid of
KVM_REQ_MCLOCK_INPROGRESS.

nr_vcpus_matched_tsc is updated outside pvclock_update_vm_gtod_copy
though, so a spinlock must be kept for that one.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 8138201efb09..ed41da172e02 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -29,6 +29,8 @@ The acquisition orders for mutexes are as follows:
  
  On x86:
  
+- the seqcount kvm->arch.pvclock_sc is written under kvm->lock.
+
  - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
  
  - kvm->arch.mmu_lock is an rwlock.  kvm->arch.tdp_mmu_pages_lock is
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 20daaf67a5bf..6889aab92362 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -68,8 +68,7 @@
  #define KVM_REQ_PMI			KVM_ARCH_REQ(11)
  #define KVM_REQ_SMI			KVM_ARCH_REQ(12)
  #define KVM_REQ_MASTERCLOCK_UPDATE	KVM_ARCH_REQ(13)
-#define KVM_REQ_MCLOCK_INPROGRESS \
-	KVM_ARCH_REQ_FLAGS(14, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+/* 14 unused */
  #define KVM_REQ_SCAN_IOAPIC \
  	KVM_ARCH_REQ_FLAGS(15, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
  #define KVM_REQ_GLOBAL_CLOCK_UPDATE	KVM_ARCH_REQ(16)
@@ -1067,6 +1066,11 @@ struct kvm_arch {
  
  	unsigned long irq_sources_bitmap;
  	s64 kvmclock_offset;
+
+	/*
+	 * This also protects nr_vcpus_matched_tsc which is read from a
+	 * preemption-disabled region, so it must be a raw spinlock.
+	 */
  	raw_spinlock_t tsc_write_lock;
  	u64 last_tsc_nsec;
  	u64 last_tsc_write;
@@ -1077,7 +1081,7 @@ struct kvm_arch {
  	u64 cur_tsc_generation;
  	int nr_vcpus_matched_tsc;
  
-	spinlock_t pvclock_gtod_sync_lock;
+	seqcount_mutex_t pvclock_sc;
  	bool use_master_clock;
  	u64 master_kernel_ns;
  	u64 master_cycle_now;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 74145a3fd4f2..7d73c5560260 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2533,9 +2533,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
  	vcpu->arch.this_tsc_write = kvm->arch.cur_tsc_write;
  
  	kvm_vcpu_write_tsc_offset(vcpu, offset);
-	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
  
-	spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags);
  	if (!matched) {
  		kvm->arch.nr_vcpus_matched_tsc = 0;
  	} else if (!already_matched) {
@@ -2543,7 +2541,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
  	}
  
  	kvm_track_tsc_matching(vcpu);
-	spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags);
+	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
  }
  
  static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu,
@@ -2730,9 +2728,7 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
  	struct kvm_arch *ka = &kvm->arch;
  	int vclock_mode;
  	bool host_tsc_clocksource, vcpus_matched;
-
-	vcpus_matched = (ka->nr_vcpus_matched_tsc + 1 ==
-			atomic_read(&kvm->online_vcpus));
+	unsigned long flags;
  
  	/*
  	 * If the host uses TSC clock, then passthrough TSC as stable
@@ -2742,9 +2738,14 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
  					&ka->master_kernel_ns,
  					&ka->master_cycle_now);
  
+	raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
+	vcpus_matched = (ka->nr_vcpus_matched_tsc + 1 ==
+			atomic_read(&kvm->online_vcpus));
+
  	ka->use_master_clock = host_tsc_clocksource && vcpus_matched
  				&& !ka->backwards_tsc_observed
  				&& !ka->boot_vcpu_runs_old_kvmclock;
+	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
  
  	if (ka->use_master_clock)
  		atomic_set(&kvm_guest_has_master_clock, 1);
@@ -2758,25 +2759,26 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
  static void kvm_start_pvclock_update(struct kvm *kvm)
  {
  	struct kvm_arch *ka = &kvm->arch;
-	kvm_make_all_cpus_request(kvm, KVM_REQ_MCLOCK_INPROGRESS);
  
-	/* no guest entries from this point */
-	spin_lock_irq(&ka->pvclock_gtod_sync_lock);
+	mutex_lock(&kvm->lock);
+	/*
+	 * write_seqcount_begin disables preemption.  This is needed not just
+	 * to avoid livelock, but also because the preempt notifier is a reader
+	 * for ka->pvclock_sc.
+	 */
+	write_seqcount_begin(&ka->pvclock_sc);
+	kvm_make_all_cpus_request(kvm,
+		KVM_REQ_CLOCK_UPDATE | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP);
+
+	/* no guest entries from this point until write_seqcount_end */
  }
  
  static void kvm_end_pvclock_update(struct kvm *kvm)
  {
  	struct kvm_arch *ka = &kvm->arch;
-	struct kvm_vcpu *vcpu;
-	int i;
  
-	spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
-
-	/* guest entries allowed */
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		kvm_clear_request(KVM_REQ_MCLOCK_INPROGRESS, vcpu);
+	write_seqcount_end(&ka->pvclock_sc);
+	mutex_unlock(&kvm->lock);
  }
  
  static void kvm_update_masterclock(struct kvm *kvm)
@@ -2787,27 +2789,21 @@ static void kvm_update_masterclock(struct kvm *kvm)
  	kvm_end_pvclock_update(kvm);
  }
  
-u64 get_kvmclock_ns(struct kvm *kvm)
+static u64 __get_kvmclock_ns(struct kvm *kvm)
  {
  	struct kvm_arch *ka = &kvm->arch;
  	struct pvclock_vcpu_time_info hv_clock;
-	unsigned long flags;
  	u64 ret;
  
-	spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
-	if (!ka->use_master_clock) {
-		spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
+	if (!ka->use_master_clock)
  		return get_kvmclock_base_ns() + ka->kvmclock_offset;
-	}
-
-	hv_clock.tsc_timestamp = ka->master_cycle_now;
-	hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset;
-	spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
  
  	/* both __this_cpu_read() and rdtsc() should be on the same cpu */
  	get_cpu();
  
  	if (__this_cpu_read(cpu_tsc_khz)) {
+		hv_clock.tsc_timestamp = ka->master_cycle_now;
+		hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset;
  		kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
  				   &hv_clock.tsc_shift,
  				   &hv_clock.tsc_to_system_mul);
@@ -2816,6 +2812,19 @@ u64 get_kvmclock_ns(struct kvm *kvm)
  		ret = get_kvmclock_base_ns() + ka->kvmclock_offset;
  
  	put_cpu();
+	return ret;
+}
+
+u64 get_kvmclock_ns(struct kvm *kvm)
+{
+	struct kvm_arch *ka = &kvm->arch;
+	unsigned seq;
+	u64 ret;
+
+	do {
+		seq = read_seqcount_begin(&ka->pvclock_sc);
+		ret = __get_kvmclock_ns(kvm);
+	} while (read_seqcount_retry(&ka->pvclock_sc, seq));
  
  	return ret;
  }
@@ -2882,6 +2891,7 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v,
  static int kvm_guest_time_update(struct kvm_vcpu *v)
  {
  	unsigned long flags, tgt_tsc_khz;
+	unsigned seq;
  	struct kvm_vcpu_arch *vcpu = &v->arch;
  	struct kvm_arch *ka = &v->kvm->arch;
  	s64 kernel_ns;
@@ -2896,13 +2906,14 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
  	 * If the host uses TSC clock, then passthrough TSC as stable
  	 * to the guest.
  	 */
-	spin_lock_irqsave(&ka->pvclock_gtod_sync_lock, flags);
-	use_master_clock = ka->use_master_clock;
-	if (use_master_clock) {
-		host_tsc = ka->master_cycle_now;
-		kernel_ns = ka->master_kernel_ns;
-	}
-	spin_unlock_irqrestore(&ka->pvclock_gtod_sync_lock, flags);
+	do {
+		seq = read_seqcount_begin(&ka->pvclock_sc);
+		use_master_clock = ka->use_master_clock;
+		if (use_master_clock) {
+			host_tsc = ka->master_cycle_now;
+			kernel_ns = ka->master_kernel_ns;
+		}
+	} while (read_seqcount_retry(&ka->pvclock_sc, seq));
  
  	/* Keep irq disabled to prevent changes to the clock */
  	local_irq_save(flags);
@@ -6098,11 +6109,13 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  	case KVM_GET_CLOCK: {
  		struct kvm_clock_data user_ns;
-		u64 now_ns;
+		unsigned seq;
  
-		now_ns = get_kvmclock_ns(kvm);
-		user_ns.clock = now_ns;
-		user_ns.flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0;
+		do {
+			seq = read_seqcount_begin(&kvm->arch.pvclock_sc);
+			user_ns.clock = __get_kvmclock_ns(kvm);
+			user_ns.flags = kvm->arch.use_master_clock ? KVM_CLOCK_TSC_STABLE : 0;
+		} while (read_seqcount_retry(&kvm->arch.pvclock_sc, seq));
  		memset(&user_ns.pad, 0, sizeof(user_ns.pad));
  
  		r = -EFAULT;
@@ -11144,8 +11157,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
  
  	raw_spin_lock_init(&kvm->arch.tsc_write_lock);
  	mutex_init(&kvm->arch.apic_map_lock);
-	spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock);
-
+	seqcount_mutex_init(&kvm->arch.pvclock_sc, &kvm->lock);
  	kvm->arch.kvmclock_offset = -get_kvmclock_base_ns();
  	pvclock_update_vm_gtod_copy(kvm);
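
(For reference, the seqcount_mutex read/write pattern the patch relies on,
reduced to a self-contained sketch with made-up names:)

static DEFINE_MUTEX(demo_lock);
static seqcount_mutex_t demo_sc;
static u64 snap_ns, snap_cycles;	/* the protected snapshot */

static void demo_init(void)
{
	seqcount_mutex_init(&demo_sc, &demo_lock);
}

/* Writer: serialized by demo_lock; readers never block, they retry. */
static void demo_update(u64 ns, u64 cycles)
{
	mutex_lock(&demo_lock);
	write_seqcount_begin(&demo_sc);
	snap_ns = ns;
	snap_cycles = cycles;
	write_seqcount_end(&demo_sc);
	mutex_unlock(&demo_lock);
}

/* Reader: lockless; loops until it observes an unchanged sequence. */
static void demo_read(u64 *ns, u64 *cycles)
{
	unsigned int seq;

	do {
		seq = read_seqcount_begin(&demo_sc);
		*ns = snap_ns;
		*cycles = snap_cycles;
	} while (read_seqcount_retry(&demo_sc, seq));
}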
  



* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
                   ` (21 preceding siblings ...)
  2021-08-04 11:05 ` [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
@ 2021-08-11 13:05 ` Paolo Bonzini
  2021-08-11 18:56   ` Oliver Upton
  22 siblings, 1 reply; 51+ messages in thread
From: Paolo Bonzini @ 2021-08-11 13:05 UTC (permalink / raw)
  To: Oliver Upton, kvm, kvmarm
  Cc: Sean Christopherson, Marc Zyngier, Peter Shier, Jim Mattson,
	David Matlack, Ricardo Koller, Jing Zhang, Raghavendra Rao Anata,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	linux-arm-kernel, Andrew Jones, Will Deacon, Catalin Marinas

On 04/08/21 10:57, Oliver Upton wrote:

[...]

The x86 parts look good, except that patch 3 is a bit redundant with my
idea of altogether getting rid of the pvclock_gtod_sync_lock.  That said,
I agree that patches 1 and 2 (and extracting kvm_vm_ioctl_get_clock and
kvm_vm_ioctl_set_clock) should be done before whatever locking changes
have to be done.

Time is ticking for 5.15 due to my vacation; I'll see if I have some
time to look at it further next week.

I agree that arm64 can be done separately from x86.

Paolo



* Re: [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset
  2021-08-10  9:44     ` Oliver Upton
@ 2021-08-11 15:22       ` Marc Zyngier
  0 siblings, 0 replies; 51+ messages in thread
From: Marc Zyngier @ 2021-08-11 15:22 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Paolo Bonzini, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Tue, 10 Aug 2021 10:44:01 +0100,
Oliver Upton <oupton@google.com> wrote:
> 
> On Tue, Aug 10, 2021 at 2:35 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Wed, 04 Aug 2021 09:58:11 +0100,
> > Oliver Upton <oupton@google.com> wrote:
> > >
> > > Allow userspace to access the guest's virtual counter-timer offset
> > > through the ONE_REG interface. The value read or written is defined to
> > > be an offset from the guest's physical counter-timer. Add some
> > > documentation to clarify how a VMM should use this and the existing
> > > CNTVCT_EL0.
> > >
> > > Signed-off-by: Oliver Upton <oupton@google.com>
> > > ---
> > >  Documentation/virt/kvm/api.rst    | 10 ++++++++++
> > >  arch/arm64/include/uapi/asm/kvm.h |  1 +
> > >  arch/arm64/kvm/arch_timer.c       | 11 +++++++++++
> > >  arch/arm64/kvm/guest.c            |  6 +++++-
> > >  include/kvm/arm_arch_timer.h      |  1 +
> > >  5 files changed, 28 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > index 8d4a3471ad9e..28a65dc89985 100644
> > > --- a/Documentation/virt/kvm/api.rst
> > > +++ b/Documentation/virt/kvm/api.rst
> > > @@ -2487,6 +2487,16 @@ arm64 system registers have the following id bit patterns::
> > >       derived from the register encoding for CNTV_CVAL_EL0.  As this is
> > >       API, it must remain this way.
> > >
> > > +.. warning::
> > > +
> > > +     The value of KVM_REG_ARM_TIMER_OFFSET is defined as an offset from
> > > +     the guest's view of the physical counter-timer.
> > > +
> > > +     Userspace should use either KVM_REG_ARM_TIMER_OFFSET or
> > > +     KVM_REG_ARM_TIMER_CVAL to pause and resume a guest's virtual
> >
> > You probably mean KVM_REG_ARM_TIMER_CNT here, despite the broken
> > encoding.
> 
> Indeed I do!
> 
> >
> > > +     counter-timer. Mixed use of these registers could result in an
> > > +     unpredictable guest counter value.
> > > +
> > >  arm64 firmware pseudo-registers have the following bit pattern::
> > >
> > >    0x6030 0000 0014 <regno:16>
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > > index b3edde68bc3e..949a31bc10f0 100644
> > > --- a/arch/arm64/include/uapi/asm/kvm.h
> > > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > > @@ -255,6 +255,7 @@ struct kvm_arm_copy_mte_tags {
> > >  #define KVM_REG_ARM_TIMER_CTL                ARM64_SYS_REG(3, 3, 14, 3, 1)
> > >  #define KVM_REG_ARM_TIMER_CVAL               ARM64_SYS_REG(3, 3, 14, 0, 2)
> > >  #define KVM_REG_ARM_TIMER_CNT                ARM64_SYS_REG(3, 3, 14, 3, 2)
> > > +#define KVM_REG_ARM_TIMER_OFFSET     ARM64_SYS_REG(3, 4, 14, 0, 3)
> >
> > I don't think we can use the encoding for CNTPOFF_EL2 here, as it will
> > eventually clash with a NV guest using the same feature for its own
> > purpose. We don't want this offset to overlap with any of the existing
> > features.
> >
> > I actually liked your previous proposal of controlling the physical
> > offset via a device property, as it clearly indicated that you were
> > dealing with non-architectural state.
> 
> That's actually exactly what I did here :) That said, the macro name
> is horribly obfuscated from CNTVOFF_EL2. I did this for the sake of
> symmetry with other virtual counter-timer registers above, though this
> may warrant special casing given the fact that we have a similarly
> named device attribute to handle the physical offset.

Gah, you are of course right. Ignore my rambling. The name is fine (or
at least in keeping with the existing quality level of the naming).

For the physical offset, something along the lines of
KVM_ARM_VCPU_TIMER_PHYS_OFFSET is probably right (but feel free to be
creative, I'm terrible at this stuff [1]).
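
As a sketch of the VMM side (userspace C, error handling elided;
KVM_REG_ARM_TIMER_OFFSET as defined in the patch, and per the
documentation hunk a VMM should use either this register or
KVM_REG_ARM_TIMER_CNT, never a mix of both):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int get_virtual_offset(int vcpu_fd, uint64_t *offset)
{
	struct kvm_one_reg reg = {
		.id   = KVM_REG_ARM_TIMER_OFFSET,
		.addr = (uint64_t)(unsigned long)offset,
	};

	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}

static int set_virtual_offset(int vcpu_fd, uint64_t offset)
{
	struct kvm_one_reg reg = {
		.id   = KVM_REG_ARM_TIMER_OFFSET,
		.addr = (uint64_t)(unsigned long)&offset,
	};

	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}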

Thanks,

	M.

[1] https://twitter.com/codinghorror/status/506010907021828096?lang=en

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-11 13:05 ` Paolo Bonzini
@ 2021-08-11 18:56   ` Oliver Upton
  2021-08-11 19:01     ` Marc Zyngier
  0 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-11 18:56 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, kvmarm, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, Aug 11, 2021 at 6:05 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/08/21 10:57, Oliver Upton wrote:

[...]

> The x86 parts look good, except that patch 3 is a bit redundant with my
> idea of altogether getting rid of the pvclock_gtod_sync_lock.  That said
> I agree that patches 1 and 2 (and extracting kvm_vm_ioctl_get_clock and
> kvm_vm_ioctl_set_clock) should be done before whatever locking changes
> have to be done.

Following up on patch 3.

> Time is ticking for 5.15 due to my vacation; I'll see if I have some
> time to look at it further next week.
>
> I agree that arm64 can be done separately from x86.

Marc, just a disclaimer:

I'm going to separate these two series, although there will still be
dependencies in the selftests changes. Otherwise, the kernel changes
are disjoint.

--
Thanks,
Oliver


* Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
  2021-08-11 18:56   ` Oliver Upton
@ 2021-08-11 19:01     ` Marc Zyngier
  0 siblings, 0 replies; 51+ messages in thread
From: Marc Zyngier @ 2021-08-11 19:01 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Paolo Bonzini, kvm, kvmarm, Sean Christopherson, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Wed, 11 Aug 2021 19:56:22 +0100,
Oliver Upton <oupton@google.com> wrote:

[...]

> > Time is ticking for 5.15 due to my vacation, I'll see if I have some
> > time to look at it further next week.
> >
> > I agree that arm64 can be done separately from x86.
> 
> Marc, just a disclaimer:
> 
> I'm going to separate these two series, although there will still
> exist dependencies in the selftests changes. Otherwise, kernel changes
> are disjoint.

No problem. The selftests can even sit in a third series if that makes
it easier.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK
  2021-08-11 12:23   ` Paolo Bonzini
@ 2021-08-13 10:39     ` Oliver Upton
  2021-08-13 10:44       ` Paolo Bonzini
  0 siblings, 1 reply; 51+ messages in thread
From: Oliver Upton @ 2021-08-13 10:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, kvmarm, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

Hi Paolo,

On Wed, Aug 11, 2021 at 5:23 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/08/21 10:57, Oliver Upton wrote:

[...]

> I had a completely independent patch that fixed the same race.  It unifies
> the read sides of tsc_write_lock and pvclock_gtod_sync_lock into a seqcount
> (and replaces pvclock_gtod_sync_lock with kvm->lock on the write side).

Might it make sense to fix this issue under the existing locking
scheme, then shift to what you're proposing? I say that, but the
locking change in 03/21 would most certainly have a short lifetime
until this patch supersedes it.

> I attach it now (based on https://lore.kernel.org/kvm/20210811102356.3406687-1-pbonzini@redhat.com/T/#t),
> but the testing was extremely light so I'm not sure I will be able to include
> it in 5.15.
>
> Paolo

[...]

This all looks good to me, so:

Reviewed-by: Oliver Upton <oupton@google.com>

Definitely supplants 03/21 from my series. If you'd rather take your
own for this entire series then I can rework around this patch and
resend.

--
Thanks,
Oliver


* Re: [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK
  2021-08-13 10:39     ` Oliver Upton
@ 2021-08-13 10:44       ` Paolo Bonzini
  2021-08-13 17:46         ` Oliver Upton
  0 siblings, 1 reply; 51+ messages in thread
From: Paolo Bonzini @ 2021-08-13 10:44 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, kvmarm, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On 13/08/21 12:39, Oliver Upton wrote:
> Might it make sense to fix this issue under the existing locking
> scheme, then shift to what you're proposing? I say that, but the
> locking change in 03/21 would most certainly have a short lifetime
> until this patch supersedes it.

Yes, definitely.  The seqcount change would go in much later.
Extracting KVM_{GET,SET}_CLOCK to a separate function would also be
a patch of its own.  Give me a few more days of frantic KVM Forum
preparation. :)

Paolo



* Re: [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK
  2021-08-13 10:44       ` Paolo Bonzini
@ 2021-08-13 17:46         ` Oliver Upton
  0 siblings, 0 replies; 51+ messages in thread
From: Oliver Upton @ 2021-08-13 17:46 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, kvmarm, Sean Christopherson, Marc Zyngier, Peter Shier,
	Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
	Raghavendra Rao Anata, James Morse, Alexandru Elisei,
	Suzuki K Poulose, linux-arm-kernel, Andrew Jones, Will Deacon,
	Catalin Marinas

On Fri, Aug 13, 2021 at 3:44 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 13/08/21 12:39, Oliver Upton wrote:
> > Might it make sense to fix this issue under the existing locking
> > scheme, then shift to what you're proposing? I say that, but the
> > locking change in 03/21 would most certainly have a short lifetime
> > until this patch supersedes it.
>
> Yes, definitely.  The seqcount change would go in much later.
> Extracting KVM_{GET,SET}_CLOCK to a separate function would also be
> a patch of its own.  Give me a few more days of frantic KVM Forum
> preparation. :)

Sounds good :-) I'm probably going to send this out once more, in
three separate series:

- x86 (no changes, just rebasing)
- arm64 (address some comments, bugs)
- selftests (no changes)

--
Thanks,
Oliver

> Paolo
>


Thread overview: 51+ messages
2021-08-04  8:57 [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
2021-08-04  8:57 ` [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK Oliver Upton
2021-08-11 12:23   ` Paolo Bonzini
2021-08-13 10:39     ` Oliver Upton
2021-08-13 10:44       ` Paolo Bonzini
2021-08-13 17:46         ` Oliver Upton
2021-08-04  8:58 ` [PATCH v6 02/21] KVM: x86: Report host tsc and realtime values " Oliver Upton
2021-08-04  8:58 ` [PATCH v6 03/21] KVM: x86: Take the pvclock sync lock behind the tsc_write_lock Oliver Upton
2021-08-04  8:58 ` [PATCH v6 04/21] KVM: x86: Refactor tsc synchronization code Oliver Upton
2021-08-04  8:58 ` [PATCH v6 05/21] KVM: x86: Expose TSC offset controls to userspace Oliver Upton
2021-08-04  8:58 ` [PATCH v6 06/21] tools: arch: x86: pull in pvclock headers Oliver Upton
2021-08-04  8:58 ` [PATCH v6 07/21] selftests: KVM: Add test for KVM_{GET,SET}_CLOCK Oliver Upton
2021-08-04  8:58 ` [PATCH v6 08/21] selftests: KVM: Fix kvm device helper ioctl assertions Oliver Upton
2021-08-04  8:58 ` [PATCH v6 09/21] selftests: KVM: Add helpers for vCPU device attributes Oliver Upton
2021-08-04  8:58 ` [PATCH v6 10/21] selftests: KVM: Introduce system counter offset test Oliver Upton
2021-08-04  8:58 ` [PATCH v6 11/21] KVM: arm64: Refactor update_vtimer_cntvoff() Oliver Upton
2021-08-04  9:23   ` Andrew Jones
2021-08-04  8:58 ` [PATCH v6 12/21] KVM: arm64: Separate guest/host counter offset values Oliver Upton
2021-08-04 10:19   ` Andrew Jones
2021-08-04  8:58 ` [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset Oliver Upton
2021-08-04 10:20   ` Andrew Jones
2021-08-10  9:35   ` Marc Zyngier
2021-08-10  9:44     ` Oliver Upton
2021-08-11 15:22       ` Marc Zyngier
2021-08-04  8:58 ` [PATCH v6 14/21] selftests: KVM: Add helper to check for register presence Oliver Upton
2021-08-04  9:14   ` Andrew Jones
2021-08-04  8:58 ` [PATCH v6 15/21] selftests: KVM: Add support for aarch64 to system_counter_offset_test Oliver Upton
2021-08-04  8:58 ` [PATCH v6 16/21] arm64: cpufeature: Enumerate support for Enhanced Counter Virtualization Oliver Upton
2021-08-10  9:38   ` Marc Zyngier
2021-08-04  8:58 ` [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset Oliver Upton
2021-08-04 10:17   ` Andrew Jones
2021-08-04 10:22     ` Oliver Upton
2021-08-10 10:56   ` Marc Zyngier
2021-08-10 17:55     ` Oliver Upton
2021-08-11  9:01       ` Marc Zyngier
2021-08-04  8:58 ` [PATCH v6 18/21] KVM: arm64: Configure timer traps in vcpu_load() for VHE Oliver Upton
2021-08-04 10:25   ` Andrew Jones
2021-08-04  8:58 ` [PATCH v6 19/21] KVM: arm64: Emulate physical counter offsetting on non-ECV systems Oliver Upton
2021-08-04 11:05   ` Andrew Jones
2021-08-05  6:27     ` Oliver Upton
2021-08-10 11:27   ` Marc Zyngier
2021-08-04  8:58 ` [PATCH v6 20/21] selftests: KVM: Test physical counter offsetting Oliver Upton
2021-08-04 11:03   ` Andrew Jones
2021-08-04  8:58 ` [PATCH v6 21/21] selftests: KVM: Add counter emulation benchmark Oliver Upton
2021-08-04 11:05 ` [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state Oliver Upton
2021-08-04 22:03   ` Oliver Upton
2021-08-10  0:04     ` Oliver Upton
2021-08-10 12:30       ` Marc Zyngier
2021-08-11 13:05 ` Paolo Bonzini
2021-08-11 18:56   ` Oliver Upton
2021-08-11 19:01     ` Marc Zyngier
