linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full
@ 2021-12-13 14:19 Sasha Levin
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 2/9] timers/nohz: Last resort update jiffies on nohz_full IRQ entry Sasha Levin
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Frederic Weisbecker, Hasegawa Hitomi, Thomas Gleixner,
	Masayoshi Mizuma, Phil Auld, Sasha Levin, mingo, peterz,
	juri.lelli, vincent.guittot

From: Frederic Weisbecker <frederic@kernel.org>

[ Upstream commit e7f2be115f0746b969c0df14c0d182f65f005ca5 ]

getrusage(RUSAGE_THREAD) with nohz_full may return shorter utime/stime
than the actual time.

task_cputime_adjusted() snapshots utime and stime and then adjust their
sum to match the scheduler maintained cputime.sum_exec_runtime.
Unfortunately in nohz_full, sum_exec_runtime is only updated once per
second in the worst case, causing a discrepancy against utime and stime
that can be updated anytime by the reader using vtime.

To fix this situation, perform an update of cputime.sum_exec_runtime
when the cputime snapshot reports the task as actually running while
the tick is disabled. The related overhead is then contained within the
relevant situations.

Reported-by: Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Acked-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20211026141055.57358-3-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/sched/cputime.h |  5 +++--
 kernel/sched/cputime.c        | 12 +++++++++---
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h
index 6c9f19a33865a..ce3c58286062c 100644
--- a/include/linux/sched/cputime.h
+++ b/include/linux/sched/cputime.h
@@ -18,15 +18,16 @@
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-extern void task_cputime(struct task_struct *t,
+extern bool task_cputime(struct task_struct *t,
 			 u64 *utime, u64 *stime);
 extern u64 task_gtime(struct task_struct *t);
 #else
-static inline void task_cputime(struct task_struct *t,
+static inline bool task_cputime(struct task_struct *t,
 				u64 *utime, u64 *stime)
 {
 	*utime = t->utime;
 	*stime = t->stime;
+	return false;
 }
 
 static inline u64 task_gtime(struct task_struct *t)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 872e481d5098c..9392aea1804e5 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -615,7 +615,8 @@ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
 		.sum_exec_runtime = p->se.sum_exec_runtime,
 	};
 
-	task_cputime(p, &cputime.utime, &cputime.stime);
+	if (task_cputime(p, &cputime.utime, &cputime.stime))
+		cputime.sum_exec_runtime = task_sched_runtime(p);
 	cputime_adjust(&cputime, &p->prev_cputime, ut, st);
 }
 EXPORT_SYMBOL_GPL(task_cputime_adjusted);
@@ -828,19 +829,21 @@ u64 task_gtime(struct task_struct *t)
  * add up the pending nohz execution time since the last
  * cputime snapshot.
  */
-void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
+bool task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
 {
 	struct vtime *vtime = &t->vtime;
 	unsigned int seq;
 	u64 delta;
+	int ret;
 
 	if (!vtime_accounting_enabled()) {
 		*utime = t->utime;
 		*stime = t->stime;
-		return;
+		return false;
 	}
 
 	do {
+		ret = false;
 		seq = read_seqcount_begin(&vtime->seqcount);
 
 		*utime = t->utime;
@@ -850,6 +853,7 @@ void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
 		if (vtime->state < VTIME_SYS)
 			continue;
 
+		ret = true;
 		delta = vtime_delta(vtime);
 
 		/*
@@ -861,6 +865,8 @@ void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
 		else
 			*utime += vtime->utime + delta;
 	} while (read_seqcount_retry(&vtime->seqcount, seq));
+
+	return ret;
 }
 
 static int vtime_state_fetch(struct vtime *vtime, int cpu)
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 2/9] timers/nohz: Last resort update jiffies on nohz_full IRQ entry
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 3/9] KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled Sasha Levin
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Frederic Weisbecker, Paul E . McKenney, Thomas Gleixner,
	Sasha Levin, fweisbec, mingo, peterz, dave, will, dirk.behme,
	tannerlove

From: Frederic Weisbecker <frederic@kernel.org>

[ Upstream commit 53e87e3cdc155f20c3417b689df8d2ac88d79576 ]

When at least one CPU runs in nohz_full mode, a dedicated timekeeper CPU
is guaranteed to stay online and to never stop its tick.

Meanwhile on some rare case, the dedicated timekeeper may be running
with interrupts disabled for a while, such as in stop_machine.

If jiffies stop being updated, a nohz_full CPU may end up endlessly
programming the next tick in the past, taking the last jiffies update
monotonic timestamp as a stale base, resulting in an tick storm.

Here is a scenario where it matters:

0) CPU 0 is the timekeeper and CPU 1 a nohz_full CPU.

1) A stop machine callback is queued to execute somewhere.

2) CPU 0 reaches MULTI_STOP_DISABLE_IRQ while CPU 1 is still in
   MULTI_STOP_PREPARE. Hence CPU 0 can't do its timekeeping duty. CPU 1
   can still take IRQs.

3) CPU 1 receives an IRQ which queues a timer callback one jiffy forward.

4) On IRQ exit, CPU 1 schedules the tick one jiffy forward, taking
   last_jiffies_update as a base. But last_jiffies_update hasn't been
   updated for 2 jiffies since the timekeeper has interrupts disabled.

5) clockevents_program_event(), which relies on ktime_get(), observes
   that the expiration is in the past and therefore programs the min
   delta event on the clock.

6) The tick fires immediately, goto 3)

7) Tick storm, the nohz_full CPU is drown and takes ages to reach
   MULTI_STOP_DISABLE_IRQ, which is the only way out of this situation.

Solve this with unconditionally updating jiffies if the value is stale
on nohz_full IRQ entry. IRQs and other disturbances are expected to be
rare enough on nohz_full for the unconditional call to ktime_get() to
actually matter.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20211026141055.57358-2-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/softirq.c         | 3 ++-
 kernel/time/tick-sched.c | 7 +++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 322b65d456767..41f470929e991 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -595,7 +595,8 @@ void irq_enter_rcu(void)
 {
 	__irq_enter_raw();
 
-	if (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET))
+	if (tick_nohz_full_cpu(smp_processor_id()) ||
+	    (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET)))
 		tick_irq_enter();
 
 	account_hardirq_enter(current);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6bffe5af8cb11..17a283ce2b20f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1375,6 +1375,13 @@ static inline void tick_nohz_irq_enter(void)
 	now = ktime_get();
 	if (ts->idle_active)
 		tick_nohz_stop_idle(ts, now);
+	/*
+	 * If all CPUs are idle. We may need to update a stale jiffies value.
+	 * Note nohz_full is a special case: a timekeeper is guaranteed to stay
+	 * alive but it might be busy looping with interrupts disabled in some
+	 * rare case (typically stop machine). So we must make sure we have a
+	 * last resort.
+	 */
 	if (ts->tick_stopped)
 		tick_nohz_update_jiffies(now);
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 3/9] KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 2/9] timers/nohz: Last resort update jiffies on nohz_full IRQ entry Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:27   ` Paolo Bonzini
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 4/9] KVM: SEV: do not take kvm->lock when destroying Sasha Levin
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Paolo Bonzini, Sasha Levin, tglx, mingo, bp, dave.hansen, x86, kvm

From: Paolo Bonzini <pbonzini@redhat.com>

[ Upstream commit e90e51d5f01d2baae5dcce280866bbb96816e978 ]

There is nothing to synchronize if APICv is disabled, since neither
other vCPUs nor assigned devices can set PIR.ON.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dacdf2395f01a..4e212f04268bb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7776,10 +7776,10 @@ static __init int hardware_setup(void)
 		ple_window_shrink = 0;
 	}
 
-	if (!cpu_has_vmx_apicv()) {
+	if (!cpu_has_vmx_apicv())
 		enable_apicv = 0;
+	if (!enable_apicv)
 		vmx_x86_ops.sync_pir_to_irr = NULL;
-	}
 
 	if (cpu_has_vmx_tsc_scaling()) {
 		kvm_has_tsc_control = true;
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 4/9] KVM: SEV: do not take kvm->lock when destroying
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 2/9] timers/nohz: Last resort update jiffies on nohz_full IRQ entry Sasha Levin
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 3/9] KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:27   ` Paolo Bonzini
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 5/9] KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE Sasha Levin
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Paolo Bonzini, Sean Christopherson, Sasha Levin, tglx, mingo, bp,
	dave.hansen, x86, kvm

From: Paolo Bonzini <pbonzini@redhat.com>

[ Upstream commit 10a37929efeb4c51a0069afdd537c4fa3831f6e5 ]

Taking the lock is useless since there are no other references,
and there are already accesses (e.g. to sev->enc_context_owner)
that do not take it.  So get rid of it.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211123005036.2954379-12-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kvm/svm/sev.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 134c4ea5e6ad8..ae8092f0d401e 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1842,8 +1842,6 @@ void sev_vm_destroy(struct kvm *kvm)
 		return;
 	}
 
-	mutex_lock(&kvm->lock);
-
 	/*
 	 * Ensure that all guest tagged cache entries are flushed before
 	 * releasing the pages back to the system for use. CLFLUSH will
@@ -1863,8 +1861,6 @@ void sev_vm_destroy(struct kvm *kvm)
 		}
 	}
 
-	mutex_unlock(&kvm->lock);
-
 	sev_unbind_asid(kvm, sev->handle);
 	sev_asid_free(sev);
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 5/9] KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
                   ` (2 preceding siblings ...)
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 4/9] KVM: SEV: do not take kvm->lock when destroying Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:27   ` Paolo Bonzini
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 6/9] KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN Sasha Levin
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Vitaly Kuznetsov, Sean Christopherson, Paolo Bonzini,
	Sasha Levin, shuah, kvm, linux-kselftest

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 908fa88e420f30dde6d80f092795a18ec72ca6d3 ]

With the elevated 'KVM_CAP_MAX_VCPUS' value kvm_create_max_vcpus test
may hit RLIMIT_NOFILE limits:

 # ./kvm_create_max_vcpus
 KVM_CAP_MAX_VCPU_ID: 4096
 KVM_CAP_MAX_VCPUS: 1024
 Testing creating 1024 vCPUs, with IDs 0...1023.
 /dev/kvm not available (errno: 24), skipping test

Adjust RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be
opened. Note, raising hard limit ('rlim_max') requires CAP_SYS_RESOURCE
capability which is generally not needed to run kvm selftests (but without
raising the limit the test is doomed to fail anyway).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211123135953.667434-1-vkuznets@redhat.com>
[Skip the test if the hard limit can be raised. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../selftests/kvm/kvm_create_max_vcpus.c      | 30 +++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
index 0299cd81b8ba2..aa3795cd7bd3d 100644
--- a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
+++ b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
@@ -12,6 +12,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <sys/resource.h>
 
 #include "test_util.h"
 
@@ -40,10 +41,39 @@ int main(int argc, char *argv[])
 {
 	int kvm_max_vcpu_id = kvm_check_cap(KVM_CAP_MAX_VCPU_ID);
 	int kvm_max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
+	/*
+	 * Number of file descriptors reqired, KVM_CAP_MAX_VCPUS for vCPU fds +
+	 * an arbitrary number for everything else.
+	 */
+	int nr_fds_wanted = kvm_max_vcpus + 100;
+	struct rlimit rl;
 
 	pr_info("KVM_CAP_MAX_VCPU_ID: %d\n", kvm_max_vcpu_id);
 	pr_info("KVM_CAP_MAX_VCPUS: %d\n", kvm_max_vcpus);
 
+	/*
+	 * Check that we're allowed to open nr_fds_wanted file descriptors and
+	 * try raising the limits if needed.
+	 */
+	TEST_ASSERT(!getrlimit(RLIMIT_NOFILE, &rl), "getrlimit() failed!");
+
+	if (rl.rlim_cur < nr_fds_wanted) {
+		rl.rlim_cur = nr_fds_wanted;
+		if (rl.rlim_max < nr_fds_wanted) {
+			int old_rlim_max = rl.rlim_max;
+			rl.rlim_max = nr_fds_wanted;
+
+			int r = setrlimit(RLIMIT_NOFILE, &rl);
+			if (r < 0) {
+				printf("RLIMIT_NOFILE hard limit is too low (%d, wanted %d)\n",
+				       old_rlim_max, nr_fds_wanted);
+				exit(KSFT_SKIP);
+			}
+		} else {
+			TEST_ASSERT(!setrlimit(RLIMIT_NOFILE, &rl), "setrlimit() failed!");
+		}
+	}
+
 	/*
 	 * Upstream KVM prior to 4.8 does not support KVM_CAP_MAX_VCPU_ID.
 	 * Userspace is supposed to use KVM_CAP_MAX_VCPUS as the maximum ID
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 6/9] KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
                   ` (3 preceding siblings ...)
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 5/9] KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:28   ` Paolo Bonzini
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 7/9] KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test Sasha Levin
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Sasha Levin, tglx, mingo, bp,
	dave.hansen, x86, kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit feb627e8d6f69c9a319fe279710959efb3eba873 ]

Commit 63f5a1909f9e ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2}
after KVM_RUN is broken") officially deprecated KVM_SET_CPUID{,2} ioctls
after first successful KVM_RUN and promissed to make this sequence forbiden
in 5.16. It's time to fulfil the promise.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kvm/mmu/mmu.c | 28 +++++++++++-----------------
 arch/x86/kvm/x86.c     | 19 +++++++++++++++++++
 2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0a88cb4f731f4..31392e492ce5e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5022,6 +5022,14 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	/*
 	 * Invalidate all MMU roles to force them to reinitialize as CPUID
 	 * information is factored into reserved bit calculations.
+	 *
+	 * Correctly handling multiple vCPU models with respect to paging and
+	 * physical address properties) in a single VM would require tracking
+	 * all relevant CPUID information in kvm_mmu_page_role. That is very
+	 * undesirable as it would increase the memory requirements for
+	 * gfn_track (see struct kvm_mmu_page_role comments).  For now that
+	 * problem is swept under the rug; KVM's CPUID API is horrific and
+	 * it's all but impossible to solve it without introducing a new API.
 	 */
 	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
 	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
@@ -5029,24 +5037,10 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	kvm_mmu_reset_context(vcpu);
 
 	/*
-	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
-	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
-	 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
-	 * faults due to reusing SPs/SPTEs.  Alert userspace, but otherwise
-	 * sweep the problem under the rug.
-	 *
-	 * KVM's horrific CPUID ABI makes the problem all but impossible to
-	 * solve, as correctly handling multiple vCPU models (with respect to
-	 * paging and physical address properties) in a single VM would require
-	 * tracking all relevant CPUID information in kvm_mmu_page_role.  That
-	 * is very undesirable as it would double the memory requirements for
-	 * gfn_track (see struct kvm_mmu_page_role comments), and in practice
-	 * no sane VMM mucks with the core vCPU model on the fly.
+	 * Changing guest CPUID after KVM_RUN is forbidden, see the comment in
+	 * kvm_arch_vcpu_ioctl().
 	 */
-	if (vcpu->arch.last_vmentry_cpu != -1) {
-		pr_warn_ratelimited("KVM: KVM_SET_CPUID{,2} after KVM_RUN may cause guest instability\n");
-		pr_warn_ratelimited("KVM: KVM_SET_CPUID{,2} will fail after KVM_RUN starting with Linux 5.16\n");
-	}
+	KVM_BUG_ON(vcpu->arch.last_vmentry_cpu != -1, vcpu->kvm);
 }
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b7aa845f7beee..2a584070833f9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5088,6 +5088,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		struct kvm_cpuid __user *cpuid_arg = argp;
 		struct kvm_cpuid cpuid;
 
+		/*
+		 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
+		 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
+		 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
+		 * faults due to reusing SPs/SPTEs.  In practice no sane VMM mucks with
+		 * the core vCPU model on the fly, so fail.
+		 */
+		r = -EINVAL;
+		if (vcpu->arch.last_vmentry_cpu != -1)
+			goto out;
+
 		r = -EFAULT;
 		if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
 			goto out;
@@ -5098,6 +5109,14 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		struct kvm_cpuid2 __user *cpuid_arg = argp;
 		struct kvm_cpuid2 cpuid;
 
+		/*
+		 * KVM_SET_CPUID{,2} after KVM_RUN is forbidded, see the comment in
+		 * KVM_SET_CPUID case above.
+		 */
+		r = -EINVAL;
+		if (vcpu->arch.last_vmentry_cpu != -1)
+			goto out;
+
 		r = -EFAULT;
 		if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
 			goto out;
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 7/9] KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
                   ` (4 preceding siblings ...)
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 6/9] KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:28   ` Paolo Bonzini
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 8/9] KVM: downgrade two BUG_ONs to WARN_ON_ONCE Sasha Levin
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 9/9] x86/kvm: remove unused ack_notifier callbacks Sasha Levin
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Vitaly Kuznetsov, Paolo Bonzini, Sasha Levin, shuah, seanjc,
	ricarkol, maz, kvm, linux-kselftest

From: Vitaly Kuznetsov <vkuznets@redhat.com>

[ Upstream commit 6c1186430a808f97e2052bd5d9eff12c5d5defb0 ]

hyperv_features's sole purpose is to test access to various Hyper-V MSRs
and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN
is deprecated and soon-to-be forbidden, avoid it by re-creating test VM
for each sub-test.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../selftests/kvm/x86_64/hyperv_features.c    | 140 +++++++++---------
 1 file changed, 71 insertions(+), 69 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
index 91d88aaa98992..672915ce73d8f 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
@@ -165,10 +165,10 @@ static void hv_set_cpuid(struct kvm_vm *vm, struct kvm_cpuid2 *cpuid,
 	vcpu_set_cpuid(vm, VCPU_ID, cpuid);
 }
 
-static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
-				   struct kvm_cpuid2 *best)
+static void guest_test_msrs_access(void)
 {
 	struct kvm_run *run;
+	struct kvm_vm *vm;
 	struct ucall uc;
 	int stage = 0, r;
 	struct kvm_cpuid_entry2 feat = {
@@ -180,11 +180,34 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
 	struct kvm_cpuid_entry2 dbg = {
 		.function = HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES
 	};
-	struct kvm_enable_cap cap = {0};
-
-	run = vcpu_state(vm, VCPU_ID);
+	struct kvm_cpuid2 *best;
+	vm_vaddr_t msr_gva;
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
+		.args = {1}
+	};
+	struct msr_data *msr;
 
 	while (true) {
+		vm = vm_create_default(VCPU_ID, 0, guest_msr);
+
+		msr_gva = vm_vaddr_alloc_page(vm);
+		memset(addr_gva2hva(vm, msr_gva), 0x0, getpagesize());
+		msr = addr_gva2hva(vm, msr_gva);
+
+		vcpu_args_set(vm, VCPU_ID, 1, msr_gva);
+		vcpu_enable_cap(vm, VCPU_ID, &cap);
+
+		vcpu_set_hv_cpuid(vm, VCPU_ID);
+
+		best = kvm_get_supported_hv_cpuid();
+
+		vm_init_descriptor_tables(vm);
+		vcpu_init_descriptor_tables(vm, VCPU_ID);
+		vm_install_exception_handler(vm, GP_VECTOR, guest_gp_handler);
+
+		run = vcpu_state(vm, VCPU_ID);
+
 		switch (stage) {
 		case 0:
 			/*
@@ -315,6 +338,7 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
 			 * capability enabled and guest visible CPUID bit unset.
 			 */
 			cap.cap = KVM_CAP_HYPERV_SYNIC2;
+			cap.args[0] = 0;
 			vcpu_enable_cap(vm, VCPU_ID, &cap);
 			break;
 		case 22:
@@ -461,9 +485,9 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
 
 		switch (get_ucall(vm, VCPU_ID, &uc)) {
 		case UCALL_SYNC:
-			TEST_ASSERT(uc.args[1] == stage,
-				    "Unexpected stage: %ld (%d expected)\n",
-				    uc.args[1], stage);
+			TEST_ASSERT(uc.args[1] == 0,
+				    "Unexpected stage: %ld (0 expected)\n",
+				    uc.args[1]);
 			break;
 		case UCALL_ABORT:
 			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
@@ -474,13 +498,14 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
 		}
 
 		stage++;
+		kvm_vm_free(vm);
 	}
 }
 
-static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall,
-				     void *input, void *output, struct kvm_cpuid2 *best)
+static void guest_test_hcalls_access(void)
 {
 	struct kvm_run *run;
+	struct kvm_vm *vm;
 	struct ucall uc;
 	int stage = 0, r;
 	struct kvm_cpuid_entry2 feat = {
@@ -493,10 +518,38 @@ static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall
 	struct kvm_cpuid_entry2 dbg = {
 		.function = HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES
 	};
-
-	run = vcpu_state(vm, VCPU_ID);
+	struct kvm_enable_cap cap = {
+		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
+		.args = {1}
+	};
+	vm_vaddr_t hcall_page, hcall_params;
+	struct hcall_data *hcall;
+	struct kvm_cpuid2 *best;
 
 	while (true) {
+		vm = vm_create_default(VCPU_ID, 0, guest_hcall);
+
+		vm_init_descriptor_tables(vm);
+		vcpu_init_descriptor_tables(vm, VCPU_ID);
+		vm_install_exception_handler(vm, UD_VECTOR, guest_ud_handler);
+
+		/* Hypercall input/output */
+		hcall_page = vm_vaddr_alloc_pages(vm, 2);
+		hcall = addr_gva2hva(vm, hcall_page);
+		memset(addr_gva2hva(vm, hcall_page), 0x0, 2 * getpagesize());
+
+		hcall_params = vm_vaddr_alloc_page(vm);
+		memset(addr_gva2hva(vm, hcall_params), 0x0, getpagesize());
+
+		vcpu_args_set(vm, VCPU_ID, 2, addr_gva2gpa(vm, hcall_page), hcall_params);
+		vcpu_enable_cap(vm, VCPU_ID, &cap);
+
+		vcpu_set_hv_cpuid(vm, VCPU_ID);
+
+		best = kvm_get_supported_hv_cpuid();
+
+		run = vcpu_state(vm, VCPU_ID);
+
 		switch (stage) {
 		case 0:
 			hcall->control = 0xdeadbeef;
@@ -606,9 +659,9 @@ static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall
 
 		switch (get_ucall(vm, VCPU_ID, &uc)) {
 		case UCALL_SYNC:
-			TEST_ASSERT(uc.args[1] == stage,
-				    "Unexpected stage: %ld (%d expected)\n",
-				    uc.args[1], stage);
+			TEST_ASSERT(uc.args[1] == 0,
+				    "Unexpected stage: %ld (0 expected)\n",
+				    uc.args[1]);
 			break;
 		case UCALL_ABORT:
 			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
@@ -619,66 +672,15 @@ static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall
 		}
 
 		stage++;
+		kvm_vm_free(vm);
 	}
 }
 
 int main(void)
 {
-	struct kvm_cpuid2 *best;
-	struct kvm_vm *vm;
-	vm_vaddr_t msr_gva, hcall_page, hcall_params;
-	struct kvm_enable_cap cap = {
-		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
-		.args = {1}
-	};
-
-	/* Test MSRs */
-	vm = vm_create_default(VCPU_ID, 0, guest_msr);
-
-	msr_gva = vm_vaddr_alloc_page(vm);
-	memset(addr_gva2hva(vm, msr_gva), 0x0, getpagesize());
-	vcpu_args_set(vm, VCPU_ID, 1, msr_gva);
-	vcpu_enable_cap(vm, VCPU_ID, &cap);
-
-	vcpu_set_hv_cpuid(vm, VCPU_ID);
-
-	best = kvm_get_supported_hv_cpuid();
-
-	vm_init_descriptor_tables(vm);
-	vcpu_init_descriptor_tables(vm, VCPU_ID);
-	vm_install_exception_handler(vm, GP_VECTOR, guest_gp_handler);
-
 	pr_info("Testing access to Hyper-V specific MSRs\n");
-	guest_test_msrs_access(vm, addr_gva2hva(vm, msr_gva),
-			       best);
-	kvm_vm_free(vm);
-
-	/* Test hypercalls */
-	vm = vm_create_default(VCPU_ID, 0, guest_hcall);
-
-	vm_init_descriptor_tables(vm);
-	vcpu_init_descriptor_tables(vm, VCPU_ID);
-	vm_install_exception_handler(vm, UD_VECTOR, guest_ud_handler);
-
-	/* Hypercall input/output */
-	hcall_page = vm_vaddr_alloc_pages(vm, 2);
-	memset(addr_gva2hva(vm, hcall_page), 0x0, 2 * getpagesize());
-
-	hcall_params = vm_vaddr_alloc_page(vm);
-	memset(addr_gva2hva(vm, hcall_params), 0x0, getpagesize());
-
-	vcpu_args_set(vm, VCPU_ID, 2, addr_gva2gpa(vm, hcall_page), hcall_params);
-	vcpu_enable_cap(vm, VCPU_ID, &cap);
-
-	vcpu_set_hv_cpuid(vm, VCPU_ID);
-
-	best = kvm_get_supported_hv_cpuid();
+	guest_test_msrs_access();
 
 	pr_info("Testing access to Hyper-V hypercalls\n");
-	guest_test_hcalls_access(vm, addr_gva2hva(vm, hcall_params),
-				 addr_gva2hva(vm, hcall_page),
-				 addr_gva2hva(vm, hcall_page) + getpagesize(),
-				 best);
-
-	kvm_vm_free(vm);
+	guest_test_hcalls_access();
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 8/9] KVM: downgrade two BUG_ONs to WARN_ON_ONCE
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
                   ` (5 preceding siblings ...)
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 7/9] KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:28   ` Paolo Bonzini
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 9/9] x86/kvm: remove unused ack_notifier callbacks Sasha Levin
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Paolo Bonzini, Sasha Levin, kvm

From: Paolo Bonzini <pbonzini@redhat.com>

[ Upstream commit 5f25e71e311478f9bb0a8ef49e7d8b95316491d7 ]

This is not an unrecoverable situation.  Users of kvm_read_guest_offset_cached
and kvm_write_guest_offset_cached must expect the read/write to fail, and
therefore it is possible to just return early with an error value.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 virt/kvm/kvm_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ce1847bc898b2..c6bfd4e15d28a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3001,7 +3001,8 @@ int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 	int r;
 	gpa_t gpa = ghc->gpa + offset;
 
-	BUG_ON(len + offset > ghc->len);
+	if (WARN_ON_ONCE(len + offset > ghc->len))
+		return -EINVAL;
 
 	if (slots->generation != ghc->generation) {
 		if (__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len))
@@ -3038,7 +3039,8 @@ int kvm_read_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 	int r;
 	gpa_t gpa = ghc->gpa + offset;
 
-	BUG_ON(len + offset > ghc->len);
+	if (WARN_ON_ONCE(len + offset > ghc->len))
+		return -EINVAL;
 
 	if (slots->generation != ghc->generation) {
 		if (__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len))
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH MANUALSEL 5.15 9/9] x86/kvm: remove unused ack_notifier callbacks
  2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
                   ` (6 preceding siblings ...)
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 8/9] KVM: downgrade two BUG_ONs to WARN_ON_ONCE Sasha Levin
@ 2021-12-13 14:19 ` Sasha Levin
  2021-12-13 14:28   ` Paolo Bonzini
  7 siblings, 1 reply; 16+ messages in thread
From: Sasha Levin @ 2021-12-13 14:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Juergen Gross, Paolo Bonzini, Sasha Levin, tglx, mingo, bp,
	dave.hansen, x86, kvm

From: Juergen Gross <jgross@suse.com>

[ Upstream commit 9dba4d24cbb5524dd39ab1e08886373b17f07ff2 ]

Commit f52447261bc8c2 ("KVM: irq ack notification") introduced an
ack_notifier() callback in struct kvm_pic and in struct kvm_ioapic
without using them anywhere. Remove those callbacks again.

Signed-off-by: Juergen Gross <jgross@suse.com>
Message-Id: <20211117071617.19504-1-jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kvm/ioapic.h | 1 -
 arch/x86/kvm/irq.h    | 1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 27e61ff3ac3e8..f1b2b2a6ff4db 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -81,7 +81,6 @@ struct kvm_ioapic {
 	unsigned long irq_states[IOAPIC_NUM_PINS];
 	struct kvm_io_device dev;
 	struct kvm *kvm;
-	void (*ack_notifier)(void *opaque, int irq);
 	spinlock_t lock;
 	struct rtc_status rtc_status;
 	struct delayed_work eoi_inject;
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 650642b18d151..c2d7cfe82d004 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -56,7 +56,6 @@ struct kvm_pic {
 	struct kvm_io_device dev_master;
 	struct kvm_io_device dev_slave;
 	struct kvm_io_device dev_elcr;
-	void (*ack_notifier)(void *opaque, int irq);
 	unsigned long irq_states[PIC_NUM_PINS];
 };
 
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 3/9] KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 3/9] KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled Sasha Levin
@ 2021-12-13 14:27   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:27 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable; +Cc: tglx, mingo, bp, dave.hansen, x86, kvm

On 12/13/21 15:19, Sasha Levin wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> [ Upstream commit e90e51d5f01d2baae5dcce280866bbb96816e978 ]
> 
> There is nothing to synchronize if APICv is disabled, since neither
> other vCPUs nor assigned devices can set PIR.ON.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   arch/x86/kvm/vmx/vmx.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index dacdf2395f01a..4e212f04268bb 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7776,10 +7776,10 @@ static __init int hardware_setup(void)
>   		ple_window_shrink = 0;
>   	}
>   
> -	if (!cpu_has_vmx_apicv()) {
> +	if (!cpu_has_vmx_apicv())
>   		enable_apicv = 0;
> +	if (!enable_apicv)
>   		vmx_x86_ops.sync_pir_to_irr = NULL;
> -	}
>   
>   	if (cpu_has_vmx_tsc_scaling()) {
>   		kvm_has_tsc_control = true;
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 4/9] KVM: SEV: do not take kvm->lock when destroying
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 4/9] KVM: SEV: do not take kvm->lock when destroying Sasha Levin
@ 2021-12-13 14:27   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:27 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Sean Christopherson, tglx, mingo, bp, dave.hansen, x86, kvm

On 12/13/21 15:19, Sasha Levin wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> [ Upstream commit 10a37929efeb4c51a0069afdd537c4fa3831f6e5 ]
> 
> Taking the lock is useless since there are no other references,
> and there are already accesses (e.g. to sev->enc_context_owner)
> that do not take it.  So get rid of it.
> 
> Reviewed-by: Sean Christopherson <seanjc@google.com>
> Message-Id: <20211123005036.2954379-12-pbonzini@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   arch/x86/kvm/svm/sev.c | 4 ----
>   1 file changed, 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 134c4ea5e6ad8..ae8092f0d401e 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1842,8 +1842,6 @@ void sev_vm_destroy(struct kvm *kvm)
>   		return;
>   	}
>   
> -	mutex_lock(&kvm->lock);
> -
>   	/*
>   	 * Ensure that all guest tagged cache entries are flushed before
>   	 * releasing the pages back to the system for use. CLFLUSH will
> @@ -1863,8 +1861,6 @@ void sev_vm_destroy(struct kvm *kvm)
>   		}
>   	}
>   
> -	mutex_unlock(&kvm->lock);
> -
>   	sev_unbind_asid(kvm, sev->handle);
>   	sev_asid_free(sev);
>   }
> 

This is cosmetic only, so

NACK

Paolo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 5/9] KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 5/9] KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE Sasha Levin
@ 2021-12-13 14:27   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:27 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Vitaly Kuznetsov, Sean Christopherson, shuah, kvm, linux-kselftest

On 12/13/21 15:19, Sasha Levin wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> [ Upstream commit 908fa88e420f30dde6d80f092795a18ec72ca6d3 ]
> 
> With the elevated 'KVM_CAP_MAX_VCPUS' value kvm_create_max_vcpus test
> may hit RLIMIT_NOFILE limits:
> 
>   # ./kvm_create_max_vcpus
>   KVM_CAP_MAX_VCPU_ID: 4096
>   KVM_CAP_MAX_VCPUS: 1024
>   Testing creating 1024 vCPUs, with IDs 0...1023.
>   /dev/kvm not available (errno: 24), skipping test
> 
> Adjust RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be
> opened. Note, raising hard limit ('rlim_max') requires CAP_SYS_RESOURCE
> capability which is generally not needed to run kvm selftests (but without
> raising the limit the test is doomed to fail anyway).
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Message-Id: <20211123135953.667434-1-vkuznets@redhat.com>
> [Skip the test if the hard limit can be raised. - Paolo]
> Reviewed-by: Sean Christopherson <seanjc@google.com>
> Tested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   .../selftests/kvm/kvm_create_max_vcpus.c      | 30 +++++++++++++++++++
>   1 file changed, 30 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
> index 0299cd81b8ba2..aa3795cd7bd3d 100644
> --- a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
> +++ b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
> @@ -12,6 +12,7 @@
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <string.h>
> +#include <sys/resource.h>
>   
>   #include "test_util.h"
>   
> @@ -40,10 +41,39 @@ int main(int argc, char *argv[])
>   {
>   	int kvm_max_vcpu_id = kvm_check_cap(KVM_CAP_MAX_VCPU_ID);
>   	int kvm_max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
> +	/*
> +	 * Number of file descriptors reqired, KVM_CAP_MAX_VCPUS for vCPU fds +
> +	 * an arbitrary number for everything else.
> +	 */
> +	int nr_fds_wanted = kvm_max_vcpus + 100;
> +	struct rlimit rl;
>   
>   	pr_info("KVM_CAP_MAX_VCPU_ID: %d\n", kvm_max_vcpu_id);
>   	pr_info("KVM_CAP_MAX_VCPUS: %d\n", kvm_max_vcpus);
>   
> +	/*
> +	 * Check that we're allowed to open nr_fds_wanted file descriptors and
> +	 * try raising the limits if needed.
> +	 */
> +	TEST_ASSERT(!getrlimit(RLIMIT_NOFILE, &rl), "getrlimit() failed!");
> +
> +	if (rl.rlim_cur < nr_fds_wanted) {
> +		rl.rlim_cur = nr_fds_wanted;
> +		if (rl.rlim_max < nr_fds_wanted) {
> +			int old_rlim_max = rl.rlim_max;
> +			rl.rlim_max = nr_fds_wanted;
> +
> +			int r = setrlimit(RLIMIT_NOFILE, &rl);
> +			if (r < 0) {
> +				printf("RLIMIT_NOFILE hard limit is too low (%d, wanted %d)\n",
> +				       old_rlim_max, nr_fds_wanted);
> +				exit(KSFT_SKIP);
> +			}
> +		} else {
> +			TEST_ASSERT(!setrlimit(RLIMIT_NOFILE, &rl), "setrlimit() failed!");
> +		}
> +	}
> +
>   	/*
>   	 * Upstream KVM prior to 4.8 does not support KVM_CAP_MAX_VCPU_ID.
>   	 * Userspace is supposed to use KVM_CAP_MAX_VCPUS as the maximum ID
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 6/9] KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 6/9] KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN Sasha Levin
@ 2021-12-13 14:28   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:28 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Vitaly Kuznetsov, tglx, mingo, bp, dave.hansen, x86, kvm

On 12/13/21 15:19, Sasha Levin wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> [ Upstream commit feb627e8d6f69c9a319fe279710959efb3eba873 ]
> 
> Commit 63f5a1909f9e ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2}
> after KVM_RUN is broken") officially deprecated KVM_SET_CPUID{,2} ioctls
> after first successful KVM_RUN and promissed to make this sequence forbiden
> in 5.16. It's time to fulfil the promise.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Message-Id: <20211122175818.608220-3-vkuznets@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   arch/x86/kvm/mmu/mmu.c | 28 +++++++++++-----------------
>   arch/x86/kvm/x86.c     | 19 +++++++++++++++++++
>   2 files changed, 30 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 0a88cb4f731f4..31392e492ce5e 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5022,6 +5022,14 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	/*
>   	 * Invalidate all MMU roles to force them to reinitialize as CPUID
>   	 * information is factored into reserved bit calculations.
> +	 *
> +	 * Correctly handling multiple vCPU models with respect to paging and
> +	 * physical address properties) in a single VM would require tracking
> +	 * all relevant CPUID information in kvm_mmu_page_role. That is very
> +	 * undesirable as it would increase the memory requirements for
> +	 * gfn_track (see struct kvm_mmu_page_role comments).  For now that
> +	 * problem is swept under the rug; KVM's CPUID API is horrific and
> +	 * it's all but impossible to solve it without introducing a new API.
>   	 */
>   	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
>   	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
> @@ -5029,24 +5037,10 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	kvm_mmu_reset_context(vcpu);
>   
>   	/*
> -	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
> -	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
> -	 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
> -	 * faults due to reusing SPs/SPTEs.  Alert userspace, but otherwise
> -	 * sweep the problem under the rug.
> -	 *
> -	 * KVM's horrific CPUID ABI makes the problem all but impossible to
> -	 * solve, as correctly handling multiple vCPU models (with respect to
> -	 * paging and physical address properties) in a single VM would require
> -	 * tracking all relevant CPUID information in kvm_mmu_page_role.  That
> -	 * is very undesirable as it would double the memory requirements for
> -	 * gfn_track (see struct kvm_mmu_page_role comments), and in practice
> -	 * no sane VMM mucks with the core vCPU model on the fly.
> +	 * Changing guest CPUID after KVM_RUN is forbidden, see the comment in
> +	 * kvm_arch_vcpu_ioctl().
>   	 */
> -	if (vcpu->arch.last_vmentry_cpu != -1) {
> -		pr_warn_ratelimited("KVM: KVM_SET_CPUID{,2} after KVM_RUN may cause guest instability\n");
> -		pr_warn_ratelimited("KVM: KVM_SET_CPUID{,2} will fail after KVM_RUN starting with Linux 5.16\n");
> -	}
> +	KVM_BUG_ON(vcpu->arch.last_vmentry_cpu != -1, vcpu->kvm);
>   }
>   
>   void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b7aa845f7beee..2a584070833f9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5088,6 +5088,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>   		struct kvm_cpuid __user *cpuid_arg = argp;
>   		struct kvm_cpuid cpuid;
>   
> +		/*
> +		 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
> +		 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
> +		 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
> +		 * faults due to reusing SPs/SPTEs.  In practice no sane VMM mucks with
> +		 * the core vCPU model on the fly, so fail.
> +		 */
> +		r = -EINVAL;
> +		if (vcpu->arch.last_vmentry_cpu != -1)
> +			goto out;
> +
>   		r = -EFAULT;
>   		if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
>   			goto out;
> @@ -5098,6 +5109,14 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>   		struct kvm_cpuid2 __user *cpuid_arg = argp;
>   		struct kvm_cpuid2 cpuid;
>   
> +		/*
> +		 * KVM_SET_CPUID{,2} after KVM_RUN is forbidded, see the comment in
> +		 * KVM_SET_CPUID case above.
> +		 */
> +		r = -EINVAL;
> +		if (vcpu->arch.last_vmentry_cpu != -1)
> +			goto out;
> +
>   		r = -EFAULT;
>   		if (copy_from_user(&cpuid, cpuid_arg, sizeof(cpuid)))
>   			goto out;
> 

NACK

the error says "starting with Linux 5.16" for a reason... :)

Paolo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 7/9] KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 7/9] KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test Sasha Levin
@ 2021-12-13 14:28   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:28 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Vitaly Kuznetsov, shuah, seanjc, ricarkol, maz, kvm, linux-kselftest

On 12/13/21 15:19, Sasha Levin wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> [ Upstream commit 6c1186430a808f97e2052bd5d9eff12c5d5defb0 ]
> 
> hyperv_features's sole purpose is to test access to various Hyper-V MSRs
> and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN
> is deprecated and soon-to-be forbidden, avoid it by re-creating test VM
> for each sub-test.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Message-Id: <20211122175818.608220-2-vkuznets@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   .../selftests/kvm/x86_64/hyperv_features.c    | 140 +++++++++---------
>   1 file changed, 71 insertions(+), 69 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> index 91d88aaa98992..672915ce73d8f 100644
> --- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
> @@ -165,10 +165,10 @@ static void hv_set_cpuid(struct kvm_vm *vm, struct kvm_cpuid2 *cpuid,
>   	vcpu_set_cpuid(vm, VCPU_ID, cpuid);
>   }
>   
> -static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
> -				   struct kvm_cpuid2 *best)
> +static void guest_test_msrs_access(void)
>   {
>   	struct kvm_run *run;
> +	struct kvm_vm *vm;
>   	struct ucall uc;
>   	int stage = 0, r;
>   	struct kvm_cpuid_entry2 feat = {
> @@ -180,11 +180,34 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
>   	struct kvm_cpuid_entry2 dbg = {
>   		.function = HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES
>   	};
> -	struct kvm_enable_cap cap = {0};
> -
> -	run = vcpu_state(vm, VCPU_ID);
> +	struct kvm_cpuid2 *best;
> +	vm_vaddr_t msr_gva;
> +	struct kvm_enable_cap cap = {
> +		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
> +		.args = {1}
> +	};
> +	struct msr_data *msr;
>   
>   	while (true) {
> +		vm = vm_create_default(VCPU_ID, 0, guest_msr);
> +
> +		msr_gva = vm_vaddr_alloc_page(vm);
> +		memset(addr_gva2hva(vm, msr_gva), 0x0, getpagesize());
> +		msr = addr_gva2hva(vm, msr_gva);
> +
> +		vcpu_args_set(vm, VCPU_ID, 1, msr_gva);
> +		vcpu_enable_cap(vm, VCPU_ID, &cap);
> +
> +		vcpu_set_hv_cpuid(vm, VCPU_ID);
> +
> +		best = kvm_get_supported_hv_cpuid();
> +
> +		vm_init_descriptor_tables(vm);
> +		vcpu_init_descriptor_tables(vm, VCPU_ID);
> +		vm_install_exception_handler(vm, GP_VECTOR, guest_gp_handler);
> +
> +		run = vcpu_state(vm, VCPU_ID);
> +
>   		switch (stage) {
>   		case 0:
>   			/*
> @@ -315,6 +338,7 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
>   			 * capability enabled and guest visible CPUID bit unset.
>   			 */
>   			cap.cap = KVM_CAP_HYPERV_SYNIC2;
> +			cap.args[0] = 0;
>   			vcpu_enable_cap(vm, VCPU_ID, &cap);
>   			break;
>   		case 22:
> @@ -461,9 +485,9 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
>   
>   		switch (get_ucall(vm, VCPU_ID, &uc)) {
>   		case UCALL_SYNC:
> -			TEST_ASSERT(uc.args[1] == stage,
> -				    "Unexpected stage: %ld (%d expected)\n",
> -				    uc.args[1], stage);
> +			TEST_ASSERT(uc.args[1] == 0,
> +				    "Unexpected stage: %ld (0 expected)\n",
> +				    uc.args[1]);
>   			break;
>   		case UCALL_ABORT:
>   			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
> @@ -474,13 +498,14 @@ static void guest_test_msrs_access(struct kvm_vm *vm, struct msr_data *msr,
>   		}
>   
>   		stage++;
> +		kvm_vm_free(vm);
>   	}
>   }
>   
> -static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall,
> -				     void *input, void *output, struct kvm_cpuid2 *best)
> +static void guest_test_hcalls_access(void)
>   {
>   	struct kvm_run *run;
> +	struct kvm_vm *vm;
>   	struct ucall uc;
>   	int stage = 0, r;
>   	struct kvm_cpuid_entry2 feat = {
> @@ -493,10 +518,38 @@ static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall
>   	struct kvm_cpuid_entry2 dbg = {
>   		.function = HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES
>   	};
> -
> -	run = vcpu_state(vm, VCPU_ID);
> +	struct kvm_enable_cap cap = {
> +		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
> +		.args = {1}
> +	};
> +	vm_vaddr_t hcall_page, hcall_params;
> +	struct hcall_data *hcall;
> +	struct kvm_cpuid2 *best;
>   
>   	while (true) {
> +		vm = vm_create_default(VCPU_ID, 0, guest_hcall);
> +
> +		vm_init_descriptor_tables(vm);
> +		vcpu_init_descriptor_tables(vm, VCPU_ID);
> +		vm_install_exception_handler(vm, UD_VECTOR, guest_ud_handler);
> +
> +		/* Hypercall input/output */
> +		hcall_page = vm_vaddr_alloc_pages(vm, 2);
> +		hcall = addr_gva2hva(vm, hcall_page);
> +		memset(addr_gva2hva(vm, hcall_page), 0x0, 2 * getpagesize());
> +
> +		hcall_params = vm_vaddr_alloc_page(vm);
> +		memset(addr_gva2hva(vm, hcall_params), 0x0, getpagesize());
> +
> +		vcpu_args_set(vm, VCPU_ID, 2, addr_gva2gpa(vm, hcall_page), hcall_params);
> +		vcpu_enable_cap(vm, VCPU_ID, &cap);
> +
> +		vcpu_set_hv_cpuid(vm, VCPU_ID);
> +
> +		best = kvm_get_supported_hv_cpuid();
> +
> +		run = vcpu_state(vm, VCPU_ID);
> +
>   		switch (stage) {
>   		case 0:
>   			hcall->control = 0xdeadbeef;
> @@ -606,9 +659,9 @@ static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall
>   
>   		switch (get_ucall(vm, VCPU_ID, &uc)) {
>   		case UCALL_SYNC:
> -			TEST_ASSERT(uc.args[1] == stage,
> -				    "Unexpected stage: %ld (%d expected)\n",
> -				    uc.args[1], stage);
> +			TEST_ASSERT(uc.args[1] == 0,
> +				    "Unexpected stage: %ld (0 expected)\n",
> +				    uc.args[1]);
>   			break;
>   		case UCALL_ABORT:
>   			TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
> @@ -619,66 +672,15 @@ static void guest_test_hcalls_access(struct kvm_vm *vm, struct hcall_data *hcall
>   		}
>   
>   		stage++;
> +		kvm_vm_free(vm);
>   	}
>   }
>   
>   int main(void)
>   {
> -	struct kvm_cpuid2 *best;
> -	struct kvm_vm *vm;
> -	vm_vaddr_t msr_gva, hcall_page, hcall_params;
> -	struct kvm_enable_cap cap = {
> -		.cap = KVM_CAP_HYPERV_ENFORCE_CPUID,
> -		.args = {1}
> -	};
> -
> -	/* Test MSRs */
> -	vm = vm_create_default(VCPU_ID, 0, guest_msr);
> -
> -	msr_gva = vm_vaddr_alloc_page(vm);
> -	memset(addr_gva2hva(vm, msr_gva), 0x0, getpagesize());
> -	vcpu_args_set(vm, VCPU_ID, 1, msr_gva);
> -	vcpu_enable_cap(vm, VCPU_ID, &cap);
> -
> -	vcpu_set_hv_cpuid(vm, VCPU_ID);
> -
> -	best = kvm_get_supported_hv_cpuid();
> -
> -	vm_init_descriptor_tables(vm);
> -	vcpu_init_descriptor_tables(vm, VCPU_ID);
> -	vm_install_exception_handler(vm, GP_VECTOR, guest_gp_handler);
> -
>   	pr_info("Testing access to Hyper-V specific MSRs\n");
> -	guest_test_msrs_access(vm, addr_gva2hva(vm, msr_gva),
> -			       best);
> -	kvm_vm_free(vm);
> -
> -	/* Test hypercalls */
> -	vm = vm_create_default(VCPU_ID, 0, guest_hcall);
> -
> -	vm_init_descriptor_tables(vm);
> -	vcpu_init_descriptor_tables(vm, VCPU_ID);
> -	vm_install_exception_handler(vm, UD_VECTOR, guest_ud_handler);
> -
> -	/* Hypercall input/output */
> -	hcall_page = vm_vaddr_alloc_pages(vm, 2);
> -	memset(addr_gva2hva(vm, hcall_page), 0x0, 2 * getpagesize());
> -
> -	hcall_params = vm_vaddr_alloc_page(vm);
> -	memset(addr_gva2hva(vm, hcall_params), 0x0, getpagesize());
> -
> -	vcpu_args_set(vm, VCPU_ID, 2, addr_gva2gpa(vm, hcall_page), hcall_params);
> -	vcpu_enable_cap(vm, VCPU_ID, &cap);
> -
> -	vcpu_set_hv_cpuid(vm, VCPU_ID);
> -
> -	best = kvm_get_supported_hv_cpuid();
> +	guest_test_msrs_access();
>   
>   	pr_info("Testing access to Hyper-V hypercalls\n");
> -	guest_test_hcalls_access(vm, addr_gva2hva(vm, hcall_params),
> -				 addr_gva2hva(vm, hcall_page),
> -				 addr_gva2hva(vm, hcall_page) + getpagesize(),
> -				 best);
> -
> -	kvm_vm_free(vm);
> +	guest_test_hcalls_access();
>   }
> 

NACK

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 8/9] KVM: downgrade two BUG_ONs to WARN_ON_ONCE
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 8/9] KVM: downgrade two BUG_ONs to WARN_ON_ONCE Sasha Levin
@ 2021-12-13 14:28   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:28 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable; +Cc: kvm

On 12/13/21 15:19, Sasha Levin wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> [ Upstream commit 5f25e71e311478f9bb0a8ef49e7d8b95316491d7 ]
> 
> This is not an unrecoverable situation.  Users of kvm_read_guest_offset_cached
> and kvm_write_guest_offset_cached must expect the read/write to fail, and
> therefore it is possible to just return early with an error value.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   virt/kvm/kvm_main.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ce1847bc898b2..c6bfd4e15d28a 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3001,7 +3001,8 @@ int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
>   	int r;
>   	gpa_t gpa = ghc->gpa + offset;
>   
> -	BUG_ON(len + offset > ghc->len);
> +	if (WARN_ON_ONCE(len + offset > ghc->len))
> +		return -EINVAL;
>   
>   	if (slots->generation != ghc->generation) {
>   		if (__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len))
> @@ -3038,7 +3039,8 @@ int kvm_read_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
>   	int r;
>   	gpa_t gpa = ghc->gpa + offset;
>   
> -	BUG_ON(len + offset > ghc->len);
> +	if (WARN_ON_ONCE(len + offset > ghc->len))
> +		return -EINVAL;
>   
>   	if (slots->generation != ghc->generation) {
>   		if (__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len))
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH MANUALSEL 5.15 9/9] x86/kvm: remove unused ack_notifier callbacks
  2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 9/9] x86/kvm: remove unused ack_notifier callbacks Sasha Levin
@ 2021-12-13 14:28   ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2021-12-13 14:28 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable
  Cc: Juergen Gross, tglx, mingo, bp, dave.hansen, x86, kvm

On 12/13/21 15:19, Sasha Levin wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> [ Upstream commit 9dba4d24cbb5524dd39ab1e08886373b17f07ff2 ]
> 
> Commit f52447261bc8c2 ("KVM: irq ack notification") introduced an
> ack_notifier() callback in struct kvm_pic and in struct kvm_ioapic
> without using them anywhere. Remove those callbacks again.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Message-Id: <20211117071617.19504-1-jgross@suse.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>   arch/x86/kvm/ioapic.h | 1 -
>   arch/x86/kvm/irq.h    | 1 -
>   2 files changed, 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
> index 27e61ff3ac3e8..f1b2b2a6ff4db 100644
> --- a/arch/x86/kvm/ioapic.h
> +++ b/arch/x86/kvm/ioapic.h
> @@ -81,7 +81,6 @@ struct kvm_ioapic {
>   	unsigned long irq_states[IOAPIC_NUM_PINS];
>   	struct kvm_io_device dev;
>   	struct kvm *kvm;
> -	void (*ack_notifier)(void *opaque, int irq);
>   	spinlock_t lock;
>   	struct rtc_status rtc_status;
>   	struct delayed_work eoi_inject;
> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
> index 650642b18d151..c2d7cfe82d004 100644
> --- a/arch/x86/kvm/irq.h
> +++ b/arch/x86/kvm/irq.h
> @@ -56,7 +56,6 @@ struct kvm_pic {
>   	struct kvm_io_device dev_master;
>   	struct kvm_io_device dev_slave;
>   	struct kvm_io_device dev_elcr;
> -	void (*ack_notifier)(void *opaque, int irq);
>   	unsigned long irq_states[PIC_NUM_PINS];
>   };
>   
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Let's save 24 bytes per VM. :)

Paolo

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-12-13 14:28 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-13 14:19 [PATCH MANUALSEL 5.15 1/9] sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full Sasha Levin
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 2/9] timers/nohz: Last resort update jiffies on nohz_full IRQ entry Sasha Levin
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 3/9] KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled Sasha Levin
2021-12-13 14:27   ` Paolo Bonzini
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 4/9] KVM: SEV: do not take kvm->lock when destroying Sasha Levin
2021-12-13 14:27   ` Paolo Bonzini
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 5/9] KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE Sasha Levin
2021-12-13 14:27   ` Paolo Bonzini
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 6/9] KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN Sasha Levin
2021-12-13 14:28   ` Paolo Bonzini
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 7/9] KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test Sasha Levin
2021-12-13 14:28   ` Paolo Bonzini
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 8/9] KVM: downgrade two BUG_ONs to WARN_ON_ONCE Sasha Levin
2021-12-13 14:28   ` Paolo Bonzini
2021-12-13 14:19 ` [PATCH MANUALSEL 5.15 9/9] x86/kvm: remove unused ack_notifier callbacks Sasha Levin
2021-12-13 14:28   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).