* [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption
@ 2022-09-16  0:54 Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

There seem to be two problems with the way the arch.xen.shinfo_cache
instance of gfn_to_pfn_cache is treated.

1. gpc->lock is taken without checking whether it was actually
   initialized, e.g. in kvm_xen_set_evtchn_fast():

	read_lock_irqsave(&gpc->lock, flags);
	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
		goto out_rcu;

INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 2 PID: 959 Comm: xenirq Not tainted 6.0.0-rc5 #12
Call Trace:
 dump_stack_lvl+0x5b/0x77
 register_lock_class+0x46d/0x480
 __lock_acquire+0x64/0x1fa0
 lock_acquire+0xbf/0x2b0
 ? kvm_xen_set_evtchn_fast+0xc7/0x400 [kvm]
 ? lock_acquire+0xcf/0x2b0
 ? _raw_read_lock_irqsave+0x99/0xa0
 _raw_read_lock_irqsave+0x81/0xa0
 ? kvm_xen_set_evtchn_fast+0xc7/0x400 [kvm]
 kvm_xen_set_evtchn_fast+0xc7/0x400 [kvm]
 ? kvm_xen_set_evtchn_fast+0x7e/0x400 [kvm]
 ? find_held_lock+0x2b/0x80
 kvm_xen_hvm_evtchn_send+0x4b/0x90 [kvm]
 kvm_arch_vm_ioctl+0x4de/0xca0 [kvm]
 ? vmx_vcpu_put+0x18/0x1e0 [kvm_intel]
 ? kvm_arch_vcpu_put+0x1db/0x250 [kvm]
 ? vcpu_put+0x46/0x70 [kvm]
 ? kvm_arch_vcpu_ioctl+0xd0/0x1710 [kvm]
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? lock_is_held_type+0xe3/0x140
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? __do_fast_syscall_32+0xeb/0xf0
 ? lockdep_hardirqs_on+0x7d/0x100
 ? lock_is_held_type+0xe3/0x140
 ? do_syscall_64+0x67/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 ? do_syscall_64+0x67/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
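
The splat above is lockdep's generic report for a lock whose class key
was never registered, i.e. a dynamically allocated lock used before
being initialized.  A minimal sketch of the pattern (hypothetical code,
not KVM's):

	struct foo {
		rwlock_t lock;		/* never passed to rwlock_init() */
	};

	static void foo_read(struct foo *f)	/* f was kzalloc()ed */
	{
		unsigned long flags;

		/* lockdep: "trying to register non-static key" */
		read_lock_irqsave(&f->lock, flags);
		read_unlock_irqrestore(&f->lock, flags);
	}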

2. kvm_gfn_to_pfn_cache_init() allows for gpc->lock reinitialization.
   This can lead to a situation where a lock that is already held gets
   reinitialized by another thread (and becomes corrupted).

   For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
   kvm_gfn_to_pfn_cache_init():

                  (thread 1)                |           (thread 2)
                                            |
   kvm_xen_set_evtchn_fast                  |
    read_lock_irqsave(&gpc->lock, ...)      |
                                            | kvm_gfn_to_pfn_cache_init
                                            |  rwlock_init(&gpc->lock)
    read_unlock_irqrestore(&gpc->lock, ...) |
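
To see why reinitializing a held lock corrupts it, consider this
hypothetical sketch (not actual KVM code): rwlock_init() resets the
queued-rwlock state to "unlocked", so a writer can acquire the lock
while the original reader is still inside its critical section, and the
reader's subsequent unlock underflows the count:

	rwlock_init(&lock);
	read_lock(&lock);	/* reader count = 1 */
	rwlock_init(&lock);	/* state forcibly reset to "unlocked" */
	write_lock(&lock);	/* succeeds despite the active reader */
	read_unlock(&lock);	/* count underflows, lock is corrupted */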

Testing shinfo lock corruption (KVM_XEN_HVM_EVTCHN_SEND)
rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-...D
 } 26610 jiffies s: 265 root: 0x2/.
rcu: blocking rcu_node structures (internal RCU debug):
Task dump for CPU 1:
task:xen_shinfo_test state:R  running task     stack:    0 pid:  952
ppid:   867 flags:0x00000008
Call Trace:
 ? exc_page_fault+0x121/0x2b0
 ? rcu_read_lock_sched_held+0x10/0x80
 ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	1-...0: (5 ticks this GP) idle=6b94/1/0x4000000000000000
softirq=5929/5931 fqs=15261
	(detected by 0, t=65002 jiffies, g=5465, q=100 ncpus=4)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 952 Comm: xen_shinfo_test Not tainted 6.0.0-rc5 #12
RIP: 0010:queued_write_lock_slowpath+0x68/0x90
Call Trace:
 do_raw_write_lock+0xad/0xb0
 kvm_gfn_to_pfn_cache_refresh+0x2a5/0x630 [kvm]
 kvm_xen_hvm_set_attr+0x19d/0x5e0 [kvm]
 kvm_arch_vm_ioctl+0x8ca/0xca0 [kvm]
 ? rcu_read_lock_sched_held+0x10/0x80
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? rcu_read_lock_sched_held+0x10/0x80
 ? do_raw_write_trylock+0x29/0x40
 ? rcu_read_lock_sched_held+0x10/0x80
 ? lock_release+0x1ef/0x2d0
 ? lock_release+0x1ef/0x2d0
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? exc_page_fault+0x121/0x2b0
 ? rcu_read_lock_sched_held+0x10/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Testing shinfo lock corruption (KVM_XEN_HVM_EVTCHN_SEND)
BUG: kernel NULL pointer dereference, address: 0000000000000800
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 1049cf067 P4D 1049cf067 PUD 104ba0067 PMD 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 955 Comm: xen_shinfo_test Not tainted 6.0.0-rc5 #12
RIP: 0010:kvm_xen_set_evtchn_fast+0x10f/0x400 [kvm]
Call Trace:
 kvm_xen_hvm_evtchn_send+0x4b/0x90 [kvm]
 kvm_arch_vm_ioctl+0x4de/0xca0 [kvm]
 ? kvm_xen_hvm_evtchn_send+0x6e/0x90 [kvm]
 ? kvm_arch_vm_ioctl+0x4de/0xca0 [kvm]
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? rcu_read_lock_sched_held+0x10/0x80
 ? lock_release+0x1ef/0x2d0
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? rcu_read_lock_sched_held+0x10/0x80
 ? trace_hardirqs_on_prepare+0x55/0xe0
 ? do_syscall_64+0x67/0x80
 ? rcu_read_lock_sched_held+0x10/0x80
 ? trace_hardirqs_on_prepare+0x55/0xe0
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

   The story is similar for the handling of the SCHEDOP_poll hypercall
   in wait_pending_event():

Testing shinfo lock corruption (SCHEDOP_poll)
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 	1-...0: (12 ticks this GP) idle=0a5c/1/0x4000000000000000
softirq=6640/6640 fqs=12988
rcu: 	2-...0: (10 ticks this GP) idle=66e4/1/0x4000000000000000
softirq=5526/5527 fqs=12988
	(detected by 0, t=65003 jiffies, g=7437, q=732 ncpus=4)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_halt+0xa/0x10
Sending NMI from CPU 0 to CPUs 2:
NMI backtrace for cpu 2
CPU: 2 PID: 970 Comm: xen_shinfo_test Not tainted 6.0.0-rc5 #12
RIP: 0010:queued_write_lock_slowpath+0x66/0x90
Call Trace:
 do_raw_write_lock+0xad/0xb0
 kvm_gfn_to_pfn_cache_refresh+0x8a/0x630 [kvm]
 ? kvm_gfn_to_pfn_cache_init+0x122/0x130 [kvm]
 kvm_xen_hvm_set_attr+0x19d/0x5e0 [kvm]
 kvm_arch_vm_ioctl+0x8ca/0xca0 [kvm]
 ? __lock_acquire+0x3a4/0x1fa0
 ? __lock_acquire+0x3a4/0x1fa0
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? lock_is_held_type+0xe3/0x140
 ? lock_release+0x135/0x2d0
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Testing shinfo lock corruption (SCHEDOP_poll)
watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kworker/1:2:260]
irq event stamp: 6990
hardirqs last  enabled at (6989): [<ffffffff81e5c964>]
_raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (6990): [<ffffffff81e53e51>]
__schedule+0xd41/0x1620
softirqs last  enabled at (5790): [<ffffffff81766e78>]
rht_deferred_worker+0x708/0xbe0
softirqs last disabled at (5788): [<ffffffff81766967>]
rht_deferred_worker+0x1f7/0xbe0
CPU: 1 PID: 260 Comm: kworker/1:2 Not tainted 6.0.0-rc5 #12
Workqueue: rcu_gp wait_rcu_exp_gp
RIP: 0010:smp_call_function_single+0x11a/0x160
Call Trace:
 ? trace_hardirqs_on+0x2b/0xd0
 __sync_rcu_exp_select_node_cpus+0x267/0x460
 sync_rcu_exp_select_cpus+0x1ec/0x3e0
 wait_rcu_exp_gp+0xf/0x20
 process_one_work+0x254/0x560
 worker_thread+0x4f/0x390
 ? _raw_spin_unlock_irqrestore+0x40/0x60
 ? process_one_work+0x560/0x560
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30

I'm providing a set of patches that check whether the shinfo lock was
initialized and disallow lock reinitialization, along with crudely made
test cases.

Note: as I understand it, the kvm->lock mutex cannot be used to protect
against these races, because kvm_xen_set_evtchn_fast() can be called
from kvm_arch_set_irq_inatomic()?
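
The call path I have in mind, roughly:

	kvm_arch_set_irq_inatomic
	 kvm_xen_set_evtchn_fast
	  read_lock_irqsave(&gpc->lock, ...)

runs in atomic context, where sleeping on a mutex is not allowed.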

I'm sending this as an RFC, as I have doubts whether explicitly
disallowing reinitialization this way is the most elegant solution,
especially since the problem appears to affect only the shinfo
gfn_to_pfn_cache.

Michal Luczaj (4):
  KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache
  KVM: x86/xen: Ensure kvm_xen_schedop_poll() can use shinfo_cache
  KVM: x86/xen: Disallow gpc locks reinitialization
  KVM: x86/xen: Test shinfo_cache lock races

 arch/x86/kvm/xen.c                            |   5 +-
 include/linux/kvm_types.h                     |   1 +
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 100 ++++++++++++++++++
 virt/kvm/pfncache.c                           |   7 +-
 4 files changed, 110 insertions(+), 3 deletions(-)

-- 
2.37.2


* [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() " Michal Luczaj
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

Before taking gpc->lock, ensure it has been initialized.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/xen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 280cb5dc7341..e32c2cf06223 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1348,7 +1348,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
 	}
 
-	if (!vcpu->arch.xen.vcpu_info_cache.active)
+	if (!vcpu->arch.xen.vcpu_info_cache.active || !gpc->active)
 		return -EINVAL;
 
 	if (xe->port >= max_evtchn_port(kvm))
-- 
2.37.2


* [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() can use shinfo_cache
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races Michal Luczaj
  3 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

Before taking gpc->lock, ensure it has been initialized.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/xen.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index e32c2cf06223..c5d431a54afa 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -965,6 +965,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 	bool ret = true;
 	int idx, i;
 
+	if (!gpc->active)
+		return true;
+
 	read_lock_irqsave(&gpc->lock, flags);
 	idx = srcu_read_lock(&kvm->srcu);
 	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
-- 
2.37.2


* [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() " Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  2022-09-16 17:12   ` Sean Christopherson
  2022-09-16  0:54 ` [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races Michal Luczaj
  3 siblings, 1 reply; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

There are race conditions possible due to kvm_gfn_to_pfn_cache_init()'s
ability to _re_initialize gfn_to_pfn_cache.lock.

For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.

                (thread 1)                |           (thread 2)
                                          |
 kvm_xen_set_evtchn_fast                  |
  read_lock_irqsave(&gpc->lock, ...)      |
                                          | kvm_gfn_to_pfn_cache_init
                                          |  rwlock_init(&gpc->lock)
  read_unlock_irqrestore(&gpc->lock, ...) |

Introduce bool locks_initialized.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 include/linux/kvm_types.h | 1 +
 virt/kvm/pfncache.c       | 7 +++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 3ca3db020e0e..7e7b7667cd9e 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -74,6 +74,7 @@ struct gfn_to_pfn_cache {
 	void *khva;
 	kvm_pfn_t pfn;
 	enum pfn_cache_usage usage;
+	bool locks_initialized;
 	bool active;
 	bool valid;
 };
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 68ff41d39545..564607e10586 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -354,8 +354,11 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
 
 	if (!gpc->active) {
-		rwlock_init(&gpc->lock);
-		mutex_init(&gpc->refresh_lock);
+		if (!gpc->locks_initialized) {
+			rwlock_init(&gpc->lock);
+			mutex_init(&gpc->refresh_lock);
+			gpc->locks_initialized = true;
+		}
 
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
-- 
2.37.2


* [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
                   ` (2 preceding siblings ...)
  2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  3 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

Add tests for races between shinfo_cache initialization/destruction
(causing lock reinitialization) and hypercall/ioctl processing
(acquiring an uninitialized lock, or holding a soon-to-be-corrupted
lock).

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 100 ++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 8a5cb800f50e..8e251b2bfa62 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -15,9 +15,13 @@
 #include <time.h>
 #include <sched.h>
 #include <signal.h>
+#include <pthread.h>
 
 #include <sys/eventfd.h>
 
+/* Defined in include/linux/kvm_types.h */
+#define GPA_INVALID		(~(ulong)0)
+
 #define SHINFO_REGION_GVA	0xc0000000ULL
 #define SHINFO_REGION_GPA	0xc0000000ULL
 #define SHINFO_REGION_SLOT	10
@@ -44,6 +48,8 @@
 
 #define MIN_STEAL_TIME		50000
 
+#define SHINFO_RACE_TIMEOUT	2	/* seconds */
+
 #define __HYPERVISOR_set_timer_op	15
 #define __HYPERVISOR_sched_op		29
 #define __HYPERVISOR_event_channel_op	32
@@ -325,6 +331,32 @@ static void guest_code(void)
 	guest_wait_for_irq();
 
 	GUEST_SYNC(21);
+	/* Racing host ioctls */
+
+	guest_wait_for_irq();
+
+	GUEST_SYNC(22);
+	/* Racing vmcall against host ioctl */
+
+	ports[0] = 0;
+
+	p = (struct sched_poll) {
+		.ports = ports,
+		.nr_ports = 1,
+		.timeout = 0
+	};
+
+	do {
+		asm volatile("vmcall"
+			     : "=a" (rax)
+			     : "a" (__HYPERVISOR_sched_op),
+			       "D" (SCHEDOP_poll),
+			       "S" (&p)
+			     : "memory");
+	} while (!guest_saw_irq);
+	guest_saw_irq = false;
+
+	GUEST_SYNC(23);
 }
 
 static int cmp_timespec(struct timespec *a, struct timespec *b)
@@ -352,11 +384,36 @@ static void handle_alrm(int sig)
 	TEST_FAIL("IRQ delivery timed out");
 }
 
+static void *juggle_shinfo_state(void *arg)
+{
+	struct kvm_vm *vm = (struct kvm_vm *)arg;
+
+	struct kvm_xen_hvm_attr cache_init = {
+		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+		.u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE
+	};
+
+	struct kvm_xen_hvm_attr cache_destroy = {
+		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+		.u.shared_info.gfn = GPA_INVALID
+	};
+
+	for (;;) {
+		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_init);
+		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_destroy);
+		pthread_testcancel();
+	}
+
+	return NULL;
+}
+
 int main(int argc, char *argv[])
 {
 	struct timespec min_ts, max_ts, vm_ts;
 	struct kvm_vm *vm;
+	pthread_t thread;
 	bool verbose;
+	int ret;
 
 	verbose = argc > 1 && (!strncmp(argv[1], "-v", 3) ||
 			       !strncmp(argv[1], "--verbose", 10));
@@ -785,6 +842,49 @@ int main(int argc, char *argv[])
 			case 21:
 				TEST_ASSERT(!evtchn_irq_expected,
 					    "Expected event channel IRQ but it didn't happen");
+				alarm(0);
+
+				if (verbose)
+					printf("Testing shinfo lock corruption (KVM_XEN_HVM_EVTCHN_SEND)\n");
+
+				ret = pthread_create(&thread, NULL, &juggle_shinfo_state, (void *)vm);
+				TEST_ASSERT(ret == 0, "pthread_create() failed: %s", strerror(ret));
+
+				struct kvm_irq_routing_xen_evtchn uxe = {
+					.port = 1,
+					.vcpu = vcpu->id,
+					.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL
+				};
+
+				evtchn_irq_expected = true;
+				for (time_t t = time(NULL) + SHINFO_RACE_TIMEOUT; time(NULL) < t;)
+					__vm_ioctl(vm, KVM_XEN_HVM_EVTCHN_SEND, &uxe);
+				break;
+
+			case 22:
+				TEST_ASSERT(!evtchn_irq_expected,
+					    "Expected event channel IRQ but it didn't happen");
+
+				if (verbose)
+					printf("Testing shinfo lock corruption (SCHEDOP_poll)\n");
+
+				shinfo->evtchn_pending[0] = 1;
+
+				evtchn_irq_expected = true;
+				tmr.u.timer.expires_ns = rs->state_entry_time +
+							 SHINFO_RACE_TIMEOUT * 1000000000ULL;
+				vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &tmr);
+				break;
+
+			case 23:
+				TEST_ASSERT(!evtchn_irq_expected,
+					    "Expected event channel IRQ but it didn't happen");
+
+				ret = pthread_cancel(thread);
+				TEST_ASSERT(ret == 0, "pthread_cancel() failed: %s", strerror(ret));
+
+				ret = pthread_join(thread, 0);
+				TEST_ASSERT(ret == 0, "pthread_join() failed: %s", strerror(ret));
 				goto done;
 
 			case 0x20:
-- 
2.37.2


* Re: [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization
  2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
@ 2022-09-16 17:12   ` Sean Christopherson
  2022-09-18 23:13     ` Michal Luczaj
                       ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-09-16 17:12 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini, shuah

On Fri, Sep 16, 2022, Michal Luczaj wrote:
> There are race conditions possible due to kvm_gfn_to_pfn_cache_init()'s
> ability to _re_initialize gfn_to_pfn_cache.lock.
> 
> For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
> kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.
> 
>                 (thread 1)                |           (thread 2)
>                                           |
>  kvm_xen_set_evtchn_fast                  |
>   read_lock_irqsave(&gpc->lock, ...)      |
>                                           | kvm_gfn_to_pfn_cache_init
>                                           |  rwlock_init(&gpc->lock)
>   read_unlock_irqrestore(&gpc->lock, ...) |
> 

Please explicitly include a sample call stack for reaching kvm_gfn_to_pfn_cache_init().
Without that, it's difficult to understand if this is a bug in the gfn_to_pfn_cache
code, or if it's a bug in the caller.

> Introduce bool locks_initialized.
> 
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> ---
>  include/linux/kvm_types.h | 1 +
>  virt/kvm/pfncache.c       | 7 +++++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index 3ca3db020e0e..7e7b7667cd9e 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -74,6 +74,7 @@ struct gfn_to_pfn_cache {
>  	void *khva;
>  	kvm_pfn_t pfn;
>  	enum pfn_cache_usage usage;
> +	bool locks_initialized;
>  	bool active;
>  	bool valid;
>  };
> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index 68ff41d39545..564607e10586 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -354,8 +354,11 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
>  	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
>  
>  	if (!gpc->active) {
> -		rwlock_init(&gpc->lock);
> -		mutex_init(&gpc->refresh_lock);
> +		if (!gpc->locks_initialized) {

Rather than add another flag, move the lock initialization to another helper and
call the new helper from e.g. kvm_xen_init_vm() and kvm_xen_init_vcpu().  There
is zero reason to initialize locks on-demand.  That way, patches 1 and 2 aren't
necessary because gpc->lock is always valid.

And then at the same time, rename "cache_init" and "cache_destroy" to
activate+deactivate to avoid implying that the cache really is destroyed/freed.

Adding a true init() API will also allow for future cleanups.  @kvm, @vcpu, @len,
and @usage all should be immutable in the sense that they are properties of the
cache, i.e. can be moved into init().  The nice side effect of moving most of that
stuff into init() is that it makes it very obvious from the activate() call sites
that the gpa is the only mutable information.

I.e. as additional patches, do:

  1. Formalize "gpc" as the acronym and use it in function names to reduce line
     lengths.  Maybe keep the long name for gfn_to_pfn_cache_invalidate_start()
     though?  Since that is a very different API.

  2. Snapshot @usage during kvm_gpc_init().

  3. Add a KVM backpointer in "struct gfn_to_pfn_cache" and snapshot it during
     initialization.  The extra memory cost is negligible in the grand scheme,
     and not having to constantly provide @kvm makes the call sites prettier, and
     it avoids weirdness where @kvm is mandatory but @vcpu is not.

  4. Add a backpointer for @vcpu too, since again the memory overhead is minor,
     and taking @vcpu in "activate" implies that it's legal to share a cache
     between multiple vCPUs, which is not the case.  And opportunistically add a
     WARN to assert that @vcpu is non-NULL if KVM_GUEST_USES_PFN. 

  5. Snapshot @len during kvm_gpc_init().

so that we end up with something like (completely untested):

	bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
	int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
	void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc)

	void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
			  struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
			  unsigned long len)
	{
		WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
		WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu);

		rwlock_init(&gpc->lock);
		mutex_init(&gpc->refresh_lock);

		gpc->kvm = kvm;
		gpc->vcpu = vcpu;
		gpc->usage = usage;
		gpc->len = len;
	}
	EXPORT_SYMBOL_GPL(kvm_gpc_init);

	int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
	{
		if (!gpc->active) {
			gpc->khva = NULL;
			gpc->pfn = KVM_PFN_ERR_FAULT;
			gpc->uhva = KVM_HVA_ERR_BAD;
			gpc->valid = false;
			gpc->active = true;

			spin_lock(&gpc->kvm->gpc_lock);
			list_add(&gpc->list, &gpc->kvm->gpc_list);
			spin_unlock(&gpc->kvm->gpc_lock);
		}
		return kvm_gpc_refresh(gpc, gpa, gpc->len);
	}
	EXPORT_SYMBOL_GPL(kvm_gpc_activate);

	void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
	{
		if (gpc->active) {
			spin_lock(&gpc->kvm->gpc_lock);
			list_del(&gpc->list);
			spin_unlock(&gpc->kvm->gpc_lock);

			kvm_gpc_unmap(gpc);
			gpc->active = false;
		}
	}
	EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);
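
For instance, kvm_xen_init_vm() would then look something like (again,
untested):

	void kvm_xen_init_vm(struct kvm *kvm)
	{
		idr_init(&kvm->arch.xen.evtchn_ports);
		kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL,
			     KVM_HOST_USES_PFN, PAGE_SIZE);
	}

and kvm_xen_shared_info_init() would simply call kvm_gpc_activate() or
kvm_gpc_deactivate().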

Let me know if you want to take on the above cleanups; if not, I'll add
them to my todo list.

Thanks!

* Re: [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization
  2022-09-16 17:12   ` Sean Christopherson
@ 2022-09-18 23:13     ` Michal Luczaj
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-18 23:13 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, pbonzini, shuah

On 9/16/22 19:12, Sean Christopherson wrote:
> On Fri, Sep 16, 2022, Michal Luczaj wrote:
>> For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
>> kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.
>>
>>                 (thread 1)                |           (thread 2)
>>                                           |
>>  kvm_xen_set_evtchn_fast                  |
>>   read_lock_irqsave(&gpc->lock, ...)      |
>>                                           | kvm_gfn_to_pfn_cache_init
>>                                           |  rwlock_init(&gpc->lock)
>>   read_unlock_irqrestore(&gpc->lock, ...) |
>>
> 
> Please explicitly include a sample call stack for reaching kvm_gfn_to_pfn_cache_init().
> Without that, it's difficult to understand if this is a bug in the gfn_to_pfn_cache
> code, or if it's a bug in the caller.

OK, I'll try to be more specific.

> Rather than add another flag, (...)
> Let me know if you want to take on the above cleanups; if not, I'll add
> them to my todo list.

Sure, I'll do it.

Thanks for all the suggestions,
Michal


* [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix
  2022-09-16 17:12   ` Sean Christopherson
  2022-09-18 23:13     ` Michal Luczaj
@ 2022-09-21  2:01     ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
                         ` (7 more replies)
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2 siblings, 8 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Here are some cleanups following Sean's suggestions and a single fix
for a race condition.

Michal Luczaj (8):
  KVM: x86: Add initializer for gfn_to_pfn_cache
  KVM: x86: Shorten gfn_to_pfn_cache function names
  KVM: x86: Remove unused argument in gpc_unmap_khva()
  KVM: x86: Store immutable gfn_to_pfn_cache properties
  KVM: x86: Clean up kvm_gpc_check()
  KVM: x86: Clean up hva_to_pfn_retry()
  KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap()
  KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()

 arch/x86/kvm/x86.c        | 24 ++++++------
 arch/x86/kvm/xen.c        | 78 +++++++++++++++++--------------------
 include/linux/kvm_host.h  | 64 ++++++++++++++++---------------
 include/linux/kvm_types.h |  2 +
 virt/kvm/pfncache.c       | 81 +++++++++++++++++++++------------------
 5 files changed, 128 insertions(+), 121 deletions(-)

-- 
2.37.3


* [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-10-10 23:38         ` Sean Christopherson
  2022-09-21  2:01       ` [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
                         ` (6 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Move the gfn_to_pfn_cache lock initialization to another helper and
call the new helper during VM/vCPU creation.

Rename "cache_init" and "cache_destroy" to activate+deactivate to
avoid implying that the cache really is destroyed/freed.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       | 12 +++++----
 arch/x86/kvm/xen.c       | 57 +++++++++++++++++++++-------------------
 include/linux/kvm_host.h | 23 +++++++++++-----
 virt/kvm/pfncache.c      | 21 ++++++++-------
 4 files changed, 65 insertions(+), 48 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 43a6a7efc6ec..15032b7f0589 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2301,11 +2301,11 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
 
 	/* we verify if the enable bit is set... */
 	if (system_time & 1) {
-		kvm_gfn_to_pfn_cache_init(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
-					  KVM_HOST_USES_PFN, system_time & ~1ULL,
-					  sizeof(struct pvclock_vcpu_time_info));
+		kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
+				 KVM_HOST_USES_PFN, system_time & ~1ULL,
+				 sizeof(struct pvclock_vcpu_time_info));
 	} else {
-		kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time);
+		kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
 	}
 
 	return;
@@ -3374,7 +3374,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
 
 static void kvmclock_reset(struct kvm_vcpu *vcpu)
 {
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
 	vcpu->arch.time = 0;
 }
 
@@ -11551,6 +11551,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.regs_avail = ~0;
 	vcpu->arch.regs_dirty = ~0;
 
+	kvm_gpc_init(&vcpu->arch.pv_time);
+
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 	else
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 280cb5dc7341..cecf8299b187 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -42,13 +42,13 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
 	int idx = srcu_read_lock(&kvm->srcu);
 
 	if (gfn == GPA_INVALID) {
-		kvm_gfn_to_pfn_cache_destroy(kvm, gpc);
+		kvm_gpc_deactivate(kvm, gpc);
 		goto out;
 	}
 
 	do {
-		ret = kvm_gfn_to_pfn_cache_init(kvm, gpc, NULL, KVM_HOST_USES_PFN,
-						gpa, PAGE_SIZE);
+		ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa,
+				       PAGE_SIZE);
 		if (ret)
 			goto out;
 
@@ -554,15 +554,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			     offsetof(struct compat_vcpu_info, time));
 
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+			kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.vcpu_info_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct vcpu_info));
+		r = kvm_gpc_activate(vcpu->kvm,
+				     &vcpu->arch.xen.vcpu_info_cache, NULL,
+				     KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct vcpu_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -570,16 +570,16 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-						     &vcpu->arch.xen.vcpu_time_info_cache);
+			kvm_gpc_deactivate(vcpu->kvm,
+					   &vcpu->arch.xen.vcpu_time_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.vcpu_time_info_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct pvclock_vcpu_time_info));
+		r = kvm_gpc_activate(vcpu->kvm,
+				     &vcpu->arch.xen.vcpu_time_info_cache,
+				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct pvclock_vcpu_time_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 		break;
@@ -590,16 +590,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			break;
 		}
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-						     &vcpu->arch.xen.runstate_cache);
+			kvm_gpc_deactivate(vcpu->kvm,
+					   &vcpu->arch.xen.runstate_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.runstate_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct vcpu_runstate_info));
+		r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache,
+				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct vcpu_runstate_info));
 		break;
 
 	case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT:
@@ -1817,7 +1816,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.xen.vcpu_id = vcpu->vcpu_idx;
 	vcpu->arch.xen.poll_evtchn = 0;
+
 	timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
+
+	kvm_gpc_init(&vcpu->arch.xen.runstate_cache);
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache);
 }
 
 void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
@@ -1825,18 +1829,17 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 	if (kvm_xen_timer_enabled(vcpu))
 		kvm_xen_stop_timer(vcpu);
 
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.runstate_cache);
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_info_cache);
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache);
+
 	del_timer_sync(&vcpu->arch.xen.poll_timer);
 }
 
 void kvm_xen_init_vm(struct kvm *kvm)
 {
 	idr_init(&kvm->arch.xen.evtchn_ports);
+	kvm_gpc_init(&kvm->arch.xen.shinfo_cache);
 }
 
 void kvm_xen_destroy_vm(struct kvm *kvm)
@@ -1844,7 +1847,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm)
 	struct evtchnfd *evtchnfd;
 	int i;
 
-	kvm_gfn_to_pfn_cache_destroy(kvm, &kvm->arch.xen.shinfo_cache);
+	kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache);
 
 	idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) {
 		if (!evtchnfd->deliver.port.port)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4519d3689e1..9fd67026d91a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1241,8 +1241,17 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
 void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 /**
- * kvm_gfn_to_pfn_cache_init - prepare a cached kernel mapping and HPA for a
- *                             given guest physical address.
+ * kvm_gpc_init - initialize gfn_to_pfn_cache.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ *
+ * This sets up a gfn_to_pfn_cache by initializing locks.
+ */
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
+
+/**
+ * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
+ *                    physical address.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1266,9 +1275,9 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
  * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before
  * accessing the target page.
  */
-int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-			      struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-			      gpa_t gpa, unsigned long len);
+int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		     gpa_t gpa, unsigned long len);
 
 /**
  * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache.
@@ -1325,7 +1334,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 /**
- * kvm_gfn_to_pfn_cache_destroy - destroy and unlink a gfn_to_pfn_cache.
+ * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1333,7 +1342,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
  * This removes a cache from the @kvm's list to be processed on MMU notifier
  * invocation.
  */
-void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 68ff41d39545..08f97cf97264 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -346,17 +346,20 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 }
 EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
 
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
+{
+	rwlock_init(&gpc->lock);
+	mutex_init(&gpc->refresh_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_init);
 
-int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-			      struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-			      gpa_t gpa, unsigned long len)
+int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		     gpa_t gpa, unsigned long len)
 {
 	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
 
 	if (!gpc->active) {
-		rwlock_init(&gpc->lock);
-		mutex_init(&gpc->refresh_lock);
-
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
@@ -371,9 +374,9 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	}
 	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_init);
+EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
-void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
 		spin_lock(&kvm->gpc_lock);
@@ -384,4 +387,4 @@ void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		gpc->active = false;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_destroy);
+EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);
-- 
2.37.3


* [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2022-09-21  2:01       ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Formalize "gpc" as the acronym and use it in function names.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  8 ++++----
 arch/x86/kvm/xen.c       | 29 ++++++++++++++---------------
 include/linux/kvm_host.h | 21 ++++++++++-----------
 virt/kvm/pfncache.c      | 20 ++++++++++----------
 4 files changed, 38 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 15032b7f0589..45136ce7185e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3020,12 +3020,12 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 	unsigned long flags;
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   offset + sizeof(*guest_hv_clock))) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+			      offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa,
-						 offset + sizeof(*guest_hv_clock)))
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+				    offset + sizeof(*guest_hv_clock)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index cecf8299b187..361f77dc7a3d 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -218,15 +218,14 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 		user_len = sizeof(struct compat_vcpu_runstate_info);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   user_len)) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/* When invoked from kvm_sched_out() we cannot sleep */
 		if (state == RUNSTATE_runnable)
 			return;
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, user_len))
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -352,12 +351,12 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	 * little more honest about it.
 	 */
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+			      sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa,
-						 sizeof(struct vcpu_info)))
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+				    sizeof(struct vcpu_info)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -417,8 +416,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		     sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending));
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+			      sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/*
@@ -432,8 +431,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		if (in_atomic() || !task_is_running(current))
 			return 1;
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa,
-						 sizeof(struct vcpu_info))) {
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+				    sizeof(struct vcpu_info))) {
 			/*
 			 * If this failed, userspace has screwed up the
 			 * vcpu_info mapping. No interrupts for you.
@@ -966,7 +965,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 
 	read_lock_irqsave(&gpc->lock, flags);
 	idx = srcu_read_lock(&kvm->srcu);
-	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	ret = false;
@@ -1358,7 +1357,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	idx = srcu_read_lock(&kvm->srcu);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
@@ -1392,7 +1391,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		gpc = &vcpu->arch.xen.vcpu_info_cache;
 
 		read_lock_irqsave(&gpc->lock, flags);
-		if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) {
+		if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * Could not access the vcpu_info. Set the bit in-kernel
 			 * and prod the vCPU to deliver it for itself.
@@ -1490,7 +1489,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 			break;
 
 		idx = srcu_read_lock(&kvm->srcu);
-		rc = kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
+		rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
 		srcu_read_unlock(&kvm->srcu, idx);
 	} while(!rc);
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9fd67026d91a..f687e56c24bc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1271,16 +1271,15 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
  *                 -EFAULT for an untranslatable guest physical address.
  *
  * This primes a gfn_to_pfn_cache and links it into the @kvm's list for
- * invalidations to be processed.  Callers are required to use
- * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before
- * accessing the target page.
+ * invalidations to be processed.  Callers are required to use kvm_gpc_check()
+ * to ensure that the cache is valid before accessing the target page.
  */
 int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
 		     gpa_t gpa, unsigned long len);
 
 /**
- * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache.
+ * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1297,11 +1296,11 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
  * Callers in IN_GUEST_MODE may do so without locking, although they should
  * still hold a read lock on kvm->scru for the memslot checks.
  */
-bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				gpa_t gpa, unsigned long len);
+bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		   unsigned long len);
 
 /**
- * kvm_gfn_to_pfn_cache_refresh - update a previously initialized cache.
+ * kvm_gpc_refresh - update a previously initialized cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1318,11 +1317,11 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
  * still lock and check the cache status, as this function does not return
  * with the lock still held to permit access.
  */
-int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				 gpa_t gpa, unsigned long len);
+int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		    unsigned long len);
 
 /**
- * kvm_gfn_to_pfn_cache_unmap - temporarily unmap a gfn_to_pfn_cache.
+ * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1331,7 +1330,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
  * but at least the mapping from GPA to userspace HVA will remain cached
  * and can be reused on a subsequent refresh.
  */
-void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 08f97cf97264..cc65fab0dbef 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -76,8 +76,8 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 	}
 }
 
-bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				gpa_t gpa, unsigned long len)
+bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		   unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 
@@ -93,7 +93,7 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_check);
+EXPORT_SYMBOL_GPL(kvm_gpc_check);
 
 static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva)
 {
@@ -235,8 +235,8 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 	return -EFAULT;
 }
 
-int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				 gpa_t gpa, unsigned long len)
+int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		    unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
@@ -317,9 +317,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_refresh);
+EXPORT_SYMBOL_GPL(kvm_gpc_refresh);
 
-void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	void *old_khva;
 	kvm_pfn_t old_pfn;
@@ -344,7 +344,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 
 	gpc_unmap_khva(kvm, old_pfn, old_khva);
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
+EXPORT_SYMBOL_GPL(kvm_gpc_unmap);
 
 void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
 {
@@ -372,7 +372,7 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		list_add(&gpc->list, &kvm->gpc_list);
 		spin_unlock(&kvm->gpc_lock);
 	}
-	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
+	return kvm_gpc_refresh(kvm, gpc, gpa, len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
@@ -383,7 +383,7 @@ void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		list_del(&gpc->list);
 		spin_unlock(&kvm->gpc_lock);
 
-		kvm_gfn_to_pfn_cache_unmap(kvm, gpc);
+		kvm_gpc_unmap(kvm, gpc);
 		gpc->active = false;
 	}
 }
-- 
2.37.3


* [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva()
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2022-09-21  2:01       ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
  2022-09-21  2:01       ` [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

The first argument is never used, so remove it.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 virt/kvm/pfncache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index cc65fab0dbef..76f1b669cf28 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -95,7 +95,7 @@ bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_check);
 
-static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva)
+static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva)
 {
 	/* Unmap the old pfn/page if it was mapped before. */
 	if (!is_error_noslot_pfn(pfn) && khva) {
@@ -174,7 +174,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 			 * the existing mapping and didn't create a new one.
 			 */
 			if (new_khva != old_khva)
-				gpc_unmap_khva(kvm, new_pfn, new_khva);
+				gpc_unmap_khva(new_pfn, new_khva);
 
 			kvm_release_pfn_clean(new_pfn);
 
@@ -313,7 +313,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 	mutex_unlock(&gpc->refresh_lock);
 
 	if (old_pfn != new_pfn)
-		gpc_unmap_khva(kvm, old_pfn, old_khva);
+		gpc_unmap_khva(old_pfn, old_khva);
 
 	return ret;
 }
@@ -342,7 +342,7 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 	write_unlock_irq(&gpc->lock);
 	mutex_unlock(&gpc->refresh_lock);
 
-	gpc_unmap_khva(kvm, old_pfn, old_khva);
+	gpc_unmap_khva(old_pfn, old_khva);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_unmap);
 
-- 
2.37.3


* [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (2 preceding siblings ...)
  2022-09-21  2:01       ` [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-10-10 23:42         ` Sean Christopherson
  2022-10-11  0:37         ` Sean Christopherson
  2022-09-21  2:01       ` [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj
                         ` (3 subsequent siblings)
  7 siblings, 2 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Move the assignment of the immutable properties @kvm, @vcpu, @usage
and @len to the initializer, and make kvm_gpc_activate() and
kvm_gpc_deactivate() use the stored values.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c        | 16 ++++++-------
 arch/x86/kvm/xen.c        | 50 ++++++++++++++++++---------------------
 include/linux/kvm_host.h  | 40 +++++++++++++++----------------
 include/linux/kvm_types.h |  2 ++
 virt/kvm/pfncache.c       | 36 +++++++++++++++-------------
 5 files changed, 72 insertions(+), 72 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45136ce7185e..ed8e4f8c9cf0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2300,13 +2300,10 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
 	kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
 
 	/* we verify if the enable bit is set... */
-	if (system_time & 1) {
-		kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
-				 KVM_HOST_USES_PFN, system_time & ~1ULL,
-				 sizeof(struct pvclock_vcpu_time_info));
-	} else {
-		kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
-	}
+	if (system_time & 1)
+		kvm_gpc_activate(&vcpu->arch.pv_time, system_time & ~1ULL);
+	else
+		kvm_gpc_deactivate(&vcpu->arch.pv_time);
 
 	return;
 }
@@ -3374,7 +3371,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
 
 static void kvmclock_reset(struct kvm_vcpu *vcpu)
 {
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
+	kvm_gpc_deactivate(&vcpu->arch.pv_time);
 	vcpu->arch.time = 0;
 }
 
@@ -11551,7 +11548,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.regs_avail = ~0;
 	vcpu->arch.regs_dirty = ~0;
 
-	kvm_gpc_init(&vcpu->arch.pv_time);
+	kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN,
+		     sizeof(struct pvclock_vcpu_time_info));
 
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 361f77dc7a3d..9b4b0e6e66e5 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -42,13 +42,12 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
 	int idx = srcu_read_lock(&kvm->srcu);
 
 	if (gfn == GPA_INVALID) {
-		kvm_gpc_deactivate(kvm, gpc);
+		kvm_gpc_deactivate(gpc);
 		goto out;
 	}
 
 	do {
-		ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa,
-				       PAGE_SIZE);
+		ret = kvm_gpc_activate(gpc, gpa);
 		if (ret)
 			goto out;
 
@@ -553,15 +552,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			     offsetof(struct compat_vcpu_info, time));
 
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+			kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gpc_activate(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_info_cache, NULL,
-				     KVM_HOST_USES_PFN, data->u.gpa,
-				     sizeof(struct vcpu_info));
+		r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache,
+				     data->u.gpa);
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -569,16 +566,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gpc_deactivate(vcpu->kvm,
-					   &vcpu->arch.xen.vcpu_time_info_cache);
+			kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gpc_activate(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_time_info_cache,
-				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
-				     sizeof(struct pvclock_vcpu_time_info));
+		r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_time_info_cache,
+				     data->u.gpa);
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 		break;
@@ -589,15 +583,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			break;
 		}
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gpc_deactivate(vcpu->kvm,
-					   &vcpu->arch.xen.runstate_cache);
+			kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache,
-				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
-				     sizeof(struct vcpu_runstate_info));
+		r = kvm_gpc_activate(&vcpu->arch.xen.runstate_cache,
+				     data->u.gpa);
 		break;
 
 	case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT:
@@ -1818,9 +1810,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
 
 	timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
 
-	kvm_gpc_init(&vcpu->arch.xen.runstate_cache);
-	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache);
-	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL,
+		     KVM_HOST_USES_PFN, sizeof(struct vcpu_runstate_info));
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL,
+		     KVM_HOST_USES_PFN, sizeof(struct vcpu_info));
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache, vcpu->kvm, NULL,
+		     KVM_HOST_USES_PFN, sizeof(struct pvclock_vcpu_time_info));
 }
 
 void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
@@ -1828,9 +1823,9 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 	if (kvm_xen_timer_enabled(vcpu))
 		kvm_xen_stop_timer(vcpu);
 
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache);
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache);
+	kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache);
 
 	del_timer_sync(&vcpu->arch.xen.poll_timer);
 }
@@ -1838,7 +1833,8 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 void kvm_xen_init_vm(struct kvm *kvm)
 {
 	idr_init(&kvm->arch.xen.evtchn_ports);
-	kvm_gpc_init(&kvm->arch.xen.shinfo_cache);
+	kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL, KVM_HOST_USES_PFN,
+		     PAGE_SIZE);
 }
 
 void kvm_xen_destroy_vm(struct kvm *kvm)
@@ -1846,7 +1842,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm)
 	struct evtchnfd *evtchnfd;
 	int i;
 
-	kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache);
+	kvm_gpc_deactivate(&kvm->arch.xen.shinfo_cache);
 
 	idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) {
 		if (!evtchnfd->deliver.port.port)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f687e56c24bc..024b8df5302c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1244,17 +1244,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
  * kvm_gpc_init - initialize gfn_to_pfn_cache.
  *
  * @gpc:	   struct gfn_to_pfn_cache object.
- *
- * This sets up a gfn_to_pfn_cache by initializing locks.
- */
-void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
-
-/**
- * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
- *                    physical address.
- *
  * @kvm:	   pointer to kvm instance.
- * @gpc:	   struct gfn_to_pfn_cache object.
  * @vcpu:	   vCPU to be used for marking pages dirty and to be woken on
  *		   invalidation.
  * @usage:	   indicates if the resulting host physical PFN is used while
@@ -1263,20 +1253,31 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
  *		   changes!---will also force @vcpu to exit the guest and
  *		   refresh the cache); and/or if the PFN used directly
  *		   by KVM (and thus needs a kernel virtual mapping).
- * @gpa:	   guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
  *
+ * This sets up a gfn_to_pfn_cache by initializing locks and assigning the
+ * immutable attributes.
+ */
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
+		  struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		  unsigned long len);
+
+/**
+ * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
+ *                    physical address.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ * @gpa:	   guest physical address to map.
+ *
  * @return:	   0 for success.
  *		   -EINVAL for a mapping which would cross a page boundary.
- *                 -EFAULT for an untranslatable guest physical address.
+ *		   -EFAULT for an untranslatable guest physical address.
  *
- * This primes a gfn_to_pfn_cache and links it into the @kvm's list for
+ * This primes a gfn_to_pfn_cache and links it into the @gpc->kvm's list for
  * invalidations to be processed.  Callers are required to use kvm_gpc_check()
  * to ensure that the cache is valid before accessing the target page.
  */
-int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-		     gpa_t gpa, unsigned long len);
+int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa);
 
 /**
  * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
@@ -1335,13 +1336,12 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  *
- * This removes a cache from the @kvm's list to be processed on MMU notifier
- * invocation.
+ * This removes a cache from the @gpc->kvm's list to be processed on MMU
+ * notifier invocation.
  */
-void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc);
 
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 3ca3db020e0e..d66b276d29e0 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -67,12 +67,14 @@ struct gfn_to_pfn_cache {
 	gpa_t gpa;
 	unsigned long uhva;
 	struct kvm_memory_slot *memslot;
+	struct kvm *kvm;
 	struct kvm_vcpu *vcpu;
 	struct list_head list;
 	rwlock_t lock;
 	struct mutex refresh_lock;
 	void *khva;
 	kvm_pfn_t pfn;
+	unsigned long len;
 	enum pfn_cache_usage usage;
 	bool active;
 	bool valid;
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 76f1b669cf28..56ca0e9c6ed7 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -346,44 +346,48 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_unmap);
 
-void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
+		  struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		  unsigned long len)
 {
+	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
+	WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu);
+
 	rwlock_init(&gpc->lock);
 	mutex_init(&gpc->refresh_lock);
+
+	gpc->kvm = kvm;
+	gpc->vcpu = vcpu;
+	gpc->usage = usage;
+	gpc->len = len;
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_init);
 
-int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-		     gpa_t gpa, unsigned long len)
+int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 {
-	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
-
 	if (!gpc->active) {
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
-		gpc->vcpu = vcpu;
-		gpc->usage = usage;
 		gpc->valid = false;
 		gpc->active = true;
 
-		spin_lock(&kvm->gpc_lock);
-		list_add(&gpc->list, &kvm->gpc_list);
-		spin_unlock(&kvm->gpc_lock);
+		spin_lock(&gpc->kvm->gpc_lock);
+		list_add(&gpc->list, &gpc->kvm->gpc_list);
+		spin_unlock(&gpc->kvm->gpc_lock);
 	}
-	return kvm_gpc_refresh(kvm, gpc, gpa, len);
+	return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
-void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
-		spin_lock(&kvm->gpc_lock);
+		spin_lock(&gpc->kvm->gpc_lock);
 		list_del(&gpc->list);
-		spin_unlock(&kvm->gpc_lock);
+		spin_unlock(&gpc->kvm->gpc_lock);
 
-		kvm_gpc_unmap(kvm, gpc);
+		kvm_gpc_unmap(gpc->kvm, gpc);
 		gpc->active = false;
 	}
 }
-- 
2.37.3


* [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check()
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (3 preceding siblings ...)
  2022-09-21  2:01       ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make kvm_gpc_check() use the kvm instance cached in gfn_to_pfn_cache.
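
From a caller's point of view, the validity check no longer needs a
struct kvm on hand; a minimal sketch, mirroring the xen.c hunks below:

        read_lock_irqsave(&gpc->lock, flags);
        if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE))
                goto out_rcu;   /* cache invalid, caller must refresh */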

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  2 +-
 arch/x86/kvm/xen.c       | 14 ++++++--------
 include/linux/kvm_host.h |  4 +---
 virt/kvm/pfncache.c      |  5 ++---
 4 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ed8e4f8c9cf0..273f1ed3b5ae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3017,7 +3017,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 	unsigned long flags;
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+	while (!kvm_gpc_check(gpc, gpc->gpa,
 			      offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 9b4b0e6e66e5..84b95258773a 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -217,7 +217,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 		user_len = sizeof(struct compat_vcpu_runstate_info);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) {
+	while (!kvm_gpc_check(gpc, gpc->gpa, user_len)) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/* When invoked from kvm_sched_out() we cannot sleep */
@@ -350,8 +350,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	 * little more honest about it.
 	 */
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
-			      sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
@@ -415,8 +414,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		     sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending));
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
-			      sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/*
@@ -957,7 +955,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 
 	read_lock_irqsave(&gpc->lock, flags);
 	idx = srcu_read_lock(&kvm->srcu);
-	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	ret = false;
@@ -1349,7 +1347,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	idx = srcu_read_lock(&kvm->srcu);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
@@ -1383,7 +1381,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		gpc = &vcpu->arch.xen.vcpu_info_cache;
 
 		read_lock_irqsave(&gpc->lock, flags);
-		if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) {
+		if (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * Could not access the vcpu_info. Set the bit in-kernel
 			 * and prod the vCPU to deliver it for itself.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 024b8df5302c..0b66d4889ec3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1282,7 +1282,6 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa);
 /**
  * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  * @gpa:	   current guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
@@ -1297,8 +1296,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa);
  * Callers in IN_GUEST_MODE may do so without locking, although they should
  * still hold a read lock on kvm->scru for the memslot checks.
  */
-bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		   unsigned long len);
+bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len);
 
 /**
  * kvm_gpc_refresh - update a previously initialized cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 56ca0e9c6ed7..eb91025d7242 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -76,10 +76,9 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 	}
 }
 
-bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		   unsigned long len)
+bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 {
-	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memslots *slots = kvm_memslots(gpc->kvm);
 
 	if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE)
 		return false;
-- 
2.37.3


* [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry()
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (4 preceding siblings ...)
  2022-09-21  2:01       ` [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
  2022-09-21  2:01       ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make hva_to_pfn_retry() use the kvm instance cached in gfn_to_pfn_cache.
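
For context, hva_to_pfn_retry() follows the usual MMU-notifier retry
pattern; a simplified sketch with the translation step elided (see the
hunks below for the real code):

        do {
                mmu_seq = gpc->kvm->mmu_invalidate_seq;
                smp_rmb();

                /* Drop gpc->lock, translate HVA to PFN, retake the lock. */
        } while (mmu_notifier_retry_cache(gpc->kvm, mmu_seq));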

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 virt/kvm/pfncache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index eb91025d7242..a2c95e393e34 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -135,7 +135,7 @@ static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_s
 	return kvm->mmu_invalidate_seq != mmu_seq;
 }
 
-static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 {
 	/* Note, the new page offset may be different than the old! */
 	void *old_khva = gpc->khva - offset_in_page(gpc->khva);
@@ -155,7 +155,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 	gpc->valid = false;
 
 	do {
-		mmu_seq = kvm->mmu_invalidate_seq;
+		mmu_seq = gpc->kvm->mmu_invalidate_seq;
 		smp_rmb();
 
 		write_unlock_irq(&gpc->lock);
@@ -213,7 +213,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		 * attempting to refresh.
 		 */
 		WARN_ON_ONCE(gpc->valid);
-	} while (mmu_notifier_retry_cache(kvm, mmu_seq));
+	} while (mmu_notifier_retry_cache(gpc->kvm, mmu_seq));
 
 	gpc->valid = true;
 	gpc->pfn = new_pfn;
@@ -285,7 +285,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 	 * drop the lock and do the HVA to PFN lookup again.
 	 */
 	if (!gpc->valid || old_uhva != gpc->uhva) {
-		ret = hva_to_pfn_retry(kvm, gpc);
+		ret = hva_to_pfn_retry(gpc);
 	} else {
 		/* If the HVA→PFN mapping was already valid, don't unmap it. */
 		old_pfn = KVM_PFN_ERR_FAULT;
-- 
2.37.3


* [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap()
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (5 preceding siblings ...)
  2022-09-21  2:01       ` [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make kvm_gpc_refresh() use the kvm instance cached in gfn_to_pfn_cache.

The first argument of kvm_gpc_unmap() becomes unneeded; remove it from
the function definition.
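
Taken together with the earlier patches, the common check/refresh loop
now reads as follows (illustrative sketch only, modelled on the xen.c
hunks below):

        read_lock_irqsave(&gpc->lock, flags);
        while (!kvm_gpc_check(gpc, gpc->gpa, user_len)) {
                read_unlock_irqrestore(&gpc->lock, flags);

                if (kvm_gpc_refresh(gpc, gpc->gpa, user_len))
                        return;

                read_lock_irqsave(&gpc->lock, flags);
        }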

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  2 +-
 arch/x86/kvm/xen.c       | 10 ++++------
 include/linux/kvm_host.h | 10 ++++------
 virt/kvm/pfncache.c      | 11 +++++------
 4 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 273f1ed3b5ae..d24d1731d2bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3021,7 +3021,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 			      offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+		if (kvm_gpc_refresh(gpc, gpc->gpa,
 				    offset + sizeof(*guest_hv_clock)))
 			return;
 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 84b95258773a..bcaaa83290fb 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -224,7 +224,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 		if (state == RUNSTATE_runnable)
 			return;
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len))
+		if (kvm_gpc_refresh(gpc, gpc->gpa, user_len))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -353,8 +353,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
-				    sizeof(struct vcpu_info)))
+		if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -428,8 +427,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		if (in_atomic() || !task_is_running(current))
 			return 1;
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
-				    sizeof(struct vcpu_info))) {
+		if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * If this failed, userspace has screwed up the
 			 * vcpu_info mapping. No interrupts for you.
@@ -1479,7 +1477,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 			break;
 
 		idx = srcu_read_lock(&kvm->srcu);
-		rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
+		rc = kvm_gpc_refresh(gpc, gpc->gpa, PAGE_SIZE);
 		srcu_read_unlock(&kvm->srcu, idx);
 	} while(!rc);
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0b66d4889ec3..efead11bec84 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1301,35 +1301,33 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len);
 /**
  * kvm_gpc_refresh - update a previously initialized cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  * @gpa:	   updated guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
  *
  * @return:	   0 for success.
  *		   -EINVAL for a mapping which would cross a page boundary.
- *                 -EFAULT for an untranslatable guest physical address.
+ *		   -EFAULT for an untranslatable guest physical address.
  *
  * This will attempt to refresh a gfn_to_pfn_cache. Note that a successful
- * returm from this function does not mean the page can be immediately
+ * return from this function does not mean the page can be immediately
  * accessed because it may have raced with an invalidation. Callers must
  * still lock and check the cache status, as this function does not return
  * with the lock still held to permit access.
  */
-int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 		    unsigned long len);
 
 /**
  * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  *
  * This unmaps the referenced page. The cache is left in the invalid state
  * but at least the mapping from GPA to userspace HVA will remain cached
  * and can be reused on a subsequent refresh.
  */
-void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc);
 
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index a2c95e393e34..45b9b96c0ea3 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -234,10 +234,9 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 	return -EFAULT;
 }
 
-int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		    unsigned long len)
+int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 {
-	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memslots *slots = kvm_memslots(gpc->kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
 	kvm_pfn_t old_pfn, new_pfn;
 	unsigned long old_uhva;
@@ -318,7 +317,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_refresh);
 
-void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc)
 {
 	void *old_khva;
 	kvm_pfn_t old_pfn;
@@ -375,7 +374,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
 		spin_unlock(&gpc->kvm->gpc_lock);
 	}
-	return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len);
+	return kvm_gpc_refresh(gpc, gpa, gpc->len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
@@ -386,7 +385,7 @@ void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 		list_del(&gpc->list);
 		spin_unlock(&gpc->kvm->gpc_lock);
 
-		kvm_gpc_unmap(gpc->kvm, gpc);
+		kvm_gpc_unmap(gpc);
 		gpc->active = false;
 	}
 }
-- 
2.37.3


* [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (6 preceding siblings ...)
  2022-09-21  2:01       ` [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-10-10 23:28         ` Sean Christopherson
  7 siblings, 1 reply; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate()
resulting in a near-NULL pointer write.

1. Deactivate shinfo cache:

kvm_xen_hvm_set_attr
case KVM_XEN_ATTR_TYPE_SHARED_INFO
 kvm_gpc_deactivate
  kvm_gpc_unmap
   gpc->valid = false
   gpc->khva = NULL
  gpc->active = false

Result: active = false, valid = false

2. Cause cache refresh:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    kvm_gpc_check
    return -EWOULDBLOCK because !gpc->valid
   kvm_xen_set_evtchn_fast
    return -EWOULDBLOCK
   kvm_gpc_refresh
    hva_to_pfn_retry
     gpc->valid = true
     gpc->khva = not NULL

Result: active = false, valid = true

3. Race ioctl KVM_XEN_HVM_EVTCHN_SEND against ioctl
KVM_XEN_ATTR_TYPE_SHARED_INFO:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    read_lock gpc->lock
                                          kvm_xen_hvm_set_attr case
                                          KVM_XEN_ATTR_TYPE_SHARED_INFO
                                           mutex_lock kvm->lock
                                           kvm_xen_shared_info_init
                                            kvm_gpc_activate
                                             gpc->khva = NULL
    kvm_gpc_check
     [ Check passes because gpc->valid is
       still true, even though gpc->khva
       is already NULL. ]
    shinfo = gpc->khva
    pending_bits = shinfo->evtchn_pending
    CRASH: test_and_set_bit(..., pending_bits)

Protect the writes to the cache properties in kvm_gpc_activate() by
taking gpc->lock for writing.
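
With the fix applied, activation looks roughly like this (annotated
sketch of the hunk below; readers such as kvm_xen_set_evtchn_fast() hold
gpc->lock for read, so they can no longer observe the half-initialized
state):

        if (!gpc->active) {
                /* Excludes concurrent kvm_gpc_check() readers. */
                write_lock_irq(&gpc->lock);
                gpc->khva = NULL;
                gpc->pfn = KVM_PFN_ERR_FAULT;
                gpc->uhva = KVM_HVA_ERR_BAD;
                /* A reader must never see khva reset while valid is still true. */
                gpc->valid = false;
                gpc->active = true;
                write_unlock_irq(&gpc->lock);
                /* ... */
        }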

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Attaching more details:

[   86.127703] BUG: kernel NULL pointer dereference, address: 0000000000000800
[   86.127751] #PF: supervisor write access in kernel mode
[   86.127778] #PF: error_code(0x0002) - not-present page
[   86.127801] PGD 105792067 P4D 105792067 PUD 105793067 PMD 0
[   86.127826] Oops: 0002 [#1] PREEMPT SMP NOPTI
[   86.127850] CPU: 0 PID: 945 Comm: xen_shinfo_test Not tainted 6.0.0-rc5-test+ #31
[   86.127874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[   86.127898] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm
[ 86.127960] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06
All code
========
   0:	5a                   	pop    %rdx
   1:	84 c0                	test   %al,%al
   3:	0f 84 01 01 00 00    	je     0x10a
   9:	80 bb b4 9f 00 00 00 	cmpb   $0x0,0x9fb4(%rbx)
  10:	8b 45 00             	mov    0x0(%rbp),%eax
  13:	48 8b 93 c8 a0 00 00 	mov    0xa0c8(%rbx),%rdx
  1a:	75 4d                	jne    0x69
  1c:	41 89 c0             	mov    %eax,%r8d
  1f:	48 8d b2 80 08 00 00 	lea    0x880(%rdx),%rsi
  26:	41 c1 e8 05          	shr    $0x5,%r8d
  2a:*	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)		<-- trapping instruction
  31:	00 00
  33:	0f 82 19 01 00 00    	jb     0x152
  39:	8b 45 00             	mov    0x0(%rbp),%eax
  3c:	48 0f a3 06          	bt     %rax,(%rsi)

Code starting with the faulting instruction
===========================================
   0:	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)
   7:	00 00
   9:	0f 82 19 01 00 00    	jb     0x128
   f:	8b 45 00             	mov    0x0(%rbp),%eax
  12:	48 0f a3 06          	bt     %rax,(%rsi)
[   86.127982] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046
[   86.128001] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001
[   86.128021] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66
[   86.128040] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97
[   86.128060] R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001
[   86.128079] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830
[   86.128098] FS:  00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
[   86.128118] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   86.128138] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0
[   86.128158] PKRU: 55555554
[   86.128177] Call Trace:
[   86.128196]  <TASK>
[   86.128215] kvm_xen_hvm_evtchn_send (arch/x86/kvm/xen.c:1432 arch/x86/kvm/xen.c:1562) kvm
[   86.128256] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm
[   86.128294] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007)
[   86.128315] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007)
[   86.128335] ? kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm
[   86.128368] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm
[   86.128401] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710)
[   86.128422] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[   86.128442] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[   86.128462] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[   86.128482] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128501] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128520] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[   86.128539] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128558] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128577] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128596] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128615] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128634] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128653] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[   86.128673] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[   86.128692] RIP: 0033:0x7f71d6152c6b
[ 86.128712] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
All code
========
   0:	73 01                	jae    0x3
   2:	c3                   	ret
   3:	48 8b 0d b5 b1 1b 00 	mov    0x1bb1b5(%rip),%rcx        # 0x1bb1bf
   a:	f7 d8                	neg    %eax
   c:	64 89 01             	mov    %eax,%fs:(%rcx)
   f:	48 83 c8 ff          	or     $0xffffffffffffffff,%rax
  13:	c3                   	ret
  14:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  1b:	00 00 00
  1e:	90                   	nop
  1f:	f3 0f 1e fa          	endbr64
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb1bf
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	ret
   9:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb195
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
[   86.128735] RSP: 002b:00007fff7c716c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   86.128755] RAX: ffffffffffffffda RBX: 00007f71d61116c0 RCX: 00007f71d6152c6b
[   86.128775] RDX: 00007fff7c716f00 RSI: 00000000400caed0 RDI: 0000000000000004
[   86.128794] RBP: 0000000001f2b2a0 R08: 0000000000417343 R09: 00007f71d62cc341
[   86.128814] R10: 00007fff7c74c258 R11: 0000000000000246 R12: 00007f71d6329000
[   86.128833] R13: 00000000632870b9 R14: 0000000000000001 R15: 00007f71d632a020
[   86.128854]  </TASK>
[   86.128873] Modules linked in: kvm_intel 9p fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr sunrpc intel_rapl_msr intel_rapl_common kvm irqbypass rapl pcspkr 9pnet_virtio i2c_piix4 9pnet drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_console serio_raw ata_generic virtio_blk pata_acpi qemu_fw_cfg ipmi_devintf ipmi_msghandler fuse
[   86.128893] Unloaded tainted modules: kvm_intel():1 [last unloaded: kvm_intel]
[   86.128944] CR2: 0000000000000800
[   86.128962] ---[ end trace 0000000000000000 ]---
[   86.128982] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm
[ 86.129022] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06
All code
========
   0:	5a                   	pop    %rdx
   1:	84 c0                	test   %al,%al
   3:	0f 84 01 01 00 00    	je     0x10a
   9:	80 bb b4 9f 00 00 00 	cmpb   $0x0,0x9fb4(%rbx)
  10:	8b 45 00             	mov    0x0(%rbp),%eax
  13:	48 8b 93 c8 a0 00 00 	mov    0xa0c8(%rbx),%rdx
  1a:	75 4d                	jne    0x69
  1c:	41 89 c0             	mov    %eax,%r8d
  1f:	48 8d b2 80 08 00 00 	lea    0x880(%rdx),%rsi
  26:	41 c1 e8 05          	shr    $0x5,%r8d
  2a:*	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)		<-- trapping instruction
  31:	00 00
  33:	0f 82 19 01 00 00    	jb     0x152
  39:	8b 45 00             	mov    0x0(%rbp),%eax
  3c:	48 0f a3 06          	bt     %rax,(%rsi)

Code starting with the faulting instruction
===========================================
   0:	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)
   7:	00 00
   9:	0f 82 19 01 00 00    	jb     0x128
   f:	8b 45 00             	mov    0x0(%rbp),%eax
  12:	48 0f a3 06          	bt     %rax,(%rsi)
[   86.129044] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046
[   86.129064] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001
[   86.129083] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66
[   86.129103] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97
[   86.129122] R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001
[   86.129142] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830
[   86.129161] FS:  00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
[   86.129181] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   86.129201] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0
[   86.129221] PKRU: 55555554
[   86.129240] note: xen_shinfo_test[945] exited with preempt_count 1
[  151.131754] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  151.131793] rcu: 	1-...0: (13 ticks this GP) idle=785c/1/0x4000000000000000 softirq=4758/4760 fqs=16038
[  151.131816] 	(detected by 2, t=65002 jiffies, g=6449, q=1299 ncpus=4)
[  151.131837] Sending NMI from CPU 2 to CPUs 1:
[  151.131862] NMI backtrace for cpu 1
[  151.131864] CPU: 1 PID: 949 Comm: xen_shinfo_test Tainted: G      D            6.0.0-rc5-test+ #31
[  151.131866] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[  151.131866] RIP: 0010:queued_write_lock_slowpath (kernel/locking/qrwlock.c:85)
[ 151.131870] Code: ff 90 48 89 df 5b 5d e9 86 fd ff ff f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 8b 03 <3d> 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 48 89 ef
All code
========
   0:	ff 90 48 89 df 5b    	call   *0x5bdf8948(%rax)
   6:	5d                   	pop    %rbp
   7:	e9 86 fd ff ff       	jmp    0xfffffffffffffd92
   c:	f0 81 0b 00 01 00 00 	lock orl $0x100,(%rbx)
  13:	ba ff 00 00 00       	mov    $0xff,%edx
  18:	b9 00 01 00 00       	mov    $0x100,%ecx
  1d:	8b 03                	mov    (%rbx),%eax
  1f:	3d 00 01 00 00       	cmp    $0x100,%eax
  24:	74 0b                	je     0x31
  26:	f3 90                	pause
  28:	8b 03                	mov    (%rbx),%eax
  2a:*	3d 00 01 00 00       	cmp    $0x100,%eax		<-- trapping instruction
  2f:	75 f5                	jne    0x26
  31:	89 c8                	mov    %ecx,%eax
  33:	f0 0f b1 13          	lock cmpxchg %edx,(%rbx)
  37:	74 c0                	je     0xfffffffffffffff9
  39:	eb e2                	jmp    0x1d
  3b:	89 c6                	mov    %eax,%esi
  3d:	48 89 ef             	mov    %rbp,%rdi

Code starting with the faulting instruction
===========================================
   0:	3d 00 01 00 00       	cmp    $0x100,%eax
   5:	75 f5                	jne    0xfffffffffffffffc
   7:	89 c8                	mov    %ecx,%eax
   9:	f0 0f b1 13          	lock cmpxchg %edx,(%rbx)
   d:	74 c0                	je     0xffffffffffffffcf
   f:	eb e2                	jmp    0xfffffffffffffff3
  11:	89 c6                	mov    %eax,%esi
  13:	48 89 ef             	mov    %rbp,%rdi
[  151.131871] RSP: 0018:ffffc900011f7b98 EFLAGS: 00000006
[  151.131872] RAX: 0000000000000300 RBX: ffffc90001372ff8 RCX: 0000000000000100
[  151.131872] RDX: 00000000000000ff RSI: ffffffff827ea0a3 RDI: ffffffff82886a66
[  151.131873] RBP: ffffc90001372ffc R08: ffff888107864160 R09: 00000000777cbcf6
[  151.131873] R10: ffff888107863300 R11: 000000006777cbcf R12: ffffc90001372ff8
[  151.131874] R13: ffffc90001369000 R14: ffffc90001372fb8 R15: ffffc90001369170
[  151.131874] FS:  00007f71d5f09640(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000
[  151.131875] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  151.131876] CR2: 00007f71d5f08ef8 CR3: 0000000104774002 CR4: 0000000000772ee0
[  151.131877] PKRU: 55555554
[  151.131878] Call Trace:
[  151.131879]  <TASK>
[  151.131880] do_raw_write_lock (./include/asm-generic/qrwlock.h:101 kernel/locking/spinlock_debug.c:210)
[  151.131883] kvm_gpc_refresh (arch/x86/kvm/../../../virt/kvm/pfncache.c:262) kvm
[  151.131907] ? kvm_gpc_activate (./include/linux/spinlock.h:390 arch/x86/kvm/../../../virt/kvm/pfncache.c:375) kvm
[  151.131925] kvm_xen_hvm_set_attr (arch/x86/kvm/xen.c:51 arch/x86/kvm/xen.c:464) kvm
[  151.131950] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm
[  151.131971] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007)
[  151.131973] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm
[  151.131991] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710)
[  151.131993] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[  151.131995] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[  151.131997] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[  151.131998] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710)
[  151.132000] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132000] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[  151.132002] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132003] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132003] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[  151.132004] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132005] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132006] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132007] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[  151.132008] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[  151.132009] RIP: 0033:0x7f71d6152c6b
[ 151.132011] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
All code
========
   0:	73 01                	jae    0x3
   2:	c3                   	ret
   3:	48 8b 0d b5 b1 1b 00 	mov    0x1bb1b5(%rip),%rcx        # 0x1bb1bf
   a:	f7 d8                	neg    %eax
   c:	64 89 01             	mov    %eax,%fs:(%rcx)
   f:	48 83 c8 ff          	or     $0xffffffffffffffff,%rax
  13:	c3                   	ret
  14:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  1b:	00 00 00
  1e:	90                   	nop
  1f:	f3 0f 1e fa          	endbr64
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb1bf
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	ret
   9:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb195
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
[  151.132011] RSP: 002b:00007f71d5f08da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  151.132012] RAX: ffffffffffffffda RBX: 0000000001f2b2a0 RCX: 00007f71d6152c6b
[  151.132013] RDX: 00007f71d5f08db0 RSI: 000000004048aec9 RDI: 0000000000000004
[  151.132013] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fff7c716b3f
[  151.132013] R10: 00007f71d611c948 R11: 0000000000000246 R12: 00007f71d5f09640
[  151.132014] R13: 0000000000000004 R14: 00007f71d61b3550 R15: 0000000000000000
[  151.132016]  </TASK>

 virt/kvm/pfncache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 45b9b96c0ea3..e987669c3506 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -364,11 +364,13 @@ EXPORT_SYMBOL_GPL(kvm_gpc_init);
 int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 {
 	if (!gpc->active) {
+		write_lock_irq(&gpc->lock);
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
 		gpc->valid = false;
 		gpc->active = true;
+		write_unlock_irq(&gpc->lock);
 
 		spin_lock(&gpc->kvm->gpc_lock);
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
-- 
2.37.3


* [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix
  2022-09-16 17:12   ` Sean Christopherson
  2022-09-18 23:13     ` Michal Luczaj
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
@ 2022-10-05 12:30     ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
                         ` (7 more replies)
  2 siblings, 8 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Politely resending after two weeks :)

The series is marked v2, but there are no functional changes; I've only
tweaked the patch 4/8 commit message.

I've also made sure there are no merge conflicts with the 6.0 release.

Please let me know if this needs any corrections and/or should wait.

Thanks,
Michal

Michal Luczaj (8):
  KVM: x86: Add initializer for gfn_to_pfn_cache
  KVM: x86: Shorten gfn_to_pfn_cache function names
  KVM: x86: Remove unused argument in gpc_unmap_khva()
  KVM: x86: Store immutable gfn_to_pfn_cache properties
  KVM: x86: Clean up kvm_gpc_check()
  KVM: x86: Clean up hva_to_pfn_retry()
  KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap()
  KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()

 arch/x86/kvm/x86.c        | 24 ++++++------
 arch/x86/kvm/xen.c        | 78 +++++++++++++++++--------------------
 include/linux/kvm_host.h  | 64 ++++++++++++++++---------------
 include/linux/kvm_types.h |  2 +
 virt/kvm/pfncache.c       | 81 +++++++++++++++++++++------------------
 5 files changed, 128 insertions(+), 121 deletions(-)

-- 
2.37.3


* [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Move the gfn_to_pfn_cache lock initialization to another helper and
call the new helper during VM/vCPU creation.

Rename "cache_init" and "cache_destroy" to activate+deactivate to
avoid implying that the cache really is destroyed/freed.
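
The intended lifecycle, as a sketch (init exactly once at creation time;
activate/deactivate possibly many times while the VM runs; gpa and len
here are placeholders):

        /* At VM/vCPU creation: set up the locks only. */
        kvm_gpc_init(&vcpu->arch.pv_time);

        /* At runtime, repeatedly: */
        kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
                         KVM_HOST_USES_PFN, gpa, len);
        /* ... */
        kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);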

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       | 12 +++++----
 arch/x86/kvm/xen.c       | 57 +++++++++++++++++++++-------------------
 include/linux/kvm_host.h | 23 +++++++++++-----
 virt/kvm/pfncache.c      | 21 ++++++++-------
 4 files changed, 65 insertions(+), 48 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 43a6a7efc6ec..15032b7f0589 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2301,11 +2301,11 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
 
 	/* we verify if the enable bit is set... */
 	if (system_time & 1) {
-		kvm_gfn_to_pfn_cache_init(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
-					  KVM_HOST_USES_PFN, system_time & ~1ULL,
-					  sizeof(struct pvclock_vcpu_time_info));
+		kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
+				 KVM_HOST_USES_PFN, system_time & ~1ULL,
+				 sizeof(struct pvclock_vcpu_time_info));
 	} else {
-		kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time);
+		kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
 	}
 
 	return;
@@ -3374,7 +3374,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
 
 static void kvmclock_reset(struct kvm_vcpu *vcpu)
 {
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
 	vcpu->arch.time = 0;
 }
 
@@ -11551,6 +11551,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.regs_avail = ~0;
 	vcpu->arch.regs_dirty = ~0;
 
+	kvm_gpc_init(&vcpu->arch.pv_time);
+
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 	else
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 280cb5dc7341..cecf8299b187 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -42,13 +42,13 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
 	int idx = srcu_read_lock(&kvm->srcu);
 
 	if (gfn == GPA_INVALID) {
-		kvm_gfn_to_pfn_cache_destroy(kvm, gpc);
+		kvm_gpc_deactivate(kvm, gpc);
 		goto out;
 	}
 
 	do {
-		ret = kvm_gfn_to_pfn_cache_init(kvm, gpc, NULL, KVM_HOST_USES_PFN,
-						gpa, PAGE_SIZE);
+		ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa,
+				       PAGE_SIZE);
 		if (ret)
 			goto out;
 
@@ -554,15 +554,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			     offsetof(struct compat_vcpu_info, time));
 
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+			kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.vcpu_info_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct vcpu_info));
+		r = kvm_gpc_activate(vcpu->kvm,
+				     &vcpu->arch.xen.vcpu_info_cache, NULL,
+				     KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct vcpu_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -570,16 +570,16 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-						     &vcpu->arch.xen.vcpu_time_info_cache);
+			kvm_gpc_deactivate(vcpu->kvm,
+					   &vcpu->arch.xen.vcpu_time_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.vcpu_time_info_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct pvclock_vcpu_time_info));
+		r = kvm_gpc_activate(vcpu->kvm,
+				     &vcpu->arch.xen.vcpu_time_info_cache,
+				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct pvclock_vcpu_time_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 		break;
@@ -590,16 +590,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			break;
 		}
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-						     &vcpu->arch.xen.runstate_cache);
+			kvm_gpc_deactivate(vcpu->kvm,
+					   &vcpu->arch.xen.runstate_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.runstate_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct vcpu_runstate_info));
+		r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache,
+				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct vcpu_runstate_info));
 		break;
 
 	case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT:
@@ -1817,7 +1816,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.xen.vcpu_id = vcpu->vcpu_idx;
 	vcpu->arch.xen.poll_evtchn = 0;
+
 	timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
+
+	kvm_gpc_init(&vcpu->arch.xen.runstate_cache);
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache);
 }
 
 void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
@@ -1825,18 +1829,17 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 	if (kvm_xen_timer_enabled(vcpu))
 		kvm_xen_stop_timer(vcpu);
 
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.runstate_cache);
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_info_cache);
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache);
+
 	del_timer_sync(&vcpu->arch.xen.poll_timer);
 }
 
 void kvm_xen_init_vm(struct kvm *kvm)
 {
 	idr_init(&kvm->arch.xen.evtchn_ports);
+	kvm_gpc_init(&kvm->arch.xen.shinfo_cache);
 }
 
 void kvm_xen_destroy_vm(struct kvm *kvm)
@@ -1844,7 +1847,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm)
 	struct evtchnfd *evtchnfd;
 	int i;
 
-	kvm_gfn_to_pfn_cache_destroy(kvm, &kvm->arch.xen.shinfo_cache);
+	kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache);
 
 	idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) {
 		if (!evtchnfd->deliver.port.port)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4519d3689e1..9fd67026d91a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1241,8 +1241,17 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
 void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 /**
- * kvm_gfn_to_pfn_cache_init - prepare a cached kernel mapping and HPA for a
- *                             given guest physical address.
+ * kvm_gpc_init - initialize gfn_to_pfn_cache.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ *
+ * This sets up a gfn_to_pfn_cache by initializing locks.
+ */
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
+
+/**
+ * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
+ *                    physical address.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1266,9 +1275,9 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
  * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before
  * accessing the target page.
  */
-int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-			      struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-			      gpa_t gpa, unsigned long len);
+int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		     gpa_t gpa, unsigned long len);
 
 /**
  * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache.
@@ -1325,7 +1334,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 /**
- * kvm_gfn_to_pfn_cache_destroy - destroy and unlink a gfn_to_pfn_cache.
+ * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1333,7 +1342,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
  * This removes a cache from the @kvm's list to be processed on MMU notifier
  * invocation.
  */
-void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 68ff41d39545..08f97cf97264 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -346,17 +346,20 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 }
 EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
 
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
+{
+	rwlock_init(&gpc->lock);
+	mutex_init(&gpc->refresh_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_init);
 
-int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-			      struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-			      gpa_t gpa, unsigned long len)
+int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		     gpa_t gpa, unsigned long len)
 {
 	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
 
 	if (!gpc->active) {
-		rwlock_init(&gpc->lock);
-		mutex_init(&gpc->refresh_lock);
-
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
@@ -371,9 +374,9 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	}
 	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_init);
+EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
-void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
 		spin_lock(&kvm->gpc_lock);
@@ -384,4 +387,4 @@ void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		gpc->active = false;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_destroy);
+EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);
-- 
2.37.3


* [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Formalize "gpc" as the acronym and use it in function names.

No functional change intended.
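
For reference, the renames visible in the hunks below include:

        kvm_gfn_to_pfn_cache_check()   -> kvm_gpc_check()
        kvm_gfn_to_pfn_cache_refresh() -> kvm_gpc_refresh()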

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  8 ++++----
 arch/x86/kvm/xen.c       | 29 ++++++++++++++---------------
 include/linux/kvm_host.h | 21 ++++++++++-----------
 virt/kvm/pfncache.c      | 20 ++++++++++----------
 4 files changed, 38 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 15032b7f0589..45136ce7185e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3020,12 +3020,12 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 	unsigned long flags;
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   offset + sizeof(*guest_hv_clock))) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+			      offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa,
-						 offset + sizeof(*guest_hv_clock)))
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+				    offset + sizeof(*guest_hv_clock)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index cecf8299b187..361f77dc7a3d 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -218,15 +218,14 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 		user_len = sizeof(struct compat_vcpu_runstate_info);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   user_len)) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/* When invoked from kvm_sched_out() we cannot sleep */
 		if (state == RUNSTATE_runnable)
 			return;
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, user_len))
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -352,12 +351,12 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	 * little more honest about it.
 	 */
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+			      sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa,
-						 sizeof(struct vcpu_info)))
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+				    sizeof(struct vcpu_info)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -417,8 +416,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		     sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending));
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
-					   sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+			      sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/*
@@ -432,8 +431,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		if (in_atomic() || !task_is_running(current))
 			return 1;
 
-		if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa,
-						 sizeof(struct vcpu_info))) {
+		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+				    sizeof(struct vcpu_info))) {
 			/*
 			 * If this failed, userspace has screwed up the
 			 * vcpu_info mapping. No interrupts for you.
@@ -966,7 +965,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 
 	read_lock_irqsave(&gpc->lock, flags);
 	idx = srcu_read_lock(&kvm->srcu);
-	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	ret = false;
@@ -1358,7 +1357,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	idx = srcu_read_lock(&kvm->srcu);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
@@ -1392,7 +1391,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		gpc = &vcpu->arch.xen.vcpu_info_cache;
 
 		read_lock_irqsave(&gpc->lock, flags);
-		if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) {
+		if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * Could not access the vcpu_info. Set the bit in-kernel
 			 * and prod the vCPU to deliver it for itself.
@@ -1490,7 +1489,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 			break;
 
 		idx = srcu_read_lock(&kvm->srcu);
-		rc = kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
+		rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
 		srcu_read_unlock(&kvm->srcu, idx);
 	} while(!rc);
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9fd67026d91a..f687e56c24bc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1271,16 +1271,15 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
  *                 -EFAULT for an untranslatable guest physical address.
  *
  * This primes a gfn_to_pfn_cache and links it into the @kvm's list for
- * invalidations to be processed.  Callers are required to use
- * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before
- * accessing the target page.
+ * invalidations to be processed.  Callers are required to use kvm_gpc_check()
+ * to ensure that the cache is valid before accessing the target page.
  */
 int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
 		     gpa_t gpa, unsigned long len);
 
 /**
- * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache.
+ * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1297,11 +1296,11 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
  * Callers in IN_GUEST_MODE may do so without locking, although they should
  * still hold a read lock on kvm->scru for the memslot checks.
  */
-bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				gpa_t gpa, unsigned long len);
+bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		   unsigned long len);
 
 /**
- * kvm_gfn_to_pfn_cache_refresh - update a previously initialized cache.
+ * kvm_gpc_refresh - update a previously initialized cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1318,11 +1317,11 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
  * still lock and check the cache status, as this function does not return
  * with the lock still held to permit access.
  */
-int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				 gpa_t gpa, unsigned long len);
+int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		    unsigned long len);
 
 /**
- * kvm_gfn_to_pfn_cache_unmap - temporarily unmap a gfn_to_pfn_cache.
+ * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache.
  *
  * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1331,7 +1330,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
  * but at least the mapping from GPA to userspace HVA will remain cached
  * and can be reused on a subsequent refresh.
  */
-void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 08f97cf97264..cc65fab0dbef 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -76,8 +76,8 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 	}
 }
 
-bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				gpa_t gpa, unsigned long len)
+bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		   unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 
@@ -93,7 +93,7 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_check);
+EXPORT_SYMBOL_GPL(kvm_gpc_check);
 
 static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva)
 {
@@ -235,8 +235,8 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 	return -EFAULT;
 }
 
-int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-				 gpa_t gpa, unsigned long len)
+int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+		    unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
@@ -317,9 +317,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_refresh);
+EXPORT_SYMBOL_GPL(kvm_gpc_refresh);
 
-void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	void *old_khva;
 	kvm_pfn_t old_pfn;
@@ -344,7 +344,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 
 	gpc_unmap_khva(kvm, old_pfn, old_khva);
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
+EXPORT_SYMBOL_GPL(kvm_gpc_unmap);
 
 void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
 {
@@ -372,7 +372,7 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		list_add(&gpc->list, &kvm->gpc_list);
 		spin_unlock(&kvm->gpc_lock);
 	}
-	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
+	return kvm_gpc_refresh(kvm, gpc, gpa, len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
@@ -383,7 +383,7 @@ void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		list_del(&gpc->list);
 		spin_unlock(&kvm->gpc_lock);
 
-		kvm_gfn_to_pfn_cache_unmap(kvm, gpc);
+		kvm_gpc_unmap(kvm, gpc);
 		gpc->active = false;
 	}
 }
-- 
2.37.3


* [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva()
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

The first argument of gpc_unmap_khva() is never used; remove it.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 virt/kvm/pfncache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index cc65fab0dbef..76f1b669cf28 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -95,7 +95,7 @@ bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_check);
 
-static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva)
+static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva)
 {
 	/* Unmap the old pfn/page if it was mapped before. */
 	if (!is_error_noslot_pfn(pfn) && khva) {
@@ -174,7 +174,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 			 * the existing mapping and didn't create a new one.
 			 */
 			if (new_khva != old_khva)
-				gpc_unmap_khva(kvm, new_pfn, new_khva);
+				gpc_unmap_khva(new_pfn, new_khva);
 
 			kvm_release_pfn_clean(new_pfn);
 
@@ -313,7 +313,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 	mutex_unlock(&gpc->refresh_lock);
 
 	if (old_pfn != new_pfn)
-		gpc_unmap_khva(kvm, old_pfn, old_khva);
+		gpc_unmap_khva(old_pfn, old_khva);
 
 	return ret;
 }
@@ -342,7 +342,7 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 	write_unlock_irq(&gpc->lock);
 	mutex_unlock(&gpc->refresh_lock);
 
-	gpc_unmap_khva(kvm, old_pfn, old_khva);
+	gpc_unmap_khva(old_pfn, old_khva);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_unmap);
 
-- 
2.37.3


* [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (2 preceding siblings ...)
  2022-10-05 12:30       ` [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Move the assignment of the immutable properties @kvm, @vcpu, @usage and
@len to the initializer.  Make _activate() and _deactivate() use the
stored values.
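
As a rough sketch of the resulting flow, condensed from the
kvm_arch_vcpu_create() and kvm_write_system_time() hunks below (the
pv_time cache is just one of the converted users):

	/* vCPU creation: supply the immutable properties exactly once */
	kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN,
		     sizeof(struct pvclock_vcpu_time_info));

	/* guest (re)configures kvmclock: only the GPA varies */
	kvm_gpc_activate(&vcpu->arch.pv_time, system_time & ~1ULL);

	/* reset/teardown */
	kvm_gpc_deactivate(&vcpu->arch.pv_time);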

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c        | 16 ++++++-------
 arch/x86/kvm/xen.c        | 50 ++++++++++++++++++---------------------
 include/linux/kvm_host.h  | 40 +++++++++++++++----------------
 include/linux/kvm_types.h |  2 ++
 virt/kvm/pfncache.c       | 36 +++++++++++++++-------------
 5 files changed, 72 insertions(+), 72 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45136ce7185e..ed8e4f8c9cf0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2300,13 +2300,10 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
 	kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
 
 	/* we verify if the enable bit is set... */
-	if (system_time & 1) {
-		kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
-				 KVM_HOST_USES_PFN, system_time & ~1ULL,
-				 sizeof(struct pvclock_vcpu_time_info));
-	} else {
-		kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
-	}
+	if (system_time & 1)
+		kvm_gpc_activate(&vcpu->arch.pv_time, system_time & ~1ULL);
+	else
+		kvm_gpc_deactivate(&vcpu->arch.pv_time);
 
 	return;
 }
@@ -3374,7 +3371,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
 
 static void kvmclock_reset(struct kvm_vcpu *vcpu)
 {
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
+	kvm_gpc_deactivate(&vcpu->arch.pv_time);
 	vcpu->arch.time = 0;
 }
 
@@ -11551,7 +11548,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.regs_avail = ~0;
 	vcpu->arch.regs_dirty = ~0;
 
-	kvm_gpc_init(&vcpu->arch.pv_time);
+	kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN,
+		     sizeof(struct pvclock_vcpu_time_info));
 
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 361f77dc7a3d..9b4b0e6e66e5 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -42,13 +42,12 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
 	int idx = srcu_read_lock(&kvm->srcu);
 
 	if (gfn == GPA_INVALID) {
-		kvm_gpc_deactivate(kvm, gpc);
+		kvm_gpc_deactivate(gpc);
 		goto out;
 	}
 
 	do {
-		ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa,
-				       PAGE_SIZE);
+		ret = kvm_gpc_activate(gpc, gpa);
 		if (ret)
 			goto out;
 
@@ -553,15 +552,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			     offsetof(struct compat_vcpu_info, time));
 
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+			kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gpc_activate(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_info_cache, NULL,
-				     KVM_HOST_USES_PFN, data->u.gpa,
-				     sizeof(struct vcpu_info));
+		r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache,
+				     data->u.gpa);
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -569,16 +566,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gpc_deactivate(vcpu->kvm,
-					   &vcpu->arch.xen.vcpu_time_info_cache);
+			kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gpc_activate(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_time_info_cache,
-				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
-				     sizeof(struct pvclock_vcpu_time_info));
+		r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_time_info_cache,
+				     data->u.gpa);
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 		break;
@@ -589,15 +583,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			break;
 		}
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gpc_deactivate(vcpu->kvm,
-					   &vcpu->arch.xen.runstate_cache);
+			kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache,
-				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
-				     sizeof(struct vcpu_runstate_info));
+		r = kvm_gpc_activate(&vcpu->arch.xen.runstate_cache,
+				     data->u.gpa);
 		break;
 
 	case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT:
@@ -1818,9 +1810,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
 
 	timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
 
-	kvm_gpc_init(&vcpu->arch.xen.runstate_cache);
-	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache);
-	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL,
+		     KVM_HOST_USES_PFN, sizeof(struct vcpu_runstate_info));
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL,
+		     KVM_HOST_USES_PFN, sizeof(struct vcpu_info));
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache, vcpu->kvm, NULL,
+		     KVM_HOST_USES_PFN, sizeof(struct pvclock_vcpu_time_info));
 }
 
 void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
@@ -1828,9 +1823,9 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 	if (kvm_xen_timer_enabled(vcpu))
 		kvm_xen_stop_timer(vcpu);
 
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache);
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
-	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache);
+	kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache);
 
 	del_timer_sync(&vcpu->arch.xen.poll_timer);
 }
@@ -1838,7 +1833,8 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 void kvm_xen_init_vm(struct kvm *kvm)
 {
 	idr_init(&kvm->arch.xen.evtchn_ports);
-	kvm_gpc_init(&kvm->arch.xen.shinfo_cache);
+	kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL, KVM_HOST_USES_PFN,
+		     PAGE_SIZE);
 }
 
 void kvm_xen_destroy_vm(struct kvm *kvm)
@@ -1846,7 +1842,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm)
 	struct evtchnfd *evtchnfd;
 	int i;
 
-	kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache);
+	kvm_gpc_deactivate(&kvm->arch.xen.shinfo_cache);
 
 	idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) {
 		if (!evtchnfd->deliver.port.port)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f687e56c24bc..024b8df5302c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1244,17 +1244,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
  * kvm_gpc_init - initialize gfn_to_pfn_cache.
  *
  * @gpc:	   struct gfn_to_pfn_cache object.
- *
- * This sets up a gfn_to_pfn_cache by initializing locks.
- */
-void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
-
-/**
- * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
- *                    physical address.
- *
  * @kvm:	   pointer to kvm instance.
- * @gpc:	   struct gfn_to_pfn_cache object.
  * @vcpu:	   vCPU to be used for marking pages dirty and to be woken on
  *		   invalidation.
  * @usage:	   indicates if the resulting host physical PFN is used while
@@ -1263,20 +1253,31 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
  *		   changes!---will also force @vcpu to exit the guest and
  *		   refresh the cache); and/or if the PFN used directly
  *		   by KVM (and thus needs a kernel virtual mapping).
- * @gpa:	   guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
  *
+ * This sets up a gfn_to_pfn_cache by initializing locks and assigning the
+ * immutable attributes.
+ */
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
+		  struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		  unsigned long len);
+
+/**
+ * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
+ *                    physical address.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ * @gpa:	   guest physical address to map.
+ *
  * @return:	   0 for success.
  *		   -EINVAL for a mapping which would cross a page boundary.
- *                 -EFAULT for an untranslatable guest physical address.
+ *		   -EFAULT for an untranslatable guest physical address.
  *
- * This primes a gfn_to_pfn_cache and links it into the @kvm's list for
+ * This primes a gfn_to_pfn_cache and links it into the @gpc->kvm's list for
  * invalidations to be processed.  Callers are required to use kvm_gpc_check()
  * to ensure that the cache is valid before accessing the target page.
  */
-int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-		     gpa_t gpa, unsigned long len);
+int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa);
 
 /**
  * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
@@ -1335,13 +1336,12 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  *
- * This removes a cache from the @kvm's list to be processed on MMU notifier
- * invocation.
+ * This removes a cache from the @gpc->kvm's list to be processed on MMU
+ * notifier invocation.
  */
-void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc);
 
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 3ca3db020e0e..d66b276d29e0 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -67,12 +67,14 @@ struct gfn_to_pfn_cache {
 	gpa_t gpa;
 	unsigned long uhva;
 	struct kvm_memory_slot *memslot;
+	struct kvm *kvm;
 	struct kvm_vcpu *vcpu;
 	struct list_head list;
 	rwlock_t lock;
 	struct mutex refresh_lock;
 	void *khva;
 	kvm_pfn_t pfn;
+	unsigned long len;
 	enum pfn_cache_usage usage;
 	bool active;
 	bool valid;
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 76f1b669cf28..56ca0e9c6ed7 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -346,44 +346,48 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_unmap);
 
-void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
+		  struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		  unsigned long len)
 {
+	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
+	WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu);
+
 	rwlock_init(&gpc->lock);
 	mutex_init(&gpc->refresh_lock);
+
+	gpc->kvm = kvm;
+	gpc->vcpu = vcpu;
+	gpc->usage = usage;
+	gpc->len = len;
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_init);
 
-int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-		     gpa_t gpa, unsigned long len)
+int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 {
-	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
-
 	if (!gpc->active) {
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
-		gpc->vcpu = vcpu;
-		gpc->usage = usage;
 		gpc->valid = false;
 		gpc->active = true;
 
-		spin_lock(&kvm->gpc_lock);
-		list_add(&gpc->list, &kvm->gpc_list);
-		spin_unlock(&kvm->gpc_lock);
+		spin_lock(&gpc->kvm->gpc_lock);
+		list_add(&gpc->list, &gpc->kvm->gpc_list);
+		spin_unlock(&gpc->kvm->gpc_lock);
 	}
-	return kvm_gpc_refresh(kvm, gpc, gpa, len);
+	return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
-void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
-		spin_lock(&kvm->gpc_lock);
+		spin_lock(&gpc->kvm->gpc_lock);
 		list_del(&gpc->list);
-		spin_unlock(&kvm->gpc_lock);
+		spin_unlock(&gpc->kvm->gpc_lock);
 
-		kvm_gpc_unmap(kvm, gpc);
+		kvm_gpc_unmap(gpc->kvm, gpc);
 		gpc->active = false;
 	}
 }
-- 
2.37.3


* [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check()
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (3 preceding siblings ...)
  2022-10-05 12:30       ` [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make kvm_gpc_check() use the kvm instance cached in gfn_to_pfn_cache.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  2 +-
 arch/x86/kvm/xen.c       | 14 ++++++--------
 include/linux/kvm_host.h |  4 +---
 virt/kvm/pfncache.c      |  5 ++---
 4 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ed8e4f8c9cf0..273f1ed3b5ae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3017,7 +3017,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 	unsigned long flags;
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
+	while (!kvm_gpc_check(gpc, gpc->gpa,
 			      offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 9b4b0e6e66e5..84b95258773a 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -217,7 +217,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 		user_len = sizeof(struct compat_vcpu_runstate_info);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) {
+	while (!kvm_gpc_check(gpc, gpc->gpa, user_len)) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/* When invoked from kvm_sched_out() we cannot sleep */
@@ -350,8 +350,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	 * little more honest about it.
 	 */
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
-			      sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
@@ -415,8 +414,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		     sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending));
 
 	read_lock_irqsave(&gpc->lock, flags);
-	while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa,
-			      sizeof(struct vcpu_info))) {
+	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
 		/*
@@ -957,7 +955,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 
 	read_lock_irqsave(&gpc->lock, flags);
 	idx = srcu_read_lock(&kvm->srcu);
-	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	ret = false;
@@ -1349,7 +1347,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	idx = srcu_read_lock(&kvm->srcu);
 
 	read_lock_irqsave(&gpc->lock, flags);
-	if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
+	if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE))
 		goto out_rcu;
 
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
@@ -1383,7 +1381,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		gpc = &vcpu->arch.xen.vcpu_info_cache;
 
 		read_lock_irqsave(&gpc->lock, flags);
-		if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) {
+		if (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * Could not access the vcpu_info. Set the bit in-kernel
 			 * and prod the vCPU to deliver it for itself.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 024b8df5302c..0b66d4889ec3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1282,7 +1282,6 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa);
 /**
  * kvm_gpc_check - check validity of a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  * @gpa:	   current guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
@@ -1297,8 +1296,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa);
  * Callers in IN_GUEST_MODE may do so without locking, although they should
  * still hold a read lock on kvm->scru for the memslot checks.
  */
-bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		   unsigned long len);
+bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len);
 
 /**
  * kvm_gpc_refresh - update a previously initialized cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 56ca0e9c6ed7..eb91025d7242 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -76,10 +76,9 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 	}
 }
 
-bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		   unsigned long len)
+bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 {
-	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memslots *slots = kvm_memslots(gpc->kvm);
 
 	if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE)
 		return false;
-- 
2.37.3


* [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry()
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (4 preceding siblings ...)
  2022-10-05 12:30       ` [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make hva_to_pfn_retry() use the kvm instance cached in gfn_to_pfn_cache.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 virt/kvm/pfncache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index eb91025d7242..a2c95e393e34 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -135,7 +135,7 @@ static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_s
 	return kvm->mmu_invalidate_seq != mmu_seq;
 }
 
-static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 {
 	/* Note, the new page offset may be different than the old! */
 	void *old_khva = gpc->khva - offset_in_page(gpc->khva);
@@ -155,7 +155,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 	gpc->valid = false;
 
 	do {
-		mmu_seq = kvm->mmu_invalidate_seq;
+		mmu_seq = gpc->kvm->mmu_invalidate_seq;
 		smp_rmb();
 
 		write_unlock_irq(&gpc->lock);
@@ -213,7 +213,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		 * attempting to refresh.
 		 */
 		WARN_ON_ONCE(gpc->valid);
-	} while (mmu_notifier_retry_cache(kvm, mmu_seq));
+	} while (mmu_notifier_retry_cache(gpc->kvm, mmu_seq));
 
 	gpc->valid = true;
 	gpc->pfn = new_pfn;
@@ -285,7 +285,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 	 * drop the lock and do the HVA to PFN lookup again.
 	 */
 	if (!gpc->valid || old_uhva != gpc->uhva) {
-		ret = hva_to_pfn_retry(kvm, gpc);
+		ret = hva_to_pfn_retry(gpc);
 	} else {
 		/* If the HVA→PFN mapping was already valid, don't unmap it. */
 		old_pfn = KVM_PFN_ERR_FAULT;
-- 
2.37.3


* [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap()
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (5 preceding siblings ...)
  2022-10-05 12:30       ` [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  2022-10-05 12:30       ` [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make kvm_gpc_refresh() use the kvm instance cached in gfn_to_pfn_cache.

The first argument of kvm_gpc_unmap() then becomes unneeded; remove it
from the function definition.
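
With this, the kvm parameter is gone from the whole gpc API, and the
canonical check/refresh loop, as seen in e.g. the
kvm_setup_guest_pvclock() and kvm_xen_update_runstate_guest() hunks
below, ends up as (simplified sketch; "len" stands for whatever length
the caller validates):

	read_lock_irqsave(&gpc->lock, flags);
	while (!kvm_gpc_check(gpc, gpc->gpa, len)) {
		read_unlock_irqrestore(&gpc->lock, flags);

		if (kvm_gpc_refresh(gpc, gpc->gpa, len))
			return;

		read_lock_irqsave(&gpc->lock, flags);
	}
	/* ... access gpc->khva while holding gpc->lock ... */
	read_unlock_irqrestore(&gpc->lock, flags);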

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  2 +-
 arch/x86/kvm/xen.c       | 10 ++++------
 include/linux/kvm_host.h | 10 ++++------
 virt/kvm/pfncache.c      | 11 +++++------
 4 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 273f1ed3b5ae..d24d1731d2bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3021,7 +3021,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 			      offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+		if (kvm_gpc_refresh(gpc, gpc->gpa,
 				    offset + sizeof(*guest_hv_clock)))
 			return;
 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 84b95258773a..bcaaa83290fb 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -224,7 +224,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 		if (state == RUNSTATE_runnable)
 			return;
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len))
+		if (kvm_gpc_refresh(gpc, gpc->gpa, user_len))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -353,8 +353,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
-				    sizeof(struct vcpu_info)))
+		if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -428,8 +427,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		if (in_atomic() || !task_is_running(current))
 			return 1;
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
-				    sizeof(struct vcpu_info))) {
+		if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * If this failed, userspace has screwed up the
 			 * vcpu_info mapping. No interrupts for you.
@@ -1479,7 +1477,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 			break;
 
 		idx = srcu_read_lock(&kvm->srcu);
-		rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
+		rc = kvm_gpc_refresh(gpc, gpc->gpa, PAGE_SIZE);
 		srcu_read_unlock(&kvm->srcu, idx);
 	} while(!rc);
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0b66d4889ec3..efead11bec84 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1301,35 +1301,33 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len);
 /**
  * kvm_gpc_refresh - update a previously initialized cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  * @gpa:	   updated guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
  *
  * @return:	   0 for success.
  *		   -EINVAL for a mapping which would cross a page boundary.
- *                 -EFAULT for an untranslatable guest physical address.
+ *		   -EFAULT for an untranslatable guest physical address.
  *
  * This will attempt to refresh a gfn_to_pfn_cache. Note that a successful
- * returm from this function does not mean the page can be immediately
+ * return from this function does not mean the page can be immediately
  * accessed because it may have raced with an invalidation. Callers must
  * still lock and check the cache status, as this function does not return
  * with the lock still held to permit access.
  */
-int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 		    unsigned long len);
 
 /**
  * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  *
  * This unmaps the referenced page. The cache is left in the invalid state
  * but at least the mapping from GPA to userspace HVA will remain cached
  * and can be reused on a subsequent refresh.
  */
-void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc);
 
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index a2c95e393e34..45b9b96c0ea3 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -234,10 +234,9 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 	return -EFAULT;
 }
 
-int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		    unsigned long len)
+int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 {
-	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memslots *slots = kvm_memslots(gpc->kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
 	kvm_pfn_t old_pfn, new_pfn;
 	unsigned long old_uhva;
@@ -318,7 +317,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_refresh);
 
-void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc)
 {
 	void *old_khva;
 	kvm_pfn_t old_pfn;
@@ -375,7 +374,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
 		spin_unlock(&gpc->kvm->gpc_lock);
 	}
-	return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len);
+	return kvm_gpc_refresh(gpc, gpa, gpc->len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
@@ -386,7 +385,7 @@ void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 		list_del(&gpc->list);
 		spin_unlock(&gpc->kvm->gpc_lock);
 
-		kvm_gpc_unmap(gpc->kvm, gpc);
+		kvm_gpc_unmap(gpc);
 		gpc->active = false;
 	}
 }
-- 
2.37.3


* [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
                         ` (6 preceding siblings ...)
  2022-10-05 12:30       ` [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
@ 2022-10-05 12:30       ` Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate()
resulting in a near-NULL pointer write.

1. Deactivate shinfo cache:

kvm_xen_hvm_set_attr
case KVM_XEN_ATTR_TYPE_SHARED_INFO
 kvm_gpc_deactivate
  kvm_gpc_unmap
   gpc->valid = false
   gpc->khva = NULL
  gpc->active = false

Result: active = false, valid = false

2. Cause cache refresh:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    kvm_gpc_check
    return -EWOULDBLOCK because !gpc->valid
   kvm_xen_set_evtchn_fast
    return -EWOULDBLOCK
   kvm_gpc_refresh
    hva_to_pfn_retry
     gpc->valid = true
     gpc->khva = ~0

Result: active = false, valid = true

3. Race ioctl KVM_XEN_HVM_EVTCHN_SEND against ioctl
KVM_XEN_ATTR_TYPE_SHARED_INFO:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    read_lock gpc->lock
                                          kvm_xen_hvm_set_attr case
                                          KVM_XEN_ATTR_TYPE_SHARED_INFO
                                           mutex_lock kvm->lock
                                           kvm_xen_shared_info_init
                                            kvm_gpc_activate
                                             gpc->khva = NULL
    kvm_gpc_check
     [ Check passes because gpc->valid is
       still true, even though gpc->khva
       is already NULL. ]
    shinfo = gpc->khva
    pending_bits = shinfo->evtchn_pending
    CRASH: test_and_set_bit(..., pending_bits)

Protect the writes to the cache properties in kvm_gpc_activate() by
taking gpc->lock for writing.
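
With the reset done under gpc->lock held for writing, a concurrent
kvm_xen_set_evtchn_fast() (simplified sketch below) either completes
before the reset or observes gpc->valid == false and bails out; it can
no longer see a stale gpc->valid paired with an already-NULLed
gpc->khva:

	read_lock_irqsave(&gpc->lock, flags);
	if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE))
		goto out_rcu;	/* after the reset: valid == false */
	shinfo = gpc->khva;	/* before the reset: khva still consistent */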

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Attaching more details:

[   86.127703] BUG: kernel NULL pointer dereference, address: 0000000000000800
[   86.127751] #PF: supervisor write access in kernel mode
[   86.127778] #PF: error_code(0x0002) - not-present page
[   86.127801] PGD 105792067 P4D 105792067 PUD 105793067 PMD 0
[   86.127826] Oops: 0002 [#1] PREEMPT SMP NOPTI
[   86.127850] CPU: 0 PID: 945 Comm: xen_shinfo_test Not tainted 6.0.0-rc5-test+ #31
[   86.127874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[   86.127898] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm
[ 86.127960] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06
All code
========
   0:	5a                   	pop    %rdx
   1:	84 c0                	test   %al,%al
   3:	0f 84 01 01 00 00    	je     0x10a
   9:	80 bb b4 9f 00 00 00 	cmpb   $0x0,0x9fb4(%rbx)
  10:	8b 45 00             	mov    0x0(%rbp),%eax
  13:	48 8b 93 c8 a0 00 00 	mov    0xa0c8(%rbx),%rdx
  1a:	75 4d                	jne    0x69
  1c:	41 89 c0             	mov    %eax,%r8d
  1f:	48 8d b2 80 08 00 00 	lea    0x880(%rdx),%rsi
  26:	41 c1 e8 05          	shr    $0x5,%r8d
  2a:*	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)		<-- trapping instruction
  31:	00 00
  33:	0f 82 19 01 00 00    	jb     0x152
  39:	8b 45 00             	mov    0x0(%rbp),%eax
  3c:	48 0f a3 06          	bt     %rax,(%rsi)

Code starting with the faulting instruction
===========================================
   0:	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)
   7:	00 00
   9:	0f 82 19 01 00 00    	jb     0x128
   f:	8b 45 00             	mov    0x0(%rbp),%eax
  12:	48 0f a3 06          	bt     %rax,(%rsi)
[   86.127982] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046
[   86.128001] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001
[   86.128021] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66
[   86.128040] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97
[   86.128060] R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001
[   86.128079] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830
[   86.128098] FS:  00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
[   86.128118] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   86.128138] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0
[   86.128158] PKRU: 55555554
[   86.128177] Call Trace:
[   86.128196]  <TASK>
[   86.128215] kvm_xen_hvm_evtchn_send (arch/x86/kvm/xen.c:1432 arch/x86/kvm/xen.c:1562) kvm
[   86.128256] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm
[   86.128294] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007)
[   86.128315] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007)
[   86.128335] ? kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm
[   86.128368] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm
[   86.128401] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710)
[   86.128422] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[   86.128442] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[   86.128462] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[   86.128482] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128501] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128520] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[   86.128539] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128558] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128577] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128596] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128615] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128634] ? do_syscall_64 (arch/x86/entry/common.c:87)
[   86.128653] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[   86.128673] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[   86.128692] RIP: 0033:0x7f71d6152c6b
[ 86.128712] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
All code
========
   0:	73 01                	jae    0x3
   2:	c3                   	ret
   3:	48 8b 0d b5 b1 1b 00 	mov    0x1bb1b5(%rip),%rcx        # 0x1bb1bf
   a:	f7 d8                	neg    %eax
   c:	64 89 01             	mov    %eax,%fs:(%rcx)
   f:	48 83 c8 ff          	or     $0xffffffffffffffff,%rax
  13:	c3                   	ret
  14:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  1b:	00 00 00
  1e:	90                   	nop
  1f:	f3 0f 1e fa          	endbr64
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb1bf
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	ret
   9:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb195
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
[   86.128735] RSP: 002b:00007fff7c716c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   86.128755] RAX: ffffffffffffffda RBX: 00007f71d61116c0 RCX: 00007f71d6152c6b
[   86.128775] RDX: 00007fff7c716f00 RSI: 00000000400caed0 RDI: 0000000000000004
[   86.128794] RBP: 0000000001f2b2a0 R08: 0000000000417343 R09: 00007f71d62cc341
[   86.128814] R10: 00007fff7c74c258 R11: 0000000000000246 R12: 00007f71d6329000
[   86.128833] R13: 00000000632870b9 R14: 0000000000000001 R15: 00007f71d632a020
[   86.128854]  </TASK>
[   86.128873] Modules linked in: kvm_intel 9p fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr sunrpc intel_rapl_msr intel_rapl_common kvm irqbypass rapl pcspkr 9pnet_virtio i2c_piix4 9pnet drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_console serio_raw ata_generic virtio_blk pata_acpi qemu_fw_cfg ipmi_devintf ipmi_msghandler fuse
[   86.128893] Unloaded tainted modules: kvm_intel():1 [last unloaded: kvm_intel]
[   86.128944] CR2: 0000000000000800
[   86.128962] ---[ end trace 0000000000000000 ]---
[   86.128982] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm
[ 86.129022] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06
All code
========
   0:	5a                   	pop    %rdx
   1:	84 c0                	test   %al,%al
   3:	0f 84 01 01 00 00    	je     0x10a
   9:	80 bb b4 9f 00 00 00 	cmpb   $0x0,0x9fb4(%rbx)
  10:	8b 45 00             	mov    0x0(%rbp),%eax
  13:	48 8b 93 c8 a0 00 00 	mov    0xa0c8(%rbx),%rdx
  1a:	75 4d                	jne    0x69
  1c:	41 89 c0             	mov    %eax,%r8d
  1f:	48 8d b2 80 08 00 00 	lea    0x880(%rdx),%rsi
  26:	41 c1 e8 05          	shr    $0x5,%r8d
  2a:*	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)		<-- trapping instruction
  31:	00 00
  33:	0f 82 19 01 00 00    	jb     0x152
  39:	8b 45 00             	mov    0x0(%rbp),%eax
  3c:	48 0f a3 06          	bt     %rax,(%rsi)

Code starting with the faulting instruction
===========================================
   0:	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)
   7:	00 00
   9:	0f 82 19 01 00 00    	jb     0x128
   f:	8b 45 00             	mov    0x0(%rbp),%eax
  12:	48 0f a3 06          	bt     %rax,(%rsi)
[   86.129044] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046
[   86.129064] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001
[   86.129083] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66
[   86.129103] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97
[   86.129122] R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001
[   86.129142] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830
[   86.129161] FS:  00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
[   86.129181] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   86.129201] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0
[   86.129221] PKRU: 55555554
[   86.129240] note: xen_shinfo_test[945] exited with preempt_count 1
[  151.131754] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  151.131793] rcu: 	1-...0: (13 ticks this GP) idle=785c/1/0x4000000000000000 softirq=4758/4760 fqs=16038
[  151.131816] 	(detected by 2, t=65002 jiffies, g=6449, q=1299 ncpus=4)
[  151.131837] Sending NMI from CPU 2 to CPUs 1:
[  151.131862] NMI backtrace for cpu 1
[  151.131864] CPU: 1 PID: 949 Comm: xen_shinfo_test Tainted: G      D            6.0.0-rc5-test+ #31
[  151.131866] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[  151.131866] RIP: 0010:queued_write_lock_slowpath (kernel/locking/qrwlock.c:85)
[ 151.131870] Code: ff 90 48 89 df 5b 5d e9 86 fd ff ff f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 8b 03 <3d> 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 48 89 ef
All code
========
   0:	ff 90 48 89 df 5b    	call   *0x5bdf8948(%rax)
   6:	5d                   	pop    %rbp
   7:	e9 86 fd ff ff       	jmp    0xfffffffffffffd92
   c:	f0 81 0b 00 01 00 00 	lock orl $0x100,(%rbx)
  13:	ba ff 00 00 00       	mov    $0xff,%edx
  18:	b9 00 01 00 00       	mov    $0x100,%ecx
  1d:	8b 03                	mov    (%rbx),%eax
  1f:	3d 00 01 00 00       	cmp    $0x100,%eax
  24:	74 0b                	je     0x31
  26:	f3 90                	pause
  28:	8b 03                	mov    (%rbx),%eax
  2a:*	3d 00 01 00 00       	cmp    $0x100,%eax		<-- trapping instruction
  2f:	75 f5                	jne    0x26
  31:	89 c8                	mov    %ecx,%eax
  33:	f0 0f b1 13          	lock cmpxchg %edx,(%rbx)
  37:	74 c0                	je     0xfffffffffffffff9
  39:	eb e2                	jmp    0x1d
  3b:	89 c6                	mov    %eax,%esi
  3d:	48 89 ef             	mov    %rbp,%rdi

Code starting with the faulting instruction
===========================================
   0:	3d 00 01 00 00       	cmp    $0x100,%eax
   5:	75 f5                	jne    0xfffffffffffffffc
   7:	89 c8                	mov    %ecx,%eax
   9:	f0 0f b1 13          	lock cmpxchg %edx,(%rbx)
   d:	74 c0                	je     0xffffffffffffffcf
   f:	eb e2                	jmp    0xfffffffffffffff3
  11:	89 c6                	mov    %eax,%esi
  13:	48 89 ef             	mov    %rbp,%rdi
[  151.131871] RSP: 0018:ffffc900011f7b98 EFLAGS: 00000006
[  151.131872] RAX: 0000000000000300 RBX: ffffc90001372ff8 RCX: 0000000000000100
[  151.131872] RDX: 00000000000000ff RSI: ffffffff827ea0a3 RDI: ffffffff82886a66
[  151.131873] RBP: ffffc90001372ffc R08: ffff888107864160 R09: 00000000777cbcf6
[  151.131873] R10: ffff888107863300 R11: 000000006777cbcf R12: ffffc90001372ff8
[  151.131874] R13: ffffc90001369000 R14: ffffc90001372fb8 R15: ffffc90001369170
[  151.131874] FS:  00007f71d5f09640(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000
[  151.131875] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  151.131876] CR2: 00007f71d5f08ef8 CR3: 0000000104774002 CR4: 0000000000772ee0
[  151.131877] PKRU: 55555554
[  151.131878] Call Trace:
[  151.131879]  <TASK>
[  151.131880] do_raw_write_lock (./include/asm-generic/qrwlock.h:101 kernel/locking/spinlock_debug.c:210)
[  151.131883] kvm_gpc_refresh (arch/x86/kvm/../../../virt/kvm/pfncache.c:262) kvm
[  151.131907] ? kvm_gpc_activate (./include/linux/spinlock.h:390 arch/x86/kvm/../../../virt/kvm/pfncache.c:375) kvm
[  151.131925] kvm_xen_hvm_set_attr (arch/x86/kvm/xen.c:51 arch/x86/kvm/xen.c:464) kvm
[  151.131950] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm
[  151.131971] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007)
[  151.131973] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm
[  151.131991] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710)
[  151.131993] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[  151.131995] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[  151.131997] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[  151.131998] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710)
[  151.132000] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132000] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[  151.132002] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132003] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132003] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[  151.132004] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132005] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132006] ? do_syscall_64 (arch/x86/entry/common.c:87)
[  151.132007] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383)
[  151.132008] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[  151.132009] RIP: 0033:0x7f71d6152c6b
[ 151.132011] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
All code
========
   0:	73 01                	jae    0x3
   2:	c3                   	ret
   3:	48 8b 0d b5 b1 1b 00 	mov    0x1bb1b5(%rip),%rcx        # 0x1bb1bf
   a:	f7 d8                	neg    %eax
   c:	64 89 01             	mov    %eax,%fs:(%rcx)
   f:	48 83 c8 ff          	or     $0xffffffffffffffff,%rax
  13:	c3                   	ret
  14:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  1b:	00 00 00
  1e:	90                   	nop
  1f:	f3 0f 1e fa          	endbr64
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	ret
  33:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb1bf
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	ret
   9:	48 8b 0d 85 b1 1b 00 	mov    0x1bb185(%rip),%rcx        # 0x1bb195
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
[  151.132011] RSP: 002b:00007f71d5f08da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  151.132012] RAX: ffffffffffffffda RBX: 0000000001f2b2a0 RCX: 00007f71d6152c6b
[  151.132013] RDX: 00007f71d5f08db0 RSI: 000000004048aec9 RDI: 0000000000000004
[  151.132013] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fff7c716b3f
[  151.132013] R10: 00007f71d611c948 R11: 0000000000000246 R12: 00007f71d5f09640
[  151.132014] R13: 0000000000000004 R14: 00007f71d61b3550 R15: 0000000000000000
[  151.132016]  </TASK>

 virt/kvm/pfncache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 45b9b96c0ea3..e987669c3506 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -364,11 +364,13 @@ EXPORT_SYMBOL_GPL(kvm_gpc_init);
 int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 {
 	if (!gpc->active) {
+		write_lock_irq(&gpc->lock);
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
 		gpc->valid = false;
 		gpc->active = true;
+		write_unlock_irq(&gpc->lock);
 
 		spin_lock(&gpc->kvm->gpc_lock);
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
-- 
2.37.3



* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-09-21  2:01       ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
@ 2022-10-10 23:28         ` Sean Christopherson
  2022-10-13  0:22           ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-10-10 23:28 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Wed, Sep 21, 2022, Michal Luczaj wrote:
> There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate()
> resulting in a near-NULL pointer write.
> 
> 1. Deactivate shinfo cache:
> 
> kvm_xen_hvm_set_attr
> case KVM_XEN_ATTR_TYPE_SHARED_INFO
>  kvm_gpc_deactivate
>   kvm_gpc_unmap
>    gpc->valid = false
>    gpc->khva = NULL
>   gpc->active = false
> 
> Result: active = false, valid = false
> 
> 2. Cause cache refresh:
> 
> kvm_arch_vm_ioctl
> case KVM_XEN_HVM_EVTCHN_SEND
>  kvm_xen_hvm_evtchn_send
>   kvm_xen_set_evtchn
>    kvm_xen_set_evtchn_fast
>     kvm_gpc_check
>     return -EWOULDBLOCK because !gpc->valid
>    kvm_xen_set_evtchn_fast
>     return -EWOULDBLOCK
>    kvm_gpc_refresh
>     hva_to_pfn_retry
>      gpc->valid = true
>      gpc->khva = not NULL
> 
> Result: active = false, valid = true

This is the real bug.  KVM should not successfully refresh an inactive cache.
It's not just the potential NULL pointer deref: the cache also isn't on the
list of active caches, i.e. it won't get mmu_notifier events, and so KVM could
end up with a use-after-free of userspace memory.

KVM_XEN_HVM_EVTCHN_SEND does check that the per-vCPU cache is active, but does so
outside of the gpc->lock.
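
For illustration, the buggy shape is roughly this (a simplified sketch, not
the literal call chain):

	if (!gpc->active)		/* unlocked check */
		return -EINVAL;

	/* kvm_gpc_deactivate() can complete here and invalidate the cache */

	read_lock_irqsave(&gpc->lock, flags);
	/* gpc->khva may now be NULL or point at an unmapped page */
	read_unlock_irqrestore(&gpc->lock, flags);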

Minus your race condition analysis, which I'll insert into the changelog (assuming
this works), I believe the proper fix is to check "active" during check and refresh.
Oof, and there are ordering bugs too.  Compile-tested patch below.

If this fixes things on your end (I'll properly test tomorrow too), I'll post a
v2 of the entire series.  There are some cleanups that can be done on top, e.g.
I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
because it's not at all obvious that it's (a) necessary and (b) has desirable
behavior.

Note, the below patch applies after patch 1 of this series.  I don't know if anyone
will actually want to backport the fix, but it's not too hard to keep the backport
dependency to just patch 1.

--
From: Sean Christopherson <seanjc@google.com>
Date: Mon, 10 Oct 2022 13:06:13 -0700
Subject: [PATCH] KVM: Reject attempts to consume or refresh inactive
 gfn_to_pfn_cache

Reject kvm_gpc_check() and kvm_gpc_refresh() if the cache is inactive.
Not checking the active flag during refresh is particularly egregious, as
KVM can end up with a valid, inactive cache, which can lead to a variety
of use-after-free bugs, e.g. consuming a NULL kernel pointer or missing
an mmu_notifier invalidation due to the cache not being on the list of
gfns to invalidate.

Note, "active" needs to be set if and only if the cache is on the list
of caches, i.e. is reachable via mmu_notifier events.  If a relevant
mmu_notifier event occurs while the cache is "active" but not on the
list, KVM will not acquire the cache's lock and so will not serialize
the mmu_notifier event with active users and/or kvm_gpc_refresh().

A race between KVM_XEN_ATTR_TYPE_SHARED_INFO and KVM_XEN_HVM_EVTCHN_SEND
can be exploited to trigger the bug.

<will insert your awesome race analysis>

Reported-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/pfncache.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index b32ed4a7c900..dfc72aa88d71 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -81,6 +81,9 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 
+	if (!gpc->active)
+		return false;
+
 	if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE)
 		return false;
 
@@ -240,8 +243,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
-	kvm_pfn_t old_pfn, new_pfn;
+	bool unmap_old = false;
 	unsigned long old_uhva;
+	kvm_pfn_t old_pfn;
 	void *old_khva;
 	int ret = 0;
 
@@ -261,6 +265,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	write_lock_irq(&gpc->lock);
 
+	if (!gpc->active)
+		goto out_unlock;
+
 	old_pfn = gpc->pfn;
 	old_khva = gpc->khva - offset_in_page(gpc->khva);
 	old_uhva = gpc->uhva;
@@ -305,14 +312,15 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		gpc->khva = NULL;
 	}
 
-	/* Snapshot the new pfn before dropping the lock! */
-	new_pfn = gpc->pfn;
+	/* Detect a pfn change before dropping the lock! */
+	unmap_old = (old_pfn != gpc->pfn);
 
+out_unlock:
 	write_unlock_irq(&gpc->lock);
 
 	mutex_unlock(&gpc->refresh_lock);
 
-	if (old_pfn != new_pfn)
+	if (unmap_old)
 		gpc_unmap_khva(kvm, old_pfn, old_khva);
 
 	return ret;
@@ -368,11 +376,19 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		gpc->vcpu = vcpu;
 		gpc->usage = usage;
 		gpc->valid = false;
-		gpc->active = true;
 
 		spin_lock(&kvm->gpc_lock);
 		list_add(&gpc->list, &kvm->gpc_list);
 		spin_unlock(&kvm->gpc_lock);
+
+		/*
+		 * Activate the cache after adding it to the list, a concurrent
+		 * refresh must not establish a mapping until the cache is
+		 * reachable by mmu_notifier events.
+		 */
+		write_lock_irq(&gpc->lock);
+		gpc->active = true;
+		write_unlock_irq(&gpc->lock);
 	}
 	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
 }
@@ -381,12 +397,20 @@ EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
+		/*
+		 * Deactivate the cache before removing it from the list, KVM
+		 * must stall mmu_notifier events until all users go away, i.e.
+		 * until gpc->lock is dropped and refresh is guaranteed to fail.
+		 */
+		write_lock_irq(&gpc->lock);
+		gpc->active = false;
+		write_unlock_irq(&gpc->lock);
+
 		spin_lock(&kvm->gpc_lock);
 		list_del(&gpc->list);
 		spin_unlock(&kvm->gpc_lock);
 
 		kvm_gfn_to_pfn_cache_unmap(kvm, gpc);
-		gpc->active = false;
 	}
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);

base-commit: 09e5b3d617d28e3011253370f827151cc6cba6ad
-- 



* Re: [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache
  2022-09-21  2:01       ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
@ 2022-10-10 23:38         ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-10-10 23:38 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Wed, Sep 21, 2022, Michal Luczaj wrote:
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f4519d3689e1..9fd67026d91a 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1241,8 +1241,17 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
>  void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
>  
>  /**
> - * kvm_gfn_to_pfn_cache_init - prepare a cached kernel mapping and HPA for a
> - *                             given guest physical address.
> + * kvm_gpc_init - initialize gfn_to_pfn_cache.
> + *
> + * @gpc:	   struct gfn_to_pfn_cache object.
> + *
> + * This sets up a gfn_to_pfn_cache by initializing locks.

I think it makes sense to add a blurb calling out that the cache needs to be
zeroed before init, i.e. needs to be zero-allocated.  I'll add it in v2.
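
Something like this for the usage side (just a sketch; caches embedded in a
zero-allocated struct get the zeroing for free):

	/*
	 * Sketch: kvm_gpc_init() only initializes the locks and assumes all
	 * other fields are zero, so the cache must be zero-allocated (or
	 * explicitly memset) before init.
	 */
	struct gfn_to_pfn_cache *gpc = kzalloc(sizeof(*gpc), GFP_KERNEL_ACCOUNT);

	if (!gpc)
		return -ENOMEM;

	kvm_gpc_init(gpc);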


* Re: [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties
  2022-09-21  2:01       ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
@ 2022-10-10 23:42         ` Sean Christopherson
  2022-10-11  0:37         ` Sean Christopherson
  1 sibling, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-10-10 23:42 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Wed, Sep 21, 2022, Michal Luczaj wrote:
> +int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
>  {
> -	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
> -

I think it's worth grabbing "kvm" in a local variable.  It minimizes the diff,
and we've had cleanup in other areas of KVM to replace repeated pointer chasing
with a local variable.  I'll do this in v2 as well.
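
I.e. something like this (sketch only, elided bits marked):

	int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
	{
		struct kvm *kvm = gpc->kvm;	/* chase the pointer once */

		/* ... */
		spin_lock(&kvm->gpc_lock);
		list_add(&gpc->list, &kvm->gpc_list);
		spin_unlock(&kvm->gpc_lock);
		/* ... */
		return kvm_gpc_refresh(kvm, gpc, gpa, gpc->len);
	}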


* Re: [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties
  2022-09-21  2:01       ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
  2022-10-10 23:42         ` Sean Christopherson
@ 2022-10-11  0:37         ` Sean Christopherson
  1 sibling, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-10-11  0:37 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Wed, Sep 21, 2022, Michal Luczaj wrote:
> Move the assignment of immutable properties @kvm, @vcpu, @usage, @len
> to the initializer.  Make _init(), _activate() and _deactivate() use
> stored values.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> @@ -1818,9 +1810,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
>  
>  	timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
>  
> -	kvm_gpc_init(&vcpu->arch.xen.runstate_cache);
> -	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache);
> -	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache);
> +	kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL,
> +		     KVM_HOST_USES_PFN, sizeof(struct vcpu_runstate_info));
> +	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL,
> +		     KVM_HOST_USES_PFN, sizeof(struct vcpu_info));

Argh, KVM Xen doesn't actually treat these two as immutable values.  I suspect
you encountered this as well since check() and refresh() didn't drop @len.  I'm
99% certain we can still make the length immutable, it'll just require a bit of
massaging, i.e. extra patches.

The vcpu_info_cache length is effectively immutable, the use of the common helper
kvm_setup_guest_pvclock() just makes it less obvious.  This can be addressed by
making the param a "max_len" or "max_size" or whatever, e.g. so that accessing a
subset still verifies the cache as a whole.
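
Roughly like this ("max_len" being a hypothetical name, and just a sketch;
the real helper would also check things like the memslot generation):

	bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
	{
		if (!gpc->active || !gpc->valid || gpc->gpa != gpa)
			return false;

		/* Validate the full window, not just the caller's subset. */
		return (gpa & ~PAGE_MASK) + gpc->max_len <= PAGE_SIZE;
	}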

The runstate_cache is messier since it actually consumes two different sizes, but
that's arguably a bug that was introduced by commit a795cd43c5b5 ("KVM: x86/xen:
Use gfn_to_pfn_cache for runstate area").  Prior to that, KVM always used the larger
non-compat size.  And KVM still uses the larger size during activation, i.e. KVM
will "fail" activation but allow a sneaky 32-bit guest to access a rejected struct
sitting at the very end of the page.  I'm pretty sure that hole can be fixed without
breaking KVM's ABI.

With those addressed, the max length becomes immutable and @len can be dropped.
I'll fiddle with this tomorrow and hopefully include patches for that in v2.

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index b2be60c6efa4..9e79ef2cca99 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -212,10 +212,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
        if (!vx->runstate_cache.active)
                return;
 
-       if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode)
-               user_len = sizeof(struct vcpu_runstate_info);
-       else
-               user_len = sizeof(struct compat_vcpu_runstate_info);
+       user_len = sizeof(struct vcpu_runstate_info);
 
        read_lock_irqsave(&gpc->lock, flags);
        while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,


* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-10 23:28         ` Sean Christopherson
@ 2022-10-13  0:22           ` Sean Christopherson
  2022-10-13 20:28             ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-10-13  0:22 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Mon, Oct 10, 2022, Sean Christopherson wrote:
> On Wed, Sep 21, 2022, Michal Luczaj wrote:
> If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> v2 of the entire series.  There are some cleanups that can be done on top, e.g.
> I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> because it's not at all obvious that it's (a) necessary and (b) has desirable
> behavior.

Sorry for the delay, I initially missed that you included a selftest for the race
in the original RFC.  The kernel is no longer exploding, but the test is intermittently
soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.


* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-13  0:22           ` Sean Christopherson
@ 2022-10-13 20:28             ` Sean Christopherson
  2022-10-20 15:59               ` Michal Luczaj
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-10-13 20:28 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Thu, Oct 13, 2022, Sean Christopherson wrote:
> On Mon, Oct 10, 2022, Sean Christopherson wrote:
> > On Wed, Sep 21, 2022, Michal Luczaj wrote:
> > If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> > v2 of the entire series.  There are some cleanups that can be done on top, e.g.
> > I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> > because it's not at all obvious that it's (a) necessary and (b) has desirable
> > behavior.
> 
> Sorry for the delay, I initially missed that you included a selftest for the race
> in the original RFC.  The kernel is no longer exploding, but the test is intermittently
> soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.

Ended up being a test bug (technically).  KVM drops the timer IRQ if the shared
info page is invalid.  As annoying as that is, there isn't really a better
option, and invalidating a shared page while vCPUs are running really is a VMM
bug.

To fix, I added an intermediate stage in the test that re-arms the timer if the
IRQ doesn't arrive in a reasonable amount of time.
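
Roughly (a sketch with assumed helper names, not the final test code):

	/*
	 * KVM legitimately drops the timer IRQ if shinfo is invalid when the
	 * timer fires, so bound the wait and re-arm instead of hanging.
	 */
	while (!guest_saw_irq) {
		arm_timer(100 /* ms */);	/* assumed helper */

		for (i = 0; i < 100000 && !guest_saw_irq; i++)
			asm volatile("pause");
	}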

Patches incoming...


* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-13 20:28             ` Sean Christopherson
@ 2022-10-20 15:59               ` Michal Luczaj
  2022-10-20 16:58                 ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Michal Luczaj @ 2022-10-20 15:59 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, pbonzini

On 10/13/22 22:28, Sean Christopherson wrote:
> On Thu, Oct 13, 2022, Sean Christopherson wrote:
>> On Mon, Oct 10, 2022, Sean Christopherson wrote:
>>> On Wed, Sep 21, 2022, Michal Luczaj wrote:
>>> If this fixes things on your end (I'll properly test tomorrow too), I'll post a
>>> v2 of the entire series.  There are some cleanups that can be done on top, e.g.
>>> I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
>>> because it's not at all obvious that it's (a) necessary and (b) has desirable
>>> behavior.
>>
>> Sorry for the delay, I initially missed that you included a selftest for the race
>> in the original RFC.  The kernel is no longer exploding, but the test is intermittently
>> soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.
> 
> Ended up being a test bug (technically).  KVM drops the timer IRQ if the shared
> > info page is invalid.  As annoying as that is, there isn't really a better
> option, and invalidating a shared page while vCPUs are running really is a VMM
> bug.
> 
> To fix, I added an intermediate stage in the test that re-arms the timer if the
> IRQ doesn't arrive in a reasonable amount of time.
> 
> Patches incoming...

Sorry for the late reply, I was away.
Thank you for the whole v2 series. And I'm glad you've found my testcase
useful, even if a bit buggy ;)

Speaking of SCHEDOP_poll, are Xen vmcalls considered trusted?
I've noticed that kvm_xen_schedop_poll() fetches guest-provided
sched_poll.ports without checking if the values are sane. Then, in
wait_pending_event(), there's test_bit(ports[i], pending_bits) which
(for some high ports[i] values) results in KASAN complaining about
"use-after-free":

[   36.463417] ==================================================================
[   36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.463962] Read of size 8 at addr ffff88815b87b800 by task xen_shinfo_test/956
[   36.464149] CPU: 1 PID: 956 Comm: xen_shinfo_test Not tainted 6.1.0-rc1-kasan+ #49
[   36.464252] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[   36.464259] Call Trace:
[   36.464259]  <TASK>
[   36.464259]  dump_stack_lvl+0x5b/0x73
[   36.464259]  print_report+0x17f/0x477
[   36.464259]  ? __virt_addr_valid+0xd5/0x150
[   36.464259]  ? kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  ? kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  kasan_report+0xbb/0xf0
[   36.464259]  ? kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  kasan_check_range+0x136/0x1b0
[   36.464259]  kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  ? kvm_xen_set_evtchn.part.0+0x190/0x190 [kvm]
[   36.464259]  ? get_kvmclock+0x86/0x360 [kvm]
[   36.464259]  ? pvclock_clocksource_read+0x13a/0x190
[   36.464259]  kvm_emulate_hypercall+0x1d7/0x860 [kvm]
[   36.464259]  ? get_kvmclock+0x151/0x360 [kvm]
[   36.464259]  ? kvm_fast_pio+0x260/0x260 [kvm]
[   36.464259]  ? kvm_post_set_cr4+0xf0/0xf0 [kvm]
[   36.464259]  ? lock_release+0x9c/0x430
[   36.464259]  ? rcu_qs+0x2b/0xb0
[   36.464259]  ? rcu_note_context_switch+0x18e/0x9b0
[   36.464259]  ? rcu_read_lock_sched_held+0x10/0x70
[   36.464259]  ? lock_acquire+0xb1/0x3d0
[   36.464259]  ? vmx_vmexit+0x6c/0x19d [kvm_intel]
[   36.464259]  ? vmx_vmexit+0x8d/0x19d [kvm_intel]
[   36.464259]  ? rcu_read_lock_sched_held+0x10/0x70
[   36.464259]  ? lock_acquire+0xb1/0x3d0
[   36.464259]  ? lock_downgrade+0x380/0x380
[   36.464259]  ? vmx_vcpu_run+0x5bf/0x1260 [kvm_intel]
[   36.464259]  vmx_handle_exit+0x295/0xa50 [kvm_intel]
[   36.464259]  vcpu_enter_guest.constprop.0+0x1436/0x1ed0 [kvm]
[   36.464259]  ? kvm_check_and_inject_events+0x800/0x800 [kvm]
[   36.464259]  ? lock_downgrade+0x380/0x380
[   36.464259]  ? __blkcg_punt_bio_submit+0xd0/0xd0
[   36.464259]  ? kvm_arch_vcpu_ioctl_run+0xa46/0xf70 [kvm]
[   36.464259]  ? unlock_page_memcg+0x1e0/0x1e0
[   36.464259]  ? __local_bh_enable_ip+0x8f/0x100
[   36.464259]  ? trace_hardirqs_on+0x2d/0xf0
[   36.464259]  ? fpu_swap_kvm_fpstate+0xbd/0x1c0
[   36.464259]  ? kvm_arch_vcpu_ioctl_run+0x418/0xf70 [kvm]
[   36.464259]  kvm_arch_vcpu_ioctl_run+0x418/0xf70 [kvm]
[   36.464259]  kvm_vcpu_ioctl+0x332/0x8f0 [kvm]
[   36.464259]  ? kvm_clear_dirty_log_protect+0x430/0x430 [kvm]
[   36.464259]  ? do_vfs_ioctl+0x951/0xbf0
[   36.464259]  ? vfs_fileattr_set+0x480/0x480
[   36.464259]  ? kernel_write+0x360/0x360
[   36.464259]  ? selinux_inode_getsecctx+0x50/0x50
[   36.464259]  ? ioctl_has_perm.constprop.0.isra.0+0x133/0x200
[   36.464259]  ? selinux_inode_getsecctx+0x50/0x50
[   36.464259]  ? ksys_write+0xc4/0x140
[   36.464259]  ? __ia32_sys_read+0x40/0x40
[   36.464259]  ? lockdep_hardirqs_on_prepare+0xe/0x220
[   36.464259]  __x64_sys_ioctl+0xb8/0xf0
[   36.464259]  do_syscall_64+0x55/0x80
[   36.464259]  ? lockdep_hardirqs_on_prepare+0xe/0x220
[   36.464259]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[   36.464259] RIP: 0033:0x7f81e303ec6b
[   36.464259] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
[   36.464259] RSP: 002b:00007fff72009028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.464259] RAX: ffffffffffffffda RBX: 00007f81e33936c0 RCX: 00007f81e303ec6b
[   36.464259] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000007
[   36.464259] RBP: 00000000008457b0 R08: 0000000000000000 R09: 00000000004029f6
[   36.464259] R10: 00007f81e31b838b R11: 0000000000000246 R12: 00007f81e33a4000
[   36.464259] R13: 00007f81e33a2000 R14: 0000000000000000 R15: 00007f81e33a3020
[   36.464259]  </TASK>
[   36.464259] The buggy address belongs to the physical page:
[   36.464259] page:00000000d9b176a3 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x15b87b
[   36.464259] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[   36.464259] raw: 0017ffffc0000000 ffffea0005504e48 ffffea00056cf688 0000000000000000
[   36.464259] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000
[   36.464259] page dumped because: kasan: bad access detected
[   36.464259] Memory state around the buggy address:
[   36.464259]  ffff88815b87b700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259]  ffff88815b87b780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259] >ffff88815b87b800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259]                    ^
[   36.464259]  ffff88815b87b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259]  ffff88815b87b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259] ==================================================================

I can't reproduce it under non-KASAN build, I'm not sure what's going on.

Anyway, here's the testcase, applies after your selftests patches in
https://lore.kernel.org/kvm/20221013211234.1318131-1-seanjc@google.com/

---
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 2a5727188c8d..402e3d7b86b0 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -375,6 +375,29 @@ static void guest_code(void)
 	guest_saw_irq = false;
 
 	GUEST_SYNC(24);
+	/* Terminate racer thread */
+
+	GUEST_SYNC(25);
+	/* Test SCHEDOP_poll out-of-bounds read */
+
+	p = (struct sched_poll) {
+		.ports = ports,
+		.nr_ports = 1,
+		.timeout = 1
+	};
+
+	ports[0] = (PAGE_SIZE*2) << 3;
+	for (i = 0; i < 0x1000; i++) {
+		asm volatile("vmcall"
+			     : "=a" (rax)
+			     : "a" (__HYPERVISOR_sched_op),
+			       "D" (SCHEDOP_poll),
+			       "S" (&p)
+			     : "memory");
+		ports[0] += PAGE_SIZE << 3;
+	}
+
+	GUEST_SYNC(26);
 }
 
 static int cmp_timespec(struct timespec *a, struct timespec *b)
@@ -925,6 +948,21 @@ int main(int argc, char *argv[])
 
 				ret = pthread_join(thread, 0);
 				TEST_ASSERT(ret == 0, "pthread_join() failed: %s", strerror(ret));
+
+				/* shinfo juggling done; reset to a valid GFN. */
+				struct kvm_xen_hvm_attr ha = {
+					.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+					.u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE,
+				};
+				vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &ha);
+				break;
+
+			case 25:
+				if (verbose)
+					printf("Testing SCHEDOP_poll out-of-bounds read\n");
+				break;
+
+			case 26:
 				goto done;
 
 			case 0x20:
-- 
2.38.0



* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-20 15:59               ` Michal Luczaj
@ 2022-10-20 16:58                 ` Sean Christopherson
  2022-10-21  2:39                   ` Michal Luczaj
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-10-20 16:58 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Thu, Oct 20, 2022, Michal Luczaj wrote:
> On 10/13/22 22:28, Sean Christopherson wrote:
> > On Thu, Oct 13, 2022, Sean Christopherson wrote:
> >> On Mon, Oct 10, 2022, Sean Christopherson wrote:
> >>> On Wed, Sep 21, 2022, Michal Luczaj wrote:
> >>> If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> >>> v2 of the entire series.  There are some cleanups that can be done on top, e.g.
> >>> I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> >>> because it's not at all obvious that it's (a) necessary and (b) has desirable
> >>> behavior.
> >>
> >> Sorry for the delay, I initially missed that you included a selftest for the race
> >> in the original RFC.  The kernel is no longer exploding, but the test is intermittently
> >> soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.
> > 
> > Ended up being a test bug (technically).  KVM drops the timer IRQ if the shared
> > > info page is invalid.  As annoying as that is, there isn't really a better
> > option, and invalidating a shared page while vCPUs are running really is a VMM
> > bug.
> > 
> > To fix, I added an intermediate stage in the test that re-arms the timer if the
> > IRQ doesn't arrive in a reasonable amount of time.
> > 
> > Patches incoming...
> 
> Sorry for the late reply, I was away.
> Thank you for the whole v2 series. And I'm glad you've found my testcase
> useful, even if a bit buggy ;)
> 
> Speaking of SCHEDOP_poll, are Xen vmcalls considered trusted?

I highly doubt they are trusted.

> I've noticed that kvm_xen_schedop_poll() fetches guest-provided
> sched_poll.ports without checking if the values are sane. Then, in
> wait_pending_event(), there's test_bit(ports[i], pending_bits) which
> (for some high ports[i] values) results in KASAN complaining about
> "use-after-free":
> 
> [   36.463417] ==================================================================
> [   36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm]

...

> I can't reproduce it under non-KASAN build, I'm not sure what's going on.

KASAN is rightly complaining because, as you already pointed out, the high ports[i]
value will touch memory well beyond the shinfo->evtchn_pending array.  Non-KASAN
builds don't have visible failures because the rogue access is only a read, and
the result of the test_bit() only affects whether or not KVM temporarily stalls
the vCPU.  In other words, KVM is leaking host state to the guest, but there is
no memory corruption and no functional impact on the guest.

I think this would be the way to fix this particular mess?

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 93c628d3e3a9..5d09a47db732 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -961,7 +961,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
        struct kvm *kvm = vcpu->kvm;
        struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
        unsigned long *pending_bits;
+       unsigned long nr_bits;
        unsigned long flags;
+       evtchn_port_t port;
        bool ret = true;
        int idx, i;
 
@@ -974,13 +976,19 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
        if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
                struct shared_info *shinfo = gpc->khva;
                pending_bits = (unsigned long *)&shinfo->evtchn_pending;
+               nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
        } else {
                struct compat_shared_info *shinfo = gpc->khva;
                pending_bits = (unsigned long *)&shinfo->evtchn_pending;
+               nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
        }
 
        for (i = 0; i < nr_ports; i++) {
-               if (test_bit(ports[i], pending_bits)) {
+               port = ports[i];
+               if (port >= nr_bits)
+                       continue;
+
+               if (test_bit(array_index_nospec(port, nr_bits), pending_bits)) {
                        ret = true;
                        break;
                }



* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-20 16:58                 ` Sean Christopherson
@ 2022-10-21  2:39                   ` Michal Luczaj
  0 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-21  2:39 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, pbonzini

On 10/20/22 18:58, Sean Christopherson wrote:
> On Thu, Oct 20, 2022, Michal Luczaj wrote:
>> Speaking of SCHEDOP_poll, are Xen vmcalls considered trusted?
> 
> I highly doubt they are trusted.

Does it mean a CPL3 guest can vmcall SCHEDOP_poll? If so, is the use of
kvm_mmu_gva_to_gpa_system() justified?

>> I've noticed that kvm_xen_schedop_poll() fetches guest-provided
>> sched_poll.ports without checking if the values are sane. Then, in
>> wait_pending_event(), there's test_bit(ports[i], pending_bits) which
>> (for some high ports[i] values) results in KASAN complaining about
>> "use-after-free":
>>
>> [   36.463417] ==================================================================
>> [   36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm]
> 
> ...
> 
>> I can't reproduce it under non-KASAN build, I'm not sure what's going on.
> 
> KASAN is rightly complaining because, as you already pointed out, the high ports[i]
> value will touch memory well beyond the shinfo->evtchn_pending array.  Non-KASAN
> builds don't have visible failures because the rogue access is only a read, and
> the result of the test_bit() only affects whether or not KVM temporarily stalls
> the vCPU.  In other words, KVM is leaking host state to the guest, but there is
> no memory corruption and no functional impact on the guest.

OK, so the vCPU stall-or-no-stall behavior is a side channel leaking host
memory bit by bit, right? I'm trying to understand what is actually being
leaked here. Is it userspace memory of the process that is using KVM on the
host?

> I think this would be the way to fix this particular mess?
> 
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 93c628d3e3a9..5d09a47db732 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -961,7 +961,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
>         struct kvm *kvm = vcpu->kvm;
>         struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
>         unsigned long *pending_bits;
> +       unsigned long nr_bits;
>         unsigned long flags;
> +       evtchn_port_t port;
>         bool ret = true;
>         int idx, i;
>  
> @@ -974,13 +976,19 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
>         if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
>                 struct shared_info *shinfo = gpc->khva;
>                 pending_bits = (unsigned long *)&shinfo->evtchn_pending;
> +               nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
>         } else {
>                 struct compat_shared_info *shinfo = gpc->khva;
>                 pending_bits = (unsigned long *)&shinfo->evtchn_pending;
> +               nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
>         }
>  
>         for (i = 0; i < nr_ports; i++) {
> -               if (test_bit(ports[i], pending_bits)) {
> +               port = ports[i];
> +               if (port >= nr_bits)
> +                       continue;
> +
> +               if (test_bit(array_index_nospec(port, nr_bits), pending_bits)) {
>                         ret = true;
>                         break;
>                 }

Great, that looks good and passes the test.

Thanks,
Michal


