* [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption
@ 2022-09-16  0:54 Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

There seem to be two problems with the way the arch.xen.shinfo_cache
instance of gfn_to_pfn_cache is treated.

1. gpc->lock is taken without checking if it was actually initialized,
   e.g. in kvm_xen_set_evtchn_fast():

	read_lock_irqsave(&gpc->lock, flags);
	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
		goto out_rcu;

INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 2 PID: 959 Comm: xenirq Not tainted 6.0.0-rc5 #12
Call Trace:
 dump_stack_lvl+0x5b/0x77
 register_lock_class+0x46d/0x480
 __lock_acquire+0x64/0x1fa0
 lock_acquire+0xbf/0x2b0
 ? kvm_xen_set_evtchn_fast+0xc7/0x400 [kvm]
 ? lock_acquire+0xcf/0x2b0
 ? _raw_read_lock_irqsave+0x99/0xa0
 _raw_read_lock_irqsave+0x81/0xa0
 ? kvm_xen_set_evtchn_fast+0xc7/0x400 [kvm]
 kvm_xen_set_evtchn_fast+0xc7/0x400 [kvm]
 ? kvm_xen_set_evtchn_fast+0x7e/0x400 [kvm]
 ? find_held_lock+0x2b/0x80
 kvm_xen_hvm_evtchn_send+0x4b/0x90 [kvm]
 kvm_arch_vm_ioctl+0x4de/0xca0 [kvm]
 ? vmx_vcpu_put+0x18/0x1e0 [kvm_intel]
 ? kvm_arch_vcpu_put+0x1db/0x250 [kvm]
 ? vcpu_put+0x46/0x70 [kvm]
 ? kvm_arch_vcpu_ioctl+0xd0/0x1710 [kvm]
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? lock_is_held_type+0xe3/0x140
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? __do_fast_syscall_32+0xeb/0xf0
 ? lockdep_hardirqs_on+0x7d/0x100
 ? lock_is_held_type+0xe3/0x140
 ? do_syscall_64+0x67/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 ? do_syscall_64+0x67/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

2.
kvm_gfn_to_pfn_cache_init() allows for gpc->lock reinitialization. This
can lead to a situation where a lock that is already taken gets
reinitialized in another thread (and becomes corrupted). For example, a
race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
kvm_gfn_to_pfn_cache_init():

(thread 1)                                | (thread 2)
                                          |
kvm_xen_set_evtchn_fast                   |
  read_lock_irqsave(&gpc->lock, ...)      |
                                          | kvm_gfn_to_pfn_cache_init
                                          |   rwlock_init(&gpc->lock)
  read_unlock_irqrestore(&gpc->lock, ...) |

Testing shinfo lock corruption (KVM_XEN_HVM_EVTCHN_SEND)
rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-...D } 26610 jiffies s: 265 root: 0x2/.
rcu: blocking rcu_node structures (internal RCU debug):
Task dump for CPU 1:
task:xen_shinfo_test state:R  running task
stack:    0 pid:  952 ppid:   867 flags:0x00000008
Call Trace:
 ? exc_page_fault+0x121/0x2b0
 ? rcu_read_lock_sched_held+0x10/0x80
 ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 1-...0: (5 ticks this GP) idle=6b94/1/0x4000000000000000 softirq=5929/5931 fqs=15261
 (detected by 0, t=65002 jiffies, g=5465, q=100 ncpus=4)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 952 Comm: xen_shinfo_test Not tainted 6.0.0-rc5 #12
RIP: 0010:queued_write_lock_slowpath+0x68/0x90
Call Trace:
 do_raw_write_lock+0xad/0xb0
 kvm_gfn_to_pfn_cache_refresh+0x2a5/0x630 [kvm]
 kvm_xen_hvm_set_attr+0x19d/0x5e0 [kvm]
 kvm_arch_vm_ioctl+0x8ca/0xca0 [kvm]
 ? rcu_read_lock_sched_held+0x10/0x80
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? rcu_read_lock_sched_held+0x10/0x80
 ? do_raw_write_trylock+0x29/0x40
 ? rcu_read_lock_sched_held+0x10/0x80
 ? lock_release+0x1ef/0x2d0
 ? lock_release+0x1ef/0x2d0
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? exc_page_fault+0x121/0x2b0
 ?
rcu_read_lock_sched_held+0x10/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Testing shinfo lock corruption (KVM_XEN_HVM_EVTCHN_SEND)
BUG: kernel NULL pointer dereference, address: 0000000000000800
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 1049cf067 P4D 1049cf067 PUD 104ba0067 PMD 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 955 Comm: xen_shinfo_test Not tainted 6.0.0-rc5 #12
RIP: 0010:kvm_xen_set_evtchn_fast+0x10f/0x400 [kvm]
Call Trace:
 kvm_xen_hvm_evtchn_send+0x4b/0x90 [kvm]
 kvm_arch_vm_ioctl+0x4de/0xca0 [kvm]
 ? kvm_xen_hvm_evtchn_send+0x6e/0x90 [kvm]
 ? kvm_arch_vm_ioctl+0x4de/0xca0 [kvm]
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? rcu_read_lock_sched_held+0x10/0x80
 ? lock_release+0x1ef/0x2d0
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? rcu_read_lock_sched_held+0x10/0x80
 ? trace_hardirqs_on_prepare+0x55/0xe0
 ? do_syscall_64+0x67/0x80
 ? rcu_read_lock_sched_held+0x10/0x80
 ? trace_hardirqs_on_prepare+0x55/0xe0
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Similar story with the handling of hypercall SCHEDOP_poll in
wait_pending_event():

Testing shinfo lock corruption (SCHEDOP_poll)
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 1-...0: (12 ticks this GP) idle=0a5c/1/0x4000000000000000 softirq=6640/6640 fqs=12988
rcu: 2-...0: (10 ticks this GP) idle=66e4/1/0x4000000000000000 softirq=5526/5527 fqs=12988
 (detected by 0, t=65003 jiffies, g=7437, q=732 ncpus=4)
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_halt+0xa/0x10
Sending NMI from CPU 0 to CPUs 2:
NMI backtrace for cpu 2
CPU: 2 PID: 970 Comm: xen_shinfo_test Not tainted 6.0.0-rc5 #12
RIP: 0010:queued_write_lock_slowpath+0x66/0x90
Call Trace:
 do_raw_write_lock+0xad/0xb0
 kvm_gfn_to_pfn_cache_refresh+0x8a/0x630 [kvm]
 ? kvm_gfn_to_pfn_cache_init+0x122/0x130 [kvm]
 kvm_xen_hvm_set_attr+0x19d/0x5e0 [kvm]
 kvm_arch_vm_ioctl+0x8ca/0xca0 [kvm]
 ?
__lock_acquire+0x3a4/0x1fa0
 ? __lock_acquire+0x3a4/0x1fa0
 kvm_vm_ioctl+0x4e4/0xdd0 [kvm]
 ? lock_is_held_type+0xe3/0x140
 ? lock_release+0x135/0x2d0
 __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x58/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? lockdep_hardirqs_on+0x7d/0x100
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Testing shinfo lock corruption (SCHEDOP_poll)
watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kworker/1:2:260]
irq event stamp: 6990
hardirqs last  enabled at (6989): [<ffffffff81e5c964>] _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (6990): [<ffffffff81e53e51>] __schedule+0xd41/0x1620
softirqs last  enabled at (5790): [<ffffffff81766e78>] rht_deferred_worker+0x708/0xbe0
softirqs last disabled at (5788): [<ffffffff81766967>] rht_deferred_worker+0x1f7/0xbe0
CPU: 1 PID: 260 Comm: kworker/1:2 Not tainted 6.0.0-rc5 #12
Workqueue: rcu_gp wait_rcu_exp_gp
RIP: 0010:smp_call_function_single+0x11a/0x160
Call Trace:
 ? trace_hardirqs_on+0x2b/0xd0
 __sync_rcu_exp_select_node_cpus+0x267/0x460
 sync_rcu_exp_select_cpus+0x1ec/0x3e0
 wait_rcu_exp_gp+0xf/0x20
 process_one_work+0x254/0x560
 worker_thread+0x4f/0x390
 ? _raw_spin_unlock_irqrestore+0x40/0x60
 ? process_one_work+0x560/0x560
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30

I'm providing a set of patches: check if the shinfo lock was
initialized, and disallow lock reinitialization, along with crudely made
test cases.

Note: as I understand it, the kvm->lock mutex cannot be used to protect
against those races because kvm_xen_set_evtchn_fast() is called from
kvm_arch_set_irq_inatomic()?

I'm sending this as an RFC as I have doubts whether explicitly
disallowing reinitialization this way is the most elegant solution,
especially as the problem appears to affect only the shinfo
gfn_to_pfn_cache.
Michal Luczaj (4):
  KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache
  KVM: x86/xen: Ensure kvm_xen_schedop_poll() can use shinfo_cache
  KVM: x86/xen: Disallow gpc locks reinitialization
  KVM: x86/xen: Test shinfo_cache lock races

 arch/x86/kvm/xen.c                            |   5 +-
 include/linux/kvm_types.h                     |   1 +
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 100 ++++++++++++++++++
 virt/kvm/pfncache.c                           |   7 +-
 4 files changed, 110 insertions(+), 3 deletions(-)

-- 
2.37.2

^ permalink raw reply	[flat|nested] 34+ messages in thread
* [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() " Michal Luczaj
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

Before taking gpc->lock, ensure it has been initialized.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/xen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 280cb5dc7341..e32c2cf06223 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1348,7 +1348,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
 	}
 
-	if (!vcpu->arch.xen.vcpu_info_cache.active)
+	if (!vcpu->arch.xen.vcpu_info_cache.active || !gpc->active)
 		return -EINVAL;
 
 	if (xe->port >= max_evtchn_port(kvm))
-- 
2.37.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread
* [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() can use shinfo_cache
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races Michal Luczaj
  3 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

Before taking gpc->lock, ensure it has been initialized.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/xen.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index e32c2cf06223..c5d431a54afa 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -965,6 +965,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 	bool ret = true;
 	int idx, i;
 
+	if (!gpc->active)
+		return true;
+
 	read_lock_irqsave(&gpc->lock, flags);
 	idx = srcu_read_lock(&kvm->srcu);
 	if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE))
-- 
2.37.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread
* [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
  2022-09-16  0:54 ` [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() " Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  2022-09-16 17:12   ` Sean Christopherson
  2022-09-16  0:54 ` [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races Michal Luczaj
  3 siblings, 1 reply; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

There are race conditions possible due to kvm_gfn_to_pfn_cache_init()'s
ability to _re_initialize gfn_to_pfn_cache.lock.

For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.

(thread 1)                                | (thread 2)
                                          |
kvm_xen_set_evtchn_fast                   |
  read_lock_irqsave(&gpc->lock, ...)      |
                                          | kvm_gfn_to_pfn_cache_init
                                          |   rwlock_init(&gpc->lock)
  read_unlock_irqrestore(&gpc->lock, ...) |

Introduce bool locks_initialized.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 include/linux/kvm_types.h | 1 +
 virt/kvm/pfncache.c       | 7 +++++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 3ca3db020e0e..7e7b7667cd9e 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -74,6 +74,7 @@ struct gfn_to_pfn_cache {
 	void *khva;
 	kvm_pfn_t pfn;
 	enum pfn_cache_usage usage;
+	bool locks_initialized;
 	bool active;
 	bool valid;
 };
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 68ff41d39545..564607e10586 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -354,8 +354,11 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
 
 	if (!gpc->active) {
-		rwlock_init(&gpc->lock);
-		mutex_init(&gpc->refresh_lock);
+		if (!gpc->locks_initialized) {
+			rwlock_init(&gpc->lock);
+			mutex_init(&gpc->refresh_lock);
+			gpc->locks_initialized = true;
+		}
 
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
-- 
2.37.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread
* Re: [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization
  2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
@ 2022-09-16 17:12   ` Sean Christopherson
  2022-09-18 23:13     ` Michal Luczaj
                       ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-09-16 17:12 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini, shuah

On Fri, Sep 16, 2022, Michal Luczaj wrote:
> There are race conditions possible due to kvm_gfn_to_pfn_cache_init()'s
> ability to _re_initialize gfn_to_pfn_cache.lock.
>
> For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
> kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.
>
> (thread 1)                                | (thread 2)
>                                           |
> kvm_xen_set_evtchn_fast                   |
>   read_lock_irqsave(&gpc->lock, ...)      |
>                                           | kvm_gfn_to_pfn_cache_init
>                                           |   rwlock_init(&gpc->lock)
>   read_unlock_irqrestore(&gpc->lock, ...) |
>

Please explicitly include a sample call stack for reaching
kvm_gfn_to_pfn_cache_init().  Without that, it's difficult to understand
if this is a bug in the gfn_to_pfn_cache code, or if it's a bug in the
caller.

> Introduce bool locks_initialized.
>
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> ---
>  include/linux/kvm_types.h | 1 +
>  virt/kvm/pfncache.c       | 7 +++++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index 3ca3db020e0e..7e7b7667cd9e 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -74,6 +74,7 @@ struct gfn_to_pfn_cache {
>  	void *khva;
>  	kvm_pfn_t pfn;
>  	enum pfn_cache_usage usage;
> +	bool locks_initialized;
>  	bool active;
>  	bool valid;
>  };
> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index 68ff41d39545..564607e10586 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -354,8 +354,11 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
>  	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
>  
>  	if (!gpc->active) {
> -		rwlock_init(&gpc->lock);
> -		mutex_init(&gpc->refresh_lock);
> +		if (!gpc->locks_initialized) {

Rather than add another flag, move the lock initialization to another
helper and call the new helper from e.g. kvm_xen_init_vm() and
kvm_xen_init_vcpu().  There is zero reason to initialize locks
on-demand.

That way, patches 1 and 2 aren't necessary because gpc->lock is always
valid.  And then at the same time, rename "cache_init" and
"cache_destroy" to activate+deactivate to avoid implying that the cache
really is destroyed/freed.

Adding a true init() API will also allow for future cleanups.  @kvm,
@vcpu, @len, and @usage all should be immutable in the sense that they
are properties of the cache, i.e. can be moved into init().  The nice
side effect of moving most of that stuff into init() is that it makes it
very obvious from the activate() call sites that the gpa is the only
mutable information.

I.e. as additional patches, do:

1. Formalize "gpc" as the acronym and use it in function names to reduce
   line lengths.  Maybe keep the long name for
   gfn_to_pfn_cache_invalidate_start() though?
   Since that is a very different API.

2. Snapshot @usage during kvm_gpc_init().

3. Add a KVM backpointer in "struct gfn_to_pfn_cache" and snapshot it
   during initialization.  The extra memory cost is negligible in the
   grand scheme, not having to constantly provide @kvm makes the call
   sites prettier, and it avoids weirdness where @kvm is mandatory but
   @vcpu is not.

4. Add a backpointer for @vcpu too, since again the memory overhead is
   minor, and taking @vcpu in "activate" implies that it's legal to
   share a cache between multiple vCPUs, which is not the case.  And
   opportunistically add a WARN to assert that @vcpu is non-NULL if
   KVM_GUEST_USES_PFN.

5. Snapshot @len during kvm_gpc_init().

so that we end up with something like (completely untested; the @vcpu
parameter was missing from the init() signature in the original mail
even though its body uses it, and a few "gcp" typos are fixed here):

  bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
  int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
  void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc)

  void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
		    struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
		    unsigned long len)
  {
	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
	WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu);

	rwlock_init(&gpc->lock);
	mutex_init(&gpc->refresh_lock);

	gpc->kvm = kvm;
	gpc->vcpu = vcpu;
	gpc->usage = usage;
	gpc->len = len;
  }
  EXPORT_SYMBOL_GPL(kvm_gpc_init);

  int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
  {
	if (!gpc->active) {
		gpc->khva = NULL;
		gpc->pfn = KVM_PFN_ERR_FAULT;
		gpc->uhva = KVM_HVA_ERR_BAD;
		gpc->valid = false;
		gpc->active = true;

		spin_lock(&gpc->kvm->gpc_lock);
		list_add(&gpc->list, &gpc->kvm->gpc_list);
		spin_unlock(&gpc->kvm->gpc_lock);
	}
	return kvm_gpc_refresh(gpc, gpa);
  }
  EXPORT_SYMBOL_GPL(kvm_gpc_activate);

  void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
  {
	if (gpc->active) {
		spin_lock(&gpc->kvm->gpc_lock);
		list_del(&gpc->list);
		spin_unlock(&gpc->kvm->gpc_lock);

		kvm_gpc_unmap(gpc);
		gpc->active = false;
	}
  }
  EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);

Let
me know if you want to take on the above cleanups; if not, I'll add them
to my todo list.

Thanks!

^ permalink raw reply	[flat|nested] 34+ messages in thread
* Re: [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization
  2022-09-16 17:12   ` Sean Christopherson
@ 2022-09-18 23:13     ` Michal Luczaj
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-18 23:13 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, pbonzini, shuah

On 9/16/22 19:12, Sean Christopherson wrote:
> On Fri, Sep 16, 2022, Michal Luczaj wrote:
>> For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
>> kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.
>>
>> (thread 1)                                | (thread 2)
>>                                           |
>> kvm_xen_set_evtchn_fast                   |
>>   read_lock_irqsave(&gpc->lock, ...)      |
>>                                           | kvm_gfn_to_pfn_cache_init
>>                                           |   rwlock_init(&gpc->lock)
>>   read_unlock_irqrestore(&gpc->lock, ...) |
>>
>
> Please explicitly include a sample call stack for reaching
> kvm_gfn_to_pfn_cache_init().  Without that, it's difficult to
> understand if this is a bug in the gfn_to_pfn_cache code, or if it's a
> bug in the caller.

OK, I'll try to be more specific.

> Rather than add another flag, (...)
> Let me know if you want to take on the above cleanups, if not I'll add
> them to my todo list.

Sure, I'll do it.

Thanks for all the suggestions,
Michal

^ permalink raw reply	[flat|nested] 34+ messages in thread
* [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix
  2022-09-16 17:12   ` Sean Christopherson
  2022-09-18 23:13     ` Michal Luczaj
@ 2022-09-21  2:01     ` Michal Luczaj
  2022-09-21  2:01       ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
                         ` (7 more replies)
  2022-10-05 12:30     ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  2 siblings, 8 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Here are some cleanups following Sean's suggestions, plus a single fix
for a race condition.

Michal Luczaj (8):
  KVM: x86: Add initializer for gfn_to_pfn_cache
  KVM: x86: Shorten gfn_to_pfn_cache function names
  KVM: x86: Remove unused argument in gpc_unmap_khva()
  KVM: x86: Store immutable gfn_to_pfn_cache properties
  KVM: x86: Clean up kvm_gpc_check()
  KVM: x86: Clean up hva_to_pfn_retry()
  KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap()
  KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()

 arch/x86/kvm/x86.c        | 24 ++++++------
 arch/x86/kvm/xen.c        | 78 +++++++++++++++++--------------------
 include/linux/kvm_host.h  | 64 ++++++++++++++++---------------
 include/linux/kvm_types.h |  2 +
 virt/kvm/pfncache.c       | 81 +++++++++++++++++++------------------
 5 files changed, 128 insertions(+), 121 deletions(-)

-- 
2.37.3

^ permalink raw reply	[flat|nested] 34+ messages in thread
* [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache
  2022-09-21  2:01     ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
@ 2022-09-21  2:01       ` Michal Luczaj
  2022-10-10 23:38         ` Sean Christopherson
  2022-09-21  2:01       ` [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
                         ` (6 subsequent siblings)
  7 siblings, 1 reply; 34+ messages in thread
From: Michal Luczaj @ 2022-09-21  2:01 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Move the gfn_to_pfn_cache lock initialization to another helper and call
the new helper during VM/vCPU creation.  Rename "cache_init" and
"cache_destroy" to activate+deactivate to avoid implying that the cache
really is destroyed/freed.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       | 12 +++++----
 arch/x86/kvm/xen.c       | 57 +++++++++++++++++++++-------------------
 include/linux/kvm_host.h | 23 +++++++++++-----
 virt/kvm/pfncache.c      | 21 ++++++++-------
 4 files changed, 65 insertions(+), 48 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 43a6a7efc6ec..15032b7f0589 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2301,11 +2301,11 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time,
 
 	/* we verify if the enable bit is set... */
 	if (system_time & 1) {
-		kvm_gfn_to_pfn_cache_init(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
-					  KVM_HOST_USES_PFN, system_time & ~1ULL,
-					  sizeof(struct pvclock_vcpu_time_info));
+		kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu,
+				 KVM_HOST_USES_PFN, system_time & ~1ULL,
+				 sizeof(struct pvclock_vcpu_time_info));
 	} else {
-		kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time);
+		kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
 	}
 
 	return;
@@ -3374,7 +3374,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
 
 static void kvmclock_reset(struct kvm_vcpu *vcpu)
 {
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time);
 	vcpu->arch.time = 0;
 }
 
@@ -11551,6 +11551,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.regs_avail = ~0;
 	vcpu->arch.regs_dirty = ~0;
 
+	kvm_gpc_init(&vcpu->arch.pv_time);
+
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 	else
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 280cb5dc7341..cecf8299b187 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -42,13 +42,13 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn)
 	int idx = srcu_read_lock(&kvm->srcu);
 
 	if (gfn == GPA_INVALID) {
-		kvm_gfn_to_pfn_cache_destroy(kvm, gpc);
+		kvm_gpc_deactivate(kvm, gpc);
 		goto out;
 	}
 
 	do {
-		ret = kvm_gfn_to_pfn_cache_init(kvm, gpc, NULL, KVM_HOST_USES_PFN,
-						gpa, PAGE_SIZE);
+		ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa,
+				       PAGE_SIZE);
 		if (ret)
 			goto out;
 
@@ -554,15 +554,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			     offsetof(struct compat_vcpu_info, time));
 
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+			kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.vcpu_info_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct vcpu_info));
+		r = kvm_gpc_activate(vcpu->kvm,
+				     &vcpu->arch.xen.vcpu_info_cache, NULL,
+				     KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct vcpu_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
@@ -570,16 +570,16 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO:
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-						     &vcpu->arch.xen.vcpu_time_info_cache);
+			kvm_gpc_deactivate(vcpu->kvm,
+					   &vcpu->arch.xen.vcpu_time_info_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.vcpu_time_info_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct pvclock_vcpu_time_info));
+		r = kvm_gpc_activate(vcpu->kvm,
+				     &vcpu->arch.xen.vcpu_time_info_cache,
+				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct pvclock_vcpu_time_info));
 		if (!r)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 		break;
@@ -590,16 +590,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 			break;
 		}
 		if (data->u.gpa == GPA_INVALID) {
-			kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-						     &vcpu->arch.xen.runstate_cache);
+			kvm_gpc_deactivate(vcpu->kvm,
+					   &vcpu->arch.xen.runstate_cache);
 			r = 0;
 			break;
 		}
 
-		r = kvm_gfn_to_pfn_cache_init(vcpu->kvm,
-					      &vcpu->arch.xen.runstate_cache,
-					      NULL, KVM_HOST_USES_PFN, data->u.gpa,
-					      sizeof(struct vcpu_runstate_info));
+		r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache,
+				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
+				     sizeof(struct vcpu_runstate_info));
 		break;
 
 	case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT:
@@ -1817,7 +1816,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.xen.vcpu_id = vcpu->vcpu_idx;
 	vcpu->arch.xen.poll_evtchn = 0;
+
 	timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
+
+	kvm_gpc_init(&vcpu->arch.xen.runstate_cache);
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache);
 }
 
 void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
@@ -1825,18 +1829,17 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 	if (kvm_xen_timer_enabled(vcpu))
 		kvm_xen_stop_timer(vcpu);
 
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.runstate_cache);
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_info_cache);
-	kvm_gfn_to_pfn_cache_destroy(vcpu->kvm,
-				     &vcpu->arch.xen.vcpu_time_info_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache);
+	kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache);
+
 	del_timer_sync(&vcpu->arch.xen.poll_timer);
 }
 
 void kvm_xen_init_vm(struct kvm *kvm)
 {
 	idr_init(&kvm->arch.xen.evtchn_ports);
+	kvm_gpc_init(&kvm->arch.xen.shinfo_cache);
 }
 
 void kvm_xen_destroy_vm(struct kvm *kvm)
@@ -1844,7 +1847,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm)
 	struct evtchnfd *evtchnfd;
 	int i;
 
-	kvm_gfn_to_pfn_cache_destroy(kvm, &kvm->arch.xen.shinfo_cache);
+	kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache);
 
 	idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) {
 		if (!evtchnfd->deliver.port.port)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4519d3689e1..9fd67026d91a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1241,8 +1241,17 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
 void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 /**
- * kvm_gfn_to_pfn_cache_init - prepare a cached kernel mapping and HPA for a
- *                             given guest physical address.
+ * kvm_gpc_init - initialize gfn_to_pfn_cache.
+ *
+ * @gpc:	   struct gfn_to_pfn_cache object.
+ *
+ * This sets up a gfn_to_pfn_cache by initializing locks.
+ */
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc);
+
+/**
+ * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
+ *                    physical address.
 *
 * @kvm:	   pointer to kvm instance.
 * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1266,9 +1275,9 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
 * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before
 * accessing the target page.
 */
-int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-			      struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-			      gpa_t gpa, unsigned long len);
+int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		     gpa_t gpa, unsigned long len);
 
 /**
 * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache.
@@ -1325,7 +1334,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 /**
- * kvm_gfn_to_pfn_cache_destroy - destroy and unlink a gfn_to_pfn_cache.
+ * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
 *
 * @kvm:	   pointer to kvm instance.
 * @gpc:	   struct gfn_to_pfn_cache object.
@@ -1333,7 +1342,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 * This removes a cache from the @kvm's list to be processed on MMU notifier
 * invocation.
 */
-void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
 
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 68ff41d39545..08f97cf97264 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -346,17 +346,20 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 }
 EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
 
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc)
+{
+	rwlock_init(&gpc->lock);
+	mutex_init(&gpc->refresh_lock);
+}
+EXPORT_SYMBOL_GPL(kvm_gpc_init);
 
-int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
-			      struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
-			      gpa_t gpa, unsigned long len)
+int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+		     struct kvm_vcpu *vcpu, enum pfn_cache_usage usage,
+		     gpa_t gpa, unsigned long len)
 {
 	WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
 
 	if (!gpc->active) {
-		rwlock_init(&gpc->lock);
-		mutex_init(&gpc->refresh_lock);
-
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
@@ -371,9 +374,9 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	}
 	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_init);
+EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
-void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
 		spin_lock(&kvm->gpc_lock);
@@ -384,4 +387,4 @@ void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		gpc->active = false;
 	}
 }
-EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_destroy);
+EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);
-- 
2.37.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread
* Re: [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache
  2022-09-21  2:01       ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
@ 2022-10-10 23:38         ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2022-10-10 23:38 UTC (permalink / raw)
  To: Michal Luczaj; +Cc: kvm, pbonzini

On Wed, Sep 21, 2022, Michal Luczaj wrote:
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f4519d3689e1..9fd67026d91a 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1241,8 +1241,17 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data,
>  void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
>  
>  /**
> - * kvm_gfn_to_pfn_cache_init - prepare a cached kernel mapping and HPA for a
> - *                             given guest physical address.
> + * kvm_gpc_init - initialize gfn_to_pfn_cache.
> + *
> + * @gpc:	   struct gfn_to_pfn_cache object.
> + *
> + * This sets up a gfn_to_pfn_cache by initializing locks.

I think it makes sense to add a blurb calling out that the cache needs
to be zeroed before init, i.e. needs to be zero-allocated.  I'll add it
in v2.

^ permalink raw reply	[flat|nested] 34+ messages in thread
* [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj 2022-09-21 2:01 ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-09-21 2:01 ` [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj ` (5 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Formalize "gpc" as the acronym and use it in function names. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 8 ++++---- arch/x86/kvm/xen.c | 29 ++++++++++++++--------------- include/linux/kvm_host.h | 21 ++++++++++----------- virt/kvm/pfncache.c | 20 ++++++++++---------- 4 files changed, 38 insertions(+), 40 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 15032b7f0589..45136ce7185e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3020,12 +3020,12 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v, unsigned long flags; read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - offset + sizeof(*guest_hv_clock))) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + offset + sizeof(*guest_hv_clock))) { read_unlock_irqrestore(&gpc->lock, flags); - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, - offset + sizeof(*guest_hv_clock))) + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + offset + sizeof(*guest_hv_clock))) return; read_lock_irqsave(&gpc->lock, flags); diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index cecf8299b187..361f77dc7a3d 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -218,15 +218,14 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) user_len = sizeof(struct compat_vcpu_runstate_info); 
read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - user_len)) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) { read_unlock_irqrestore(&gpc->lock, flags); /* When invoked from kvm_sched_out() we cannot sleep */ if (state == RUNSTATE_runnable) return; - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, user_len)) + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len)) return; read_lock_irqsave(&gpc->lock, flags); @@ -352,12 +351,12 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v) * little more honest about it. */ read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) return; read_lock_irqsave(&gpc->lock, flags); @@ -417,8 +416,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending)); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); /* @@ -432,8 +431,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) if (in_atomic() || !task_is_running(current)) return 1; - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) { /* * If this failed, userspace has screwed up the * vcpu_info mapping. No interrupts for you. 
@@ -966,7 +965,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports, read_lock_irqsave(&gpc->lock, flags); idx = srcu_read_lock(&kvm->srcu); - if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; ret = false; @@ -1358,7 +1357,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) idx = srcu_read_lock(&kvm->srcu); read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) { @@ -1392,7 +1391,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) gpc = &vcpu->arch.xen.vcpu_info_cache; read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) { + if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) { /* * Could not access the vcpu_info. Set the bit in-kernel * and prod the vCPU to deliver it for itself. @@ -1490,7 +1489,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm) break; idx = srcu_read_lock(&kvm->srcu); - rc = kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE); + rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE); srcu_read_unlock(&kvm->srcu, idx); } while(!rc); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 9fd67026d91a..f687e56c24bc 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1271,16 +1271,15 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); * -EFAULT for an untranslatable guest physical address. * * This primes a gfn_to_pfn_cache and links it into the @kvm's list for - * invalidations to be processed. Callers are required to use - * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before - * accessing the target page. + * invalidations to be processed. 
Callers are required to use kvm_gpc_check() + * to ensure that the cache is valid before accessing the target page. */ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, gpa_t gpa, unsigned long len); /** - * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache. + * kvm_gpc_check - check validity of a gfn_to_pfn_cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. @@ -1297,11 +1296,11 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, * Callers in IN_GUEST_MODE may do so without locking, although they should * still hold a read lock on kvm->scru for the memslot checks. */ -bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len); +bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len); /** - * kvm_gfn_to_pfn_cache_refresh - update a previously initialized cache. + * kvm_gpc_refresh - update a previously initialized cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. @@ -1318,11 +1317,11 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, * still lock and check the cache status, as this function does not return * with the lock still held to permit access. */ -int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len); +int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len); /** - * kvm_gfn_to_pfn_cache_unmap - temporarily unmap a gfn_to_pfn_cache. + * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. 
@@ -1331,7 +1330,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, * but at least the mapping from GPA to userspace HVA will remain cached * and can be reused on a subsequent refresh. */ -void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); +void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); /** * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache. diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 08f97cf97264..cc65fab0dbef 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -76,8 +76,8 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start, } } -bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len) +bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len) { struct kvm_memslots *slots = kvm_memslots(kvm); @@ -93,7 +93,7 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, return true; } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_check); +EXPORT_SYMBOL_GPL(kvm_gpc_check); static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva) { @@ -235,8 +235,8 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) return -EFAULT; } -int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len) +int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len) { struct kvm_memslots *slots = kvm_memslots(kvm); unsigned long page_offset = gpa & ~PAGE_MASK; @@ -317,9 +317,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, return ret; } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_refresh); +EXPORT_SYMBOL_GPL(kvm_gpc_refresh); -void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) { void 
*old_khva; kvm_pfn_t old_pfn; @@ -344,7 +344,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) gpc_unmap_khva(kvm, old_pfn, old_khva); } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap); +EXPORT_SYMBOL_GPL(kvm_gpc_unmap); void kvm_gpc_init(struct gfn_to_pfn_cache *gpc) { @@ -372,7 +372,7 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, list_add(&gpc->list, &kvm->gpc_list); spin_unlock(&kvm->gpc_lock); } - return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len); + return kvm_gpc_refresh(kvm, gpc, gpa, len); } EXPORT_SYMBOL_GPL(kvm_gpc_activate); @@ -383,7 +383,7 @@ void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) list_del(&gpc->list); spin_unlock(&kvm->gpc_lock); - kvm_gfn_to_pfn_cache_unmap(kvm, gpc); + kvm_gpc_unmap(kvm, gpc); gpc->active = false; } } -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj 2022-09-21 2:01 ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj 2022-09-21 2:01 ` [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-09-21 2:01 ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj ` (4 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj First argument is never used, remove it. Signed-off-by: Michal Luczaj <mhal@rbox.co> --- virt/kvm/pfncache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index cc65fab0dbef..76f1b669cf28 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -95,7 +95,7 @@ bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, } EXPORT_SYMBOL_GPL(kvm_gpc_check); -static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva) +static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva) { /* Unmap the old pfn/page if it was mapped before. */ if (!is_error_noslot_pfn(pfn) && khva) { @@ -174,7 +174,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) * the existing mapping and didn't create a new one. 
*/ if (new_khva != old_khva) - gpc_unmap_khva(kvm, new_pfn, new_khva); + gpc_unmap_khva(new_pfn, new_khva); kvm_release_pfn_clean(new_pfn); @@ -313,7 +313,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, mutex_unlock(&gpc->refresh_lock); if (old_pfn != new_pfn) - gpc_unmap_khva(kvm, old_pfn, old_khva); + gpc_unmap_khva(old_pfn, old_khva); return ret; } @@ -342,7 +342,7 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) write_unlock_irq(&gpc->lock); mutex_unlock(&gpc->refresh_lock); - gpc_unmap_khva(kvm, old_pfn, old_khva); + gpc_unmap_khva(old_pfn, old_khva); } EXPORT_SYMBOL_GPL(kvm_gpc_unmap); -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (2 preceding siblings ...) 2022-09-21 2:01 ` [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-10-10 23:42 ` Sean Christopherson 2022-10-11 0:37 ` Sean Christopherson 2022-09-21 2:01 ` [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj ` (3 subsequent siblings) 7 siblings, 2 replies; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Move the assignment of immutable properties @kvm, @vcpu, @usage, @len to the initializer. Make _init(), _activate() and _deactivate() use stored values. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 16 ++++++------- arch/x86/kvm/xen.c | 50 ++++++++++++++++++--------------------- include/linux/kvm_host.h | 40 +++++++++++++++---------------- include/linux/kvm_types.h | 2 ++ virt/kvm/pfncache.c | 36 +++++++++++++++------------- 5 files changed, 72 insertions(+), 72 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 45136ce7185e..ed8e4f8c9cf0 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2300,13 +2300,10 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time, kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu); /* we verify if the enable bit is set... 
*/ - if (system_time & 1) { - kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu, - KVM_HOST_USES_PFN, system_time & ~1ULL, - sizeof(struct pvclock_vcpu_time_info)); - } else { - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time); - } + if (system_time & 1) + kvm_gpc_activate(&vcpu->arch.pv_time, system_time & ~1ULL); + else + kvm_gpc_deactivate(&vcpu->arch.pv_time); return; } @@ -3374,7 +3371,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data) static void kvmclock_reset(struct kvm_vcpu *vcpu) { - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time); + kvm_gpc_deactivate(&vcpu->arch.pv_time); vcpu->arch.time = 0; } @@ -11551,7 +11548,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.regs_avail = ~0; vcpu->arch.regs_dirty = ~0; - kvm_gpc_init(&vcpu->arch.pv_time); + kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN, + sizeof(struct pvclock_vcpu_time_info)); if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu)) vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 361f77dc7a3d..9b4b0e6e66e5 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -42,13 +42,12 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn) int idx = srcu_read_lock(&kvm->srcu); if (gfn == GPA_INVALID) { - kvm_gpc_deactivate(kvm, gpc); + kvm_gpc_deactivate(gpc); goto out; } do { - ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa, - PAGE_SIZE); + ret = kvm_gpc_activate(gpc, gpa); if (ret) goto out; @@ -553,15 +552,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) offsetof(struct compat_vcpu_info, time)); if (data->u.gpa == GPA_INVALID) { - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache); r = 0; break; } - r = kvm_gpc_activate(vcpu->kvm, - &vcpu->arch.xen.vcpu_info_cache, NULL, - KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct vcpu_info)); + r = 
kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache, + data->u.gpa); if (!r) kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); @@ -569,16 +566,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO: if (data->u.gpa == GPA_INVALID) { - kvm_gpc_deactivate(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache); r = 0; break; } - r = kvm_gpc_activate(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct pvclock_vcpu_time_info)); + r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_time_info_cache, + data->u.gpa); if (!r) kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); break; @@ -589,15 +583,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) break; } if (data->u.gpa == GPA_INVALID) { - kvm_gpc_deactivate(vcpu->kvm, - &vcpu->arch.xen.runstate_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache); r = 0; break; } - r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct vcpu_runstate_info)); + r = kvm_gpc_activate(&vcpu->arch.xen.runstate_cache, + data->u.gpa); break; case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT: @@ -1818,9 +1810,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu) timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0); - kvm_gpc_init(&vcpu->arch.xen.runstate_cache); - kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache); - kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL, + KVM_HOST_USES_PFN, sizeof(struct vcpu_runstate_info)); + kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL, + KVM_HOST_USES_PFN, sizeof(struct vcpu_info)); + kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache, vcpu->kvm, NULL, + KVM_HOST_USES_PFN, sizeof(struct pvclock_vcpu_time_info)); } void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) @@ 
-1828,9 +1823,9 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) if (kvm_xen_timer_enabled(vcpu)) kvm_xen_stop_timer(vcpu); - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache); - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache); del_timer_sync(&vcpu->arch.xen.poll_timer); } @@ -1838,7 +1833,8 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) void kvm_xen_init_vm(struct kvm *kvm) { idr_init(&kvm->arch.xen.evtchn_ports); - kvm_gpc_init(&kvm->arch.xen.shinfo_cache); + kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL, KVM_HOST_USES_PFN, + PAGE_SIZE); } void kvm_xen_destroy_vm(struct kvm *kvm) @@ -1846,7 +1842,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm) struct evtchnfd *evtchnfd; int i; - kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache); + kvm_gpc_deactivate(&kvm->arch.xen.shinfo_cache); idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) { if (!evtchnfd->deliver.port.port) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f687e56c24bc..024b8df5302c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1244,17 +1244,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn); * kvm_gpc_init - initialize gfn_to_pfn_cache. * * @gpc: struct gfn_to_pfn_cache object. - * - * This sets up a gfn_to_pfn_cache by initializing locks. - */ -void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); - -/** - * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest - * physical address. - * * @kvm: pointer to kvm instance. - * @gpc: struct gfn_to_pfn_cache object. * @vcpu: vCPU to be used for marking pages dirty and to be woken on * invalidation. 
* @usage: indicates if the resulting host physical PFN is used while @@ -1263,20 +1253,31 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); * changes!---will also force @vcpu to exit the guest and * refresh the cache); and/or if the PFN used directly * by KVM (and thus needs a kernel virtual mapping). - * @gpa: guest physical address to map. * @len: sanity check; the range being access must fit a single page. * + * This sets up a gfn_to_pfn_cache by initializing locks and assigning the + * immutable attributes. + */ +void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm, + struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, + unsigned long len); + +/** + * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest + * physical address. + * + * @gpc: struct gfn_to_pfn_cache object. + * @gpa: guest physical address to map. + * * @return: 0 for success. * -EINVAL for a mapping which would cross a page boundary. - * -EFAULT for an untranslatable guest physical address. + * -EFAULT for an untranslatable guest physical address. * - * This primes a gfn_to_pfn_cache and links it into the @kvm's list for + * This primes a gfn_to_pfn_cache and links it into the @gpc->kvm's list for * invalidations to be processed. Callers are required to use kvm_gpc_check() * to ensure that the cache is valid before accessing the target page. */ -int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, - gpa_t gpa, unsigned long len); +int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa); /** * kvm_gpc_check - check validity of a gfn_to_pfn_cache. @@ -1335,13 +1336,12 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); /** * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache. * - * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. * - * This removes a cache from the @kvm's list to be processed on MMU notifier - * invocation. 
+ * This removes a cache from the @gpc->kvm's list to be processed on MMU + * notifier invocation. */ -void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); +void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc); void kvm_sigset_activate(struct kvm_vcpu *vcpu); void kvm_sigset_deactivate(struct kvm_vcpu *vcpu); diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 3ca3db020e0e..d66b276d29e0 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -67,12 +67,14 @@ struct gfn_to_pfn_cache { gpa_t gpa; unsigned long uhva; struct kvm_memory_slot *memslot; + struct kvm *kvm; struct kvm_vcpu *vcpu; struct list_head list; rwlock_t lock; struct mutex refresh_lock; void *khva; kvm_pfn_t pfn; + unsigned long len; enum pfn_cache_usage usage; bool active; bool valid; diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 76f1b669cf28..56ca0e9c6ed7 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -346,44 +346,48 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) } EXPORT_SYMBOL_GPL(kvm_gpc_unmap); -void kvm_gpc_init(struct gfn_to_pfn_cache *gpc) +void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm, + struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, + unsigned long len) { + WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage); + WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu); + rwlock_init(&gpc->lock); mutex_init(&gpc->refresh_lock); + + gpc->kvm = kvm; + gpc->vcpu = vcpu; + gpc->usage = usage; + gpc->len = len; } EXPORT_SYMBOL_GPL(kvm_gpc_init); -int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, - gpa_t gpa, unsigned long len) +int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa) { - WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage); - if (!gpc->active) { gpc->khva = NULL; gpc->pfn = KVM_PFN_ERR_FAULT; gpc->uhva = KVM_HVA_ERR_BAD; - gpc->vcpu = vcpu; - 
gpc->usage = usage; gpc->valid = false; gpc->active = true; - spin_lock(&kvm->gpc_lock); - list_add(&gpc->list, &kvm->gpc_list); - spin_unlock(&kvm->gpc_lock); + spin_lock(&gpc->kvm->gpc_lock); + list_add(&gpc->list, &gpc->kvm->gpc_list); + spin_unlock(&gpc->kvm->gpc_lock); } - return kvm_gpc_refresh(kvm, gpc, gpa, len); + return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len); } EXPORT_SYMBOL_GPL(kvm_gpc_activate); -void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc) { if (gpc->active) { - spin_lock(&kvm->gpc_lock); + spin_lock(&gpc->kvm->gpc_lock); list_del(&gpc->list); - spin_unlock(&kvm->gpc_lock); + spin_unlock(&gpc->kvm->gpc_lock); - kvm_gpc_unmap(kvm, gpc); + kvm_gpc_unmap(gpc->kvm, gpc); gpc->active = false; } } -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties 2022-09-21 2:01 ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj @ 2022-10-10 23:42 ` Sean Christopherson 2022-10-11 0:37 ` Sean Christopherson 1 sibling, 0 replies; 34+ messages in thread From: Sean Christopherson @ 2022-10-10 23:42 UTC (permalink / raw) To: Michal Luczaj; +Cc: kvm, pbonzini On Wed, Sep 21, 2022, Michal Luczaj wrote: > +int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa) > { > - WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage); > - I think it's worth grabbing "kvm" in a local variable. It minimizes the diff, and we've had cleanup in other areas of KVM to replace repeated pointer chasing with a local variable. I'll do this in v2 as well. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties 2022-09-21 2:01 ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj 2022-10-10 23:42 ` Sean Christopherson @ 2022-10-11 0:37 ` Sean Christopherson 1 sibling, 0 replies; 34+ messages in thread From: Sean Christopherson @ 2022-10-11 0:37 UTC (permalink / raw) To: Michal Luczaj; +Cc: kvm, pbonzini On Wed, Sep 21, 2022, Michal Luczaj wrote: > Move the assignment of immutable properties @kvm, @vcpu, @usage, @len > to the initializer. Make _init(), _activate() and _deactivate() use > stored values. > > Suggested-by: Sean Christopherson <seanjc@google.com> > Signed-off-by: Michal Luczaj <mhal@rbox.co> > @@ -1818,9 +1810,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu) > > timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0); > > - kvm_gpc_init(&vcpu->arch.xen.runstate_cache); > - kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache); > - kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache); > + kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL, > + KVM_HOST_USES_PFN, sizeof(struct vcpu_runstate_info)); > + kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL, > + KVM_HOST_USES_PFN, sizeof(struct vcpu_info)); Argh, KVM Xen doesn't actually treat these two as immutable values. I suspect you encountered this as well since check() and refresh() didn't drop @len. I'm 99% certain we can still make the length immutable, it'll just require a bit of massaging, i.e. extra patches. The vcpu_info_cache length is effectively immutable, the use of the common helper kvm_setup_guest_pvclock() just makes it less obvious. This can be addressed by making the param a "max_len" or "max_size" or whatever, e.g. so that accessing a subset still verifies the cache as a whole. 
The runstate_cache is messier since it actually consumes two different sizes, but that's arguably a bug that was introduced by commit a795cd43c5b5 ("KVM: x86/xen: Use gfn_to_pfn_cache for runstate area"). Prior to that, KVM always used the larger non-compat size. And KVM still uses the larger size during activation, i.e. KVM will "fail" activation but allow a sneaky 32-bit guest to access a rejected struct sitting at the very end of the page. I'm pretty sure that hole can be fixed without breaking KVM's ABI. With those addressed, the max length becomes immutable and @len can be dropped. I'll fiddle with this tomorrow and hopefully include patches for that in v2. diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index b2be60c6efa4..9e79ef2cca99 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -212,10 +212,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) if (!vx->runstate_cache.active) return; - if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode) - user_len = sizeof(struct vcpu_runstate_info); - else - user_len = sizeof(struct compat_vcpu_runstate_info); + user_len = sizeof(struct vcpu_runstate_info); read_lock_irqsave(&gpc->lock, flags); while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check() 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (3 preceding siblings ...) 2022-09-21 2:01 ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-09-21 2:01 ` [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj ` (2 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Make kvm_gpc_check() use kvm instance cached in gfn_to_pfn_cache. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 2 +- arch/x86/kvm/xen.c | 14 ++++++-------- include/linux/kvm_host.h | 4 +--- virt/kvm/pfncache.c | 5 ++--- 4 files changed, 10 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ed8e4f8c9cf0..273f1ed3b5ae 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3017,7 +3017,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v, unsigned long flags; read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + while (!kvm_gpc_check(gpc, gpc->gpa, offset + sizeof(*guest_hv_clock))) { read_unlock_irqrestore(&gpc->lock, flags); diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 9b4b0e6e66e5..84b95258773a 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -217,7 +217,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) user_len = sizeof(struct compat_vcpu_runstate_info); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) { + while (!kvm_gpc_check(gpc, gpc->gpa, user_len)) { read_unlock_irqrestore(&gpc->lock, flags); /* When invoked from kvm_sched_out() we cannot sleep */ @@ -350,8 +350,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v) * little more honest about it. 
*/ read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, @@ -415,8 +414,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending)); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); /* @@ -957,7 +955,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports, read_lock_irqsave(&gpc->lock, flags); idx = srcu_read_lock(&kvm->srcu); - if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; ret = false; @@ -1349,7 +1347,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) idx = srcu_read_lock(&kvm->srcu); read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) { @@ -1383,7 +1381,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) gpc = &vcpu->arch.xen.vcpu_info_cache; read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) { + if (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) { /* * Could not access the vcpu_info. Set the bit in-kernel * and prod the vCPU to deliver it for itself. diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 024b8df5302c..0b66d4889ec3 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1282,7 +1282,6 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa); /** * kvm_gpc_check - check validity of a gfn_to_pfn_cache. 
* - * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. * @gpa: current guest physical address to map. * @len: sanity check; the range being access must fit a single page. @@ -1297,8 +1296,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa); * Callers in IN_GUEST_MODE may do so without locking, although they should * still hold a read lock on kvm->scru for the memslot checks. */ -bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, - unsigned long len); +bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len); /** * kvm_gpc_refresh - update a previously initialized cache. diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 56ca0e9c6ed7..eb91025d7242 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -76,10 +76,9 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start, } } -bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, - unsigned long len) +bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len) { - struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memslots *slots = kvm_memslots(gpc->kvm); if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE) return false; -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry() 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (4 preceding siblings ...) 2022-09-21 2:01 ` [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-09-21 2:01 ` [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj 2022-09-21 2:01 ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Make hva_to_pfn_retry() use kvm instance cached in gfn_to_pfn_cache. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- virt/kvm/pfncache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index eb91025d7242..a2c95e393e34 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -135,7 +135,7 @@ static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_s return kvm->mmu_invalidate_seq != mmu_seq; } -static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc) { /* Note, the new page offset may be different than the old! */ void *old_khva = gpc->khva - offset_in_page(gpc->khva); @@ -155,7 +155,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) gpc->valid = false; do { - mmu_seq = kvm->mmu_invalidate_seq; + mmu_seq = gpc->kvm->mmu_invalidate_seq; smp_rmb(); write_unlock_irq(&gpc->lock); @@ -213,7 +213,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) * attempting to refresh. 
*/ WARN_ON_ONCE(gpc->valid); - } while (mmu_notifier_retry_cache(kvm, mmu_seq)); + } while (mmu_notifier_retry_cache(gpc->kvm, mmu_seq)); gpc->valid = true; gpc->pfn = new_pfn; @@ -285,7 +285,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, * drop the lock and do the HVA to PFN lookup again. */ if (!gpc->valid || old_uhva != gpc->uhva) { - ret = hva_to_pfn_retry(kvm, gpc); + ret = hva_to_pfn_retry(gpc); } else { /* If the HVA→PFN mapping was already valid, don't unmap it. */ old_pfn = KVM_PFN_ERR_FAULT; -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (5 preceding siblings ...) 2022-09-21 2:01 ` [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-09-21 2:01 ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Make kvm_gpc_refresh() use kvm instance cached in gfn_to_pfn_cache. First argument of kvm_gpc_unmap() becomes unneeded; remove it from function definition. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 2 +- arch/x86/kvm/xen.c | 10 ++++------ include/linux/kvm_host.h | 10 ++++------ virt/kvm/pfncache.c | 11 +++++------ 4 files changed, 14 insertions(+), 19 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 273f1ed3b5ae..d24d1731d2bc 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3021,7 +3021,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v, offset + sizeof(*guest_hv_clock))) { read_unlock_irqrestore(&gpc->lock, flags); - if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + if (kvm_gpc_refresh(gpc, gpc->gpa, offset + sizeof(*guest_hv_clock))) return; diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 84b95258773a..bcaaa83290fb 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -224,7 +224,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) if (state == RUNSTATE_runnable) return; - if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len)) + if (kvm_gpc_refresh(gpc, gpc->gpa, user_len)) return; read_lock_irqsave(&gpc->lock, flags); @@ -353,8 +353,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v) while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) 
{ read_unlock_irqrestore(&gpc->lock, flags); - if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) + if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info))) return; read_lock_irqsave(&gpc->lock, flags); @@ -428,8 +427,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) if (in_atomic() || !task_is_running(current)) return 1; - if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info))) { /* * If this failed, userspace has screwed up the * vcpu_info mapping. No interrupts for you. @@ -1479,7 +1477,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm) break; idx = srcu_read_lock(&kvm->srcu); - rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE); + rc = kvm_gpc_refresh(gpc, gpc->gpa, PAGE_SIZE); srcu_read_unlock(&kvm->srcu, idx); } while(!rc); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 0b66d4889ec3..efead11bec84 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1301,35 +1301,33 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len); /** * kvm_gpc_refresh - update a previously initialized cache. * - * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. * @gpa: updated guest physical address to map. * @len: sanity check; the range being access must fit a single page. * * @return: 0 for success. * -EINVAL for a mapping which would cross a page boundary. - * -EFAULT for an untranslatable guest physical address. + * -EFAULT for an untranslatable guest physical address. * * This will attempt to refresh a gfn_to_pfn_cache. Note that a successful - * returm from this function does not mean the page can be immediately + * return from this function does not mean the page can be immediately * accessed because it may have raced with an invalidation. 
Callers must * still lock and check the cache status, as this function does not return * with the lock still held to permit access. */ -int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, +int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len); /** * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache. * - * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. * * This unmaps the referenced page. The cache is left in the invalid state * but at least the mapping from GPA to userspace HVA will remain cached * and can be reused on a subsequent refresh. */ -void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); +void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc); /** * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache. diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index a2c95e393e34..45b9b96c0ea3 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -234,10 +234,9 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc) return -EFAULT; } -int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, - unsigned long len) +int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len) { - struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memslots *slots = kvm_memslots(gpc->kvm); unsigned long page_offset = gpa & ~PAGE_MASK; kvm_pfn_t old_pfn, new_pfn; unsigned long old_uhva; @@ -318,7 +317,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, } EXPORT_SYMBOL_GPL(kvm_gpc_refresh); -void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc) { void *old_khva; kvm_pfn_t old_pfn; @@ -375,7 +374,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa) list_add(&gpc->list, &gpc->kvm->gpc_list); spin_unlock(&gpc->kvm->gpc_lock); } - return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len); + return kvm_gpc_refresh(gpc, gpa, 
gpc->len); } EXPORT_SYMBOL_GPL(kvm_gpc_activate); @@ -386,7 +385,7 @@ void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc) list_del(&gpc->list); spin_unlock(&gpc->kvm->gpc_lock); - kvm_gpc_unmap(gpc->kvm, gpc); + kvm_gpc_unmap(gpc); gpc->active = false; } } -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (6 preceding siblings ...) 2022-09-21 2:01 ` [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj @ 2022-09-21 2:01 ` Michal Luczaj 2022-10-10 23:28 ` Sean Christopherson 7 siblings, 1 reply; 34+ messages in thread From: Michal Luczaj @ 2022-09-21 2:01 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate() resulting in a near-NULL pointer write. 1. Deactivate shinfo cache: kvm_xen_hvm_set_attr case KVM_XEN_ATTR_TYPE_SHARED_INFO kvm_gpc_deactivate kvm_gpc_unmap gpc->valid = false gpc->khva = NULL gpc->active = false Result: active = false, valid = false 2. Cause cache refresh: kvm_arch_vm_ioctl case KVM_XEN_HVM_EVTCHN_SEND kvm_xen_hvm_evtchn_send kvm_xen_set_evtchn kvm_xen_set_evtchn_fast kvm_gpc_check return -EWOULDBLOCK because !gpc->valid kvm_xen_set_evtchn_fast return -EWOULDBLOCK kvm_gpc_refresh hva_to_pfn_retry gpc->valid = true gpc->khva = not NULL Result: active = false, valid = true 3. Race ioctl KVM_XEN_HVM_EVTCHN_SEND against ioctl KVM_XEN_ATTR_TYPE_SHARED_INFO: kvm_arch_vm_ioctl case KVM_XEN_HVM_EVTCHN_SEND kvm_xen_hvm_evtchn_send kvm_xen_set_evtchn kvm_xen_set_evtchn_fast read_lock gpc->lock kvm_xen_hvm_set_attr case KVM_XEN_ATTR_TYPE_SHARED_INFO mutex_lock kvm->lock kvm_xen_shared_info_init kvm_gpc_activate gpc->khva = NULL kvm_gpc_check [ Check passes because gpc->valid is still true, even though gpc->khva is already NULL. ] shinfo = gpc->khva pending_bits = shinfo->evtchn_pending CRASH: test_and_set_bit(..., pending_bits) Protect the writes to the cache's properties in kvm_gpc_activate() by taking gpc->lock for writing. 
Signed-off-by: Michal Luczaj <mhal@rbox.co> --- Attaching more details: [ 86.127703] BUG: kernel NULL pointer dereference, address: 0000000000000800 [ 86.127751] #PF: supervisor write access in kernel mode [ 86.127778] #PF: error_code(0x0002) - not-present page [ 86.127801] PGD 105792067 P4D 105792067 PUD 105793067 PMD 0 [ 86.127826] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 86.127850] CPU: 0 PID: 945 Comm: xen_shinfo_test Not tainted 6.0.0-rc5-test+ #31 [ 86.127874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014 [ 86.127898] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm [ 86.127960] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06 All code ======== 0: 5a pop %rdx 1: 84 c0 test %al,%al 3: 0f 84 01 01 00 00 je 0x10a 9: 80 bb b4 9f 00 00 00 cmpb $0x0,0x9fb4(%rbx) 10: 8b 45 00 mov 0x0(%rbp),%eax 13: 48 8b 93 c8 a0 00 00 mov 0xa0c8(%rbx),%rdx 1a: 75 4d jne 0x69 1c: 41 89 c0 mov %eax,%r8d 1f: 48 8d b2 80 08 00 00 lea 0x880(%rdx),%rsi 26: 41 c1 e8 05 shr $0x5,%r8d 2a:* f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) <-- trapping instruction 31: 00 00 33: 0f 82 19 01 00 00 jb 0x152 39: 8b 45 00 mov 0x0(%rbp),%eax 3c: 48 0f a3 06 bt %rax,(%rsi) Code starting with the faulting instruction =========================================== 0: f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) 7: 00 00 9: 0f 82 19 01 00 00 jb 0x128 f: 8b 45 00 mov 0x0(%rbp),%eax 12: 48 0f a3 06 bt %rax,(%rsi) [ 86.127982] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046 [ 86.128001] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001 [ 86.128021] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66 [ 86.128040] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97 [ 86.128060] 
R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001 [ 86.128079] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830 [ 86.128098] FS: 00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 [ 86.128118] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.128138] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0 [ 86.128158] PKRU: 55555554 [ 86.128177] Call Trace: [ 86.128196] <TASK> [ 86.128215] kvm_xen_hvm_evtchn_send (arch/x86/kvm/xen.c:1432 arch/x86/kvm/xen.c:1562) kvm [ 86.128256] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm [ 86.128294] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 86.128315] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 86.128335] ? kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 86.128368] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 86.128401] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 86.128422] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688) [ 86.128442] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856) [ 86.128462] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 86.128482] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128501] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128520] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 86.128539] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128558] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128577] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128596] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128615] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128634] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128653] ? 
lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 86.128673] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 86.128692] RIP: 0033:0x7f71d6152c6b [ 86.128712] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d b5 b1 1b 00 mov 0x1bb1b5(%rip),%rcx # 0x1bb1bf a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 10 00 00 00 mov $0x10,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb1bf 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb195 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 86.128735] RSP: 002b:00007fff7c716c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 86.128755] RAX: ffffffffffffffda RBX: 00007f71d61116c0 RCX: 00007f71d6152c6b [ 86.128775] RDX: 00007fff7c716f00 RSI: 00000000400caed0 RDI: 0000000000000004 [ 86.128794] RBP: 0000000001f2b2a0 R08: 0000000000417343 R09: 00007f71d62cc341 [ 86.128814] R10: 00007fff7c74c258 R11: 0000000000000246 R12: 00007f71d6329000 [ 86.128833] R13: 00000000632870b9 R14: 0000000000000001 R15: 00007f71d632a020 [ 86.128854] </TASK> [ 86.128873] Modules linked in: kvm_intel 9p fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat 
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr sunrpc intel_rapl_msr intel_rapl_common kvm irqbypass rapl pcspkr 9pnet_virtio i2c_piix4 9pnet drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_console serio_raw ata_generic virtio_blk pata_acpi qemu_fw_cfg ipmi_devintf ipmi_msghandler fuse [ 86.128893] Unloaded tainted modules: kvm_intel():1 [last unloaded: kvm_intel] [ 86.128944] CR2: 0000000000000800 [ 86.128962] ---[ end trace 0000000000000000 ]--- [ 86.128982] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm [ 86.129022] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06 All code ======== 0: 5a pop %rdx 1: 84 c0 test %al,%al 3: 0f 84 01 01 00 00 je 0x10a 9: 80 bb b4 9f 00 00 00 cmpb $0x0,0x9fb4(%rbx) 10: 8b 45 00 mov 0x0(%rbp),%eax 13: 48 8b 93 c8 a0 00 00 mov 0xa0c8(%rbx),%rdx 1a: 75 4d jne 0x69 1c: 41 89 c0 mov %eax,%r8d 1f: 48 8d b2 80 08 00 00 lea 0x880(%rdx),%rsi 26: 41 c1 e8 05 shr $0x5,%r8d 2a:* f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) <-- trapping instruction 31: 00 00 33: 0f 82 19 01 00 00 jb 0x152 39: 8b 45 00 mov 0x0(%rbp),%eax 3c: 48 0f a3 06 bt %rax,(%rsi) Code starting with the faulting instruction =========================================== 0: f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) 7: 00 00 9: 0f 82 19 01 00 00 jb 0x128 f: 8b 45 00 mov 0x0(%rbp),%eax 12: 48 0f a3 06 bt %rax,(%rsi) [ 86.129044] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046 [ 86.129064] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001 [ 86.129083] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66 [ 86.129103] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97 [ 86.129122] R10: ffff88810c150000 R11: 
0000000076cc00c9 R12: 0000000000000001 [ 86.129142] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830 [ 86.129161] FS: 00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 [ 86.129181] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.129201] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0 [ 86.129221] PKRU: 55555554 [ 86.129240] note: xen_shinfo_test[945] exited with preempt_count 1 [ 151.131754] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 151.131793] rcu: 1-...0: (13 ticks this GP) idle=785c/1/0x4000000000000000 softirq=4758/4760 fqs=16038 [ 151.131816] (detected by 2, t=65002 jiffies, g=6449, q=1299 ncpus=4) [ 151.131837] Sending NMI from CPU 2 to CPUs 1: [ 151.131862] NMI backtrace for cpu 1 [ 151.131864] CPU: 1 PID: 949 Comm: xen_shinfo_test Tainted: G D 6.0.0-rc5-test+ #31 [ 151.131866] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014 [ 151.131866] RIP: 0010:queued_write_lock_slowpath (kernel/locking/qrwlock.c:85) [ 151.131870] Code: ff 90 48 89 df 5b 5d e9 86 fd ff ff f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 8b 03 <3d> 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 48 89 ef All code ======== 0: ff 90 48 89 df 5b call *0x5bdf8948(%rax) 6: 5d pop %rbp 7: e9 86 fd ff ff jmp 0xfffffffffffffd92 c: f0 81 0b 00 01 00 00 lock orl $0x100,(%rbx) 13: ba ff 00 00 00 mov $0xff,%edx 18: b9 00 01 00 00 mov $0x100,%ecx 1d: 8b 03 mov (%rbx),%eax 1f: 3d 00 01 00 00 cmp $0x100,%eax 24: 74 0b je 0x31 26: f3 90 pause 28: 8b 03 mov (%rbx),%eax 2a:* 3d 00 01 00 00 cmp $0x100,%eax <-- trapping instruction 2f: 75 f5 jne 0x26 31: 89 c8 mov %ecx,%eax 33: f0 0f b1 13 lock cmpxchg %edx,(%rbx) 37: 74 c0 je 0xfffffffffffffff9 39: eb e2 jmp 0x1d 3b: 89 c6 mov %eax,%esi 3d: 48 89 ef mov %rbp,%rdi Code starting with the faulting instruction =========================================== 0: 3d 00 01 00 00 cmp $0x100,%eax 5: 75 f5 jne 
0xfffffffffffffffc 7: 89 c8 mov %ecx,%eax 9: f0 0f b1 13 lock cmpxchg %edx,(%rbx) d: 74 c0 je 0xffffffffffffffcf f: eb e2 jmp 0xfffffffffffffff3 11: 89 c6 mov %eax,%esi 13: 48 89 ef mov %rbp,%rdi [ 151.131871] RSP: 0018:ffffc900011f7b98 EFLAGS: 00000006 [ 151.131872] RAX: 0000000000000300 RBX: ffffc90001372ff8 RCX: 0000000000000100 [ 151.131872] RDX: 00000000000000ff RSI: ffffffff827ea0a3 RDI: ffffffff82886a66 [ 151.131873] RBP: ffffc90001372ffc R08: ffff888107864160 R09: 00000000777cbcf6 [ 151.131873] R10: ffff888107863300 R11: 000000006777cbcf R12: ffffc90001372ff8 [ 151.131874] R13: ffffc90001369000 R14: ffffc90001372fb8 R15: ffffc90001369170 [ 151.131874] FS: 00007f71d5f09640(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000 [ 151.131875] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.131876] CR2: 00007f71d5f08ef8 CR3: 0000000104774002 CR4: 0000000000772ee0 [ 151.131877] PKRU: 55555554 [ 151.131878] Call Trace: [ 151.131879] <TASK> [ 151.131880] do_raw_write_lock (./include/asm-generic/qrwlock.h:101 kernel/locking/spinlock_debug.c:210) [ 151.131883] kvm_gpc_refresh (arch/x86/kvm/../../../virt/kvm/pfncache.c:262) kvm [ 151.131907] ? kvm_gpc_activate (./include/linux/spinlock.h:390 arch/x86/kvm/../../../virt/kvm/pfncache.c:375) kvm [ 151.131925] kvm_xen_hvm_set_attr (arch/x86/kvm/xen.c:51 arch/x86/kvm/xen.c:464) kvm [ 151.131950] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm [ 151.131971] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 151.131973] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 151.131991] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 151.131993] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688) [ 151.131995] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856) [ 151.131997] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 151.131998] ? 
lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 151.132000] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132000] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132002] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132003] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132003] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132004] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132005] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132006] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132007] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132008] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 151.132009] RIP: 0033:0x7f71d6152c6b [ 151.132011] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d b5 b1 1b 00 mov 0x1bb1b5(%rip),%rcx # 0x1bb1bf a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 10 00 00 00 mov $0x10,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb1bf 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb195 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 151.132011] RSP: 002b:00007f71d5f08da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 151.132012] RAX: ffffffffffffffda RBX: 0000000001f2b2a0 RCX: 
00007f71d6152c6b [ 151.132013] RDX: 00007f71d5f08db0 RSI: 000000004048aec9 RDI: 0000000000000004 [ 151.132013] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fff7c716b3f [ 151.132013] R10: 00007f71d611c948 R11: 0000000000000246 R12: 00007f71d5f09640 [ 151.132014] R13: 0000000000000004 R14: 00007f71d61b3550 R15: 0000000000000000 [ 151.132016] </TASK> virt/kvm/pfncache.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 45b9b96c0ea3..e987669c3506 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -364,11 +364,13 @@ EXPORT_SYMBOL_GPL(kvm_gpc_init); int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa) { if (!gpc->active) { + write_lock_irq(&gpc->lock); gpc->khva = NULL; gpc->pfn = KVM_PFN_ERR_FAULT; gpc->uhva = KVM_HVA_ERR_BAD; gpc->valid = false; gpc->active = true; + write_unlock_irq(&gpc->lock); spin_lock(&gpc->kvm->gpc_lock); list_add(&gpc->list, &gpc->kvm->gpc_list); -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() 2022-09-21 2:01 ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj @ 2022-10-10 23:28 ` Sean Christopherson 2022-10-13 0:22 ` Sean Christopherson 0 siblings, 1 reply; 34+ messages in thread From: Sean Christopherson @ 2022-10-10 23:28 UTC (permalink / raw) To: Michal Luczaj; +Cc: kvm, pbonzini On Wed, Sep 21, 2022, Michal Luczaj wrote: > There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate() > resulting in a near-NULL pointer write. > > 1. Deactivate shinfo cache: > > kvm_xen_hvm_set_attr > case KVM_XEN_ATTR_TYPE_SHARED_INFO > kvm_gpc_deactivate > kvm_gpc_unmap > gpc->valid = false > gpc->khva = NULL > gpc->active = false > > Result: active = false, valid = false > > 2. Cause cache refresh: > > kvm_arch_vm_ioctl > case KVM_XEN_HVM_EVTCHN_SEND > kvm_xen_hvm_evtchn_send > kvm_xen_set_evtchn > kvm_xen_set_evtchn_fast > kvm_gpc_check > return -EWOULDBLOCK because !gpc->valid > kvm_xen_set_evtchn_fast > return -EWOULDBLOCK > kvm_gpc_refresh > hva_to_pfn_retry > gpc->valid = true > gpc->khva = not NULL > > Result: active = false, valid = true This is the real bug. KVM should not successfully refresh an inactive cache. It's not just the potential for NULL pointer deref, the cache also isn't on the list of active caches, i.e. won't get mmu_notifier events, and so KVM could get a use-after-free of userspace memory. KVM_XEN_HVM_EVTCHN_SEND does check that the per-vCPU cache is active, but does so outside of the gpc->lock. Minus your race condition analysis, which I'll insert into the changelog (assuming this works), I believe the proper fix is to check "active" during check and refresh. Oof, and there are ordering bugs too. Compile-tested patch below. If this fixes things on your end (I'll properly test tomorrow too), I'll post a v2 of the entire series. There are some cleanups that can be done on top, e.g. 
I think we should drop kvm_gpc_unmap() entirely until there's actually a user, because it's not at all obvious that it's (a) necessary and (b) has desirable behavior. Note, the below patch applies after patch 1 of this series. I don't know if anyone will actually want to backport the fix, but it's not too hard to keep the backport dependency to just patch 1. -- From: Sean Christopherson <seanjc@google.com> Date: Mon, 10 Oct 2022 13:06:13 -0700 Subject: [PATCH] KVM: Reject attempts to consume or refresh inactive gfn_to_pfn_cache Reject kvm_gpc_check() and kvm_gpc_refresh() if the cache is inactive. Not checking the active flag during refresh is particularly egregious, as KVM can end up with a valid, inactive cache, which can lead to a variety of use-after-free bugs, e.g. consuming a NULL kernel pointer or missing an mmu_notifier invalidation due to the cache not being on the list of gfns to invalidate. Note, "active" needs to be set if and only if the cache is on the list of caches, i.e. is reachable via mmu_notifier events. If a relevant mmu_notifier event occurs while the cache is "active" but not on the list, KVM will not acquire the cache's lock and so will not serialize the mmu_notifier event with active users and/or kvm_gpc_refresh(). A race between KVM_XEN_ATTR_TYPE_SHARED_INFO and KVM_XEN_HVM_EVTCHN_SEND can be exploited to trigger the bug. 
<will insert your awesome race analysis> Reported-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Sean Christopherson <seanjc@google.com> --- virt/kvm/pfncache.c | 36 ++++++++++++++++++++++++++++++------ 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index b32ed4a7c900..dfc72aa88d71 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -81,6 +81,9 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, { struct kvm_memslots *slots = kvm_memslots(kvm); + if (!gpc->active) + return false; + if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE) return false; @@ -240,8 +243,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, { struct kvm_memslots *slots = kvm_memslots(kvm); unsigned long page_offset = gpa & ~PAGE_MASK; - kvm_pfn_t old_pfn, new_pfn; + bool unmap_old = false; unsigned long old_uhva; + kvm_pfn_t old_pfn; void *old_khva; int ret = 0; @@ -261,6 +265,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, write_lock_irq(&gpc->lock); + if (!gpc->active) + goto out_unlock; + old_pfn = gpc->pfn; old_khva = gpc->khva - offset_in_page(gpc->khva); old_uhva = gpc->uhva; @@ -305,14 +312,15 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpc->khva = NULL; } - /* Snapshot the new pfn before dropping the lock! */ - new_pfn = gpc->pfn; + /* Detect a pfn change before dropping the lock! 
*/ + unmap_old = (old_pfn != gpc->pfn); +out_unlock: write_unlock_irq(&gpc->lock); mutex_unlock(&gpc->refresh_lock); - if (old_pfn != new_pfn) + if (unmap_old) gpc_unmap_khva(kvm, old_pfn, old_khva); return ret; @@ -368,11 +376,19 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpc->vcpu = vcpu; gpc->usage = usage; gpc->valid = false; - gpc->active = true; spin_lock(&kvm->gpc_lock); list_add(&gpc->list, &kvm->gpc_list); spin_unlock(&kvm->gpc_lock); + + /* + * Activate the cache after adding it to the list, a concurrent + * refresh must not establish a mapping until the cache is + * reachable by mmu_notifier events. + */ + write_lock_irq(&gpc->lock); + gpc->active = true; + write_unlock_irq(&gpc->lock); } return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len); } @@ -381,12 +397,20 @@ EXPORT_SYMBOL_GPL(kvm_gpc_activate); void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) { if (gpc->active) { + /* + * Deactivate the cache before removing it from the list, KVM + * must stall mmu_notifier events until all users go away, i.e. + * until gpc->lock is dropped and refresh is guaranteed to fail. + */ + write_lock_irq(&gpc->lock); + gpc->active = false; + write_unlock_irq(&gpc->lock); + spin_lock(&kvm->gpc_lock); list_del(&gpc->list); spin_unlock(&kvm->gpc_lock); kvm_gfn_to_pfn_cache_unmap(kvm, gpc); - gpc->active = false; } } EXPORT_SYMBOL_GPL(kvm_gpc_deactivate); base-commit: 09e5b3d617d28e3011253370f827151cc6cba6ad -- ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() 2022-10-10 23:28 ` Sean Christopherson @ 2022-10-13 0:22 ` Sean Christopherson 2022-10-13 20:28 ` Sean Christopherson 0 siblings, 1 reply; 34+ messages in thread From: Sean Christopherson @ 2022-10-13 0:22 UTC (permalink / raw) To: Michal Luczaj; +Cc: kvm, pbonzini On Mon, Oct 10, 2022, Sean Christopherson wrote: > On Wed, Sep 21, 2022, Michal Luczaj wrote: > If this fixes things on your end (I'll properly test tomorrow too), I'll post a > v2 of the entire series. There are some cleanups that can be done on top, e.g. > I think we should drop kvm_gpc_unmap() entirely until there's actually a user, > because it's not at all obvious that it's (a) necessary and (b) has desirable > behavior. Sorry for the delay, I initially missed that you included a selftest for the race in the original RFC. The kernel is no longer exploding, but the test is intermittently soft hanging waiting for the "IRQ". I'll debug and hopefully post tomorrow. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-13 0:22 ` Sean Christopherson
@ 2022-10-13 20:28 ` Sean Christopherson
  2022-10-20 15:59 ` Michal Luczaj
  0 siblings, 1 reply; 34+ messages in thread
From: Sean Christopherson @ 2022-10-13 20:28 UTC (permalink / raw)
To: Michal Luczaj; +Cc: kvm, pbonzini

On Thu, Oct 13, 2022, Sean Christopherson wrote:
> On Mon, Oct 10, 2022, Sean Christopherson wrote:
> > On Wed, Sep 21, 2022, Michal Luczaj wrote:
> > If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> > v2 of the entire series. There are some cleanups that can be done on top, e.g.
> > I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> > because it's not at all obvious that it's (a) necessary and (b) has desirable
> > behavior.
>
> Sorry for the delay, I initially missed that you included a selftest for the race
> in the original RFC. The kernel is no longer exploding, but the test is intermittently
> soft hanging waiting for the "IRQ". I'll debug and hopefully post tomorrow.

Ended up being a test bug (technically). KVM drops the timer IRQ if the shared
info page is invalid. As annoying as that is, there isn't really a better
option, and invalidating a shared page while vCPUs are running really is a VMM
bug.

To fix, I added an intermediate stage in the test that re-arms the timer if the
IRQ doesn't arrive in a reasonable amount of time.

Patches incoming...

^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() 2022-10-13 20:28 ` Sean Christopherson @ 2022-10-20 15:59 ` Michal Luczaj 2022-10-20 16:58 ` Sean Christopherson 0 siblings, 1 reply; 34+ messages in thread From: Michal Luczaj @ 2022-10-20 15:59 UTC (permalink / raw) To: Sean Christopherson; +Cc: kvm, pbonzini On 10/13/22 22:28, Sean Christopherson wrote: > On Thu, Oct 13, 2022, Sean Christopherson wrote: >> On Mon, Oct 10, 2022, Sean Christopherson wrote: >>> On Wed, Sep 21, 2022, Michal Luczaj wrote: >>> If this fixes things on your end (I'll properly test tomorrow too), I'll post a >>> v2 of the entire series. There are some cleanups that can be done on top, e.g. >>> I think we should drop kvm_gpc_unmap() entirely until there's actually a user, >>> because it's not at all obvious that it's (a) necessary and (b) has desirable >>> behavior. >> >> Sorry for the delay, I initially missed that you included a selftest for the race >> in the original RFC. The kernel is no longer exploding, but the test is intermittently >> soft hanging waiting for the "IRQ". I'll debug and hopefully post tomorrow. > > Ended up being a test bug (technically). KVM drops the timer IRQ if the shared > info page is invalid. As annoying as that is, there's isn't really a better > option, and invalidating a shared page while vCPUs are running really is a VMM > bug. > > To fix, I added an intermediate stage in the test that re-arms the timer if the > IRQ doesn't arrive in a reasonable amount of time. > > Patches incoming... Sorry for the late reply, I was away. Thank you for the whole v2 series. And I'm glad you've found my testcase useful, even if a bit buggy ;) Speaking about SCHEDOP_poll, are XEN vmcalls considered trusted? I've noticed that kvm_xen_schedop_poll() fetches guest-provided sched_poll.ports without checking if the values are sane. 
Then, in wait_pending_event(), there's test_bit(ports[i], pending_bits) which (for some high ports[i] values) results in KASAN complaining about "use-after-free": [ 36.463417] ================================================================== [ 36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm] [ 36.463962] Read of size 8 at addr ffff88815b87b800 by task xen_shinfo_test/956 [ 36.464149] CPU: 1 PID: 956 Comm: xen_shinfo_test Not tainted 6.1.0-rc1-kasan+ #49 [ 36.464252] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014 [ 36.464259] Call Trace: [ 36.464259] <TASK> [ 36.464259] dump_stack_lvl+0x5b/0x73 [ 36.464259] print_report+0x17f/0x477 [ 36.464259] ? __virt_addr_valid+0xd5/0x150 [ 36.464259] ? kvm_xen_hypercall+0xf39/0x1110 [kvm] [ 36.464259] ? kvm_xen_hypercall+0xf39/0x1110 [kvm] [ 36.464259] kasan_report+0xbb/0xf0 [ 36.464259] ? kvm_xen_hypercall+0xf39/0x1110 [kvm] [ 36.464259] kasan_check_range+0x136/0x1b0 [ 36.464259] kvm_xen_hypercall+0xf39/0x1110 [kvm] [ 36.464259] ? kvm_xen_set_evtchn.part.0+0x190/0x190 [kvm] [ 36.464259] ? get_kvmclock+0x86/0x360 [kvm] [ 36.464259] ? pvclock_clocksource_read+0x13a/0x190 [ 36.464259] kvm_emulate_hypercall+0x1d7/0x860 [kvm] [ 36.464259] ? get_kvmclock+0x151/0x360 [kvm] [ 36.464259] ? kvm_fast_pio+0x260/0x260 [kvm] [ 36.464259] ? kvm_post_set_cr4+0xf0/0xf0 [kvm] [ 36.464259] ? lock_release+0x9c/0x430 [ 36.464259] ? rcu_qs+0x2b/0xb0 [ 36.464259] ? rcu_note_context_switch+0x18e/0x9b0 [ 36.464259] ? rcu_read_lock_sched_held+0x10/0x70 [ 36.464259] ? lock_acquire+0xb1/0x3d0 [ 36.464259] ? vmx_vmexit+0x6c/0x19d [kvm_intel] [ 36.464259] ? vmx_vmexit+0x8d/0x19d [kvm_intel] [ 36.464259] ? rcu_read_lock_sched_held+0x10/0x70 [ 36.464259] ? lock_acquire+0xb1/0x3d0 [ 36.464259] ? lock_downgrade+0x380/0x380 [ 36.464259] ? 
vmx_vcpu_run+0x5bf/0x1260 [kvm_intel] [ 36.464259] vmx_handle_exit+0x295/0xa50 [kvm_intel] [ 36.464259] vcpu_enter_guest.constprop.0+0x1436/0x1ed0 [kvm] [ 36.464259] ? kvm_check_and_inject_events+0x800/0x800 [kvm] [ 36.464259] ? lock_downgrade+0x380/0x380 [ 36.464259] ? __blkcg_punt_bio_submit+0xd0/0xd0 [ 36.464259] ? kvm_arch_vcpu_ioctl_run+0xa46/0xf70 [kvm] [ 36.464259] ? unlock_page_memcg+0x1e0/0x1e0 [ 36.464259] ? __local_bh_enable_ip+0x8f/0x100 [ 36.464259] ? trace_hardirqs_on+0x2d/0xf0 [ 36.464259] ? fpu_swap_kvm_fpstate+0xbd/0x1c0 [ 36.464259] ? kvm_arch_vcpu_ioctl_run+0x418/0xf70 [kvm] [ 36.464259] kvm_arch_vcpu_ioctl_run+0x418/0xf70 [kvm] [ 36.464259] kvm_vcpu_ioctl+0x332/0x8f0 [kvm] [ 36.464259] ? kvm_clear_dirty_log_protect+0x430/0x430 [kvm] [ 36.464259] ? do_vfs_ioctl+0x951/0xbf0 [ 36.464259] ? vfs_fileattr_set+0x480/0x480 [ 36.464259] ? kernel_write+0x360/0x360 [ 36.464259] ? selinux_inode_getsecctx+0x50/0x50 [ 36.464259] ? ioctl_has_perm.constprop.0.isra.0+0x133/0x200 [ 36.464259] ? selinux_inode_getsecctx+0x50/0x50 [ 36.464259] ? ksys_write+0xc4/0x140 [ 36.464259] ? __ia32_sys_read+0x40/0x40 [ 36.464259] ? lockdep_hardirqs_on_prepare+0xe/0x220 [ 36.464259] __x64_sys_ioctl+0xb8/0xf0 [ 36.464259] do_syscall_64+0x55/0x80 [ 36.464259] ? 
lockdep_hardirqs_on_prepare+0xe/0x220 [ 36.464259] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 36.464259] RIP: 0033:0x7f81e303ec6b [ 36.464259] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 [ 36.464259] RSP: 002b:00007fff72009028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 36.464259] RAX: ffffffffffffffda RBX: 00007f81e33936c0 RCX: 00007f81e303ec6b [ 36.464259] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000007 [ 36.464259] RBP: 00000000008457b0 R08: 0000000000000000 R09: 00000000004029f6 [ 36.464259] R10: 00007f81e31b838b R11: 0000000000000246 R12: 00007f81e33a4000 [ 36.464259] R13: 00007f81e33a2000 R14: 0000000000000000 R15: 00007f81e33a3020 [ 36.464259] </TASK> [ 36.464259] The buggy address belongs to the physical page: [ 36.464259] page:00000000d9b176a3 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x15b87b [ 36.464259] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) [ 36.464259] raw: 0017ffffc0000000 ffffea0005504e48 ffffea00056cf688 0000000000000000 [ 36.464259] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000 [ 36.464259] page dumped because: kasan: bad access detected [ 36.464259] Memory state around the buggy address: [ 36.464259] ffff88815b87b700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 36.464259] ffff88815b87b780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 36.464259] >ffff88815b87b800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 36.464259] ^ [ 36.464259] ffff88815b87b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 36.464259] ffff88815b87b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 36.464259] ================================================================== I can't reproduce it under non-KASAN build, I'm not sure what's going on. 
Anyway, here's the testcase, applies after your selftests patches in
https://lore.kernel.org/kvm/20221013211234.1318131-1-seanjc@google.com/

---
 .../selftests/kvm/x86_64/xen_shinfo_test.c | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 2a5727188c8d..402e3d7b86b0 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -375,6 +375,29 @@ static void guest_code(void)
        guest_saw_irq = false;
 
        GUEST_SYNC(24);
+       /* Terminate racer thread */
+
+       GUEST_SYNC(25);
+       /* Test SCHEDOP_poll out-of-bounds read */
+
+       p = (struct sched_poll) {
+               .ports = ports,
+               .nr_ports = 1,
+               .timeout = 1
+       };
+
+       ports[0] = (PAGE_SIZE*2) << 3;
+       for (i = 0; i < 0x1000; i++) {
+               asm volatile("vmcall"
+                            : "=a" (rax)
+                            : "a" (__HYPERVISOR_sched_op),
+                              "D" (SCHEDOP_poll),
+                              "S" (&p)
+                            : "memory");
+               ports[0] += PAGE_SIZE << 3;
+       }
+
+       GUEST_SYNC(26);
 }
 
 static int cmp_timespec(struct timespec *a, struct timespec *b)
@@ -925,6 +948,21 @@ int main(int argc, char *argv[])
                ret = pthread_join(thread, 0);
                TEST_ASSERT(ret == 0, "pthread_join() failed: %s", strerror(ret));
+
+               /* shinfo juggling done; reset to a valid GFN. */
+               struct kvm_xen_hvm_attr ha = {
+                       .type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+                       .u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE,
+               };
+               vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &ha);
+               break;
+
+       case 25:
+               if (verbose)
+                       printf("Testing SCHEDOP_poll out-of-bounds read\n");
+               break;
+
+       case 26:
                goto done;
 
        case 0x20:
--
2.38.0

^ permalink raw reply related [flat|nested] 34+ messages in thread
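The port values the guest loop above probes are chosen so that `test_bit(port, pending_bits)` reads whole pages past the shared info mapping: a bit index of N lands on byte offset N / 8 relative to the start of the `evtchn_pending` bitmap. A quick sketch of that arithmetic (the 4 KiB `PAGE_SZ` constant and helper names are illustrative):

```c
#include <assert.h>

/*
 * test_bit(N, ...) touches byte N / 8, so starting at
 * (PAGE_SIZE * 2) << 3 and stepping by PAGE_SIZE << 3 walks
 * one page at a time beyond the shared info page.
 */
#define PAGE_SZ 4096UL

/* Byte offset a test_bit(port, ...) read lands on. */
static unsigned long port_to_byte_offset(unsigned long port)
{
	return port >> 3;
}

/* Port probed on the n-th iteration of the guest loop. */
static unsigned long nth_probed_port(unsigned long n)
{
	return ((PAGE_SZ * 2) << 3) + n * (PAGE_SZ << 3);
}
```

With 0x1000 iterations, the reads sweep byte offsets from two pages past the bitmap out to roughly 16 MiB beyond it, which is why KASAN flags the accesses as hitting freed, unrelated pages.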
* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() 2022-10-20 15:59 ` Michal Luczaj @ 2022-10-20 16:58 ` Sean Christopherson 2022-10-21 2:39 ` Michal Luczaj 0 siblings, 1 reply; 34+ messages in thread From: Sean Christopherson @ 2022-10-20 16:58 UTC (permalink / raw) To: Michal Luczaj; +Cc: kvm, pbonzini On Thu, Oct 20, 2022, Michal Luczaj wrote: > On 10/13/22 22:28, Sean Christopherson wrote: > > On Thu, Oct 13, 2022, Sean Christopherson wrote: > >> On Mon, Oct 10, 2022, Sean Christopherson wrote: > >>> On Wed, Sep 21, 2022, Michal Luczaj wrote: > >>> If this fixes things on your end (I'll properly test tomorrow too), I'll post a > >>> v2 of the entire series. There are some cleanups that can be done on top, e.g. > >>> I think we should drop kvm_gpc_unmap() entirely until there's actually a user, > >>> because it's not at all obvious that it's (a) necessary and (b) has desirable > >>> behavior. > >> > >> Sorry for the delay, I initially missed that you included a selftest for the race > >> in the original RFC. The kernel is no longer exploding, but the test is intermittently > >> soft hanging waiting for the "IRQ". I'll debug and hopefully post tomorrow. > > > > Ended up being a test bug (technically). KVM drops the timer IRQ if the shared > > info page is invalid. As annoying as that is, there's isn't really a better > > option, and invalidating a shared page while vCPUs are running really is a VMM > > bug. > > > > To fix, I added an intermediate stage in the test that re-arms the timer if the > > IRQ doesn't arrive in a reasonable amount of time. > > > > Patches incoming... > > Sorry for the late reply, I was away. > Thank you for the whole v2 series. And I'm glad you've found my testcase > useful, even if a bit buggy ;) > > Speaking about SCHEDOP_poll, are XEN vmcalls considered trusted? I highly doubt they are trusted. 
> I've noticed that kvm_xen_schedop_poll() fetches guest-provided > sched_poll.ports without checking if the values are sane. Then, in > wait_pending_event(), there's test_bit(ports[i], pending_bits) which > (for some high ports[i] values) results in KASAN complaining about > "use-after-free": > > [ 36.463417] ================================================================== > [ 36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm] ... > I can't reproduce it under non-KASAN build, I'm not sure what's going on. KASAN is rightly complaining because, as you already pointed out, the high ports[i] value will touch memory well beyond the shinfo->evtchn_pending array. Non-KASAN builds don't have visible failures because the rogue access is only a read, and the result of the test_bit() only affects whether or not KVM temporarily stalls the vCPU. In other words, KVM is leaking host state to the guest, but there is no memory corruption and no functional impact on the guest. I think this would be the way to fix this particular mess? 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 93c628d3e3a9..5d09a47db732 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -961,7 +961,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
        struct kvm *kvm = vcpu->kvm;
        struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
        unsigned long *pending_bits;
+       unsigned long nr_bits;
        unsigned long flags;
+       evtchn_port_t port;
        bool ret = true;
        int idx, i;
 
@@ -974,13 +976,19 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
        if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
                struct shared_info *shinfo = gpc->khva;
 
                pending_bits = (unsigned long *)&shinfo->evtchn_pending;
+               nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
        } else {
                struct compat_shared_info *shinfo = gpc->khva;
 
                pending_bits = (unsigned long *)&shinfo->evtchn_pending;
+               nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
        }
 
        for (i = 0; i < nr_ports; i++) {
-               if (test_bit(ports[i], pending_bits)) {
+               port = ports[i];
+               if (port >= nr_bits)
+                       continue;
+
+               if (test_bit(array_index_nospec(port, nr_bits), pending_bits)) {
                        ret = true;
                        break;
                }

^ permalink raw reply related [flat|nested] 34+ messages in thread
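The core of the fix — reject out-of-range guest ports before using them as a bit index — can be sketched as a self-contained userspace function. `BITMAP_BYTES` stands in for `sizeof(shinfo->evtchn_pending)` (512 bytes in the 64-bit layout); the names are illustrative, and the kernel version additionally launders the index with `array_index_nospec()` to close the speculative side channel, which plain C cannot demonstrate:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stand-in for the shared_info evtchn_pending bitmap dimensions. */
#define BITMAP_BYTES	512
#define BITMAP_BITS	(BITMAP_BYTES * CHAR_BIT)
#define BITS_PER_LONG_	(sizeof(unsigned long) * CHAR_BIT)

static bool test_port_pending(const unsigned long *pending_bits,
			      unsigned long port)
{
	/*
	 * Without this check, a guest-chosen port such as
	 * (PAGE_SIZE * 2) << 3 indexes far past the bitmap and reads
	 * whatever memory happens to follow it.
	 */
	if (port >= BITMAP_BITS)
		return false;

	return pending_bits[port / BITS_PER_LONG_] &
	       (1UL << (port % BITS_PER_LONG_));
}
```

Any port at or beyond `BITMAP_BITS` is simply treated as "not pending", matching the `continue` in the patch.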
* Re: [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() 2022-10-20 16:58 ` Sean Christopherson @ 2022-10-21 2:39 ` Michal Luczaj 0 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-21 2:39 UTC (permalink / raw) To: Sean Christopherson; +Cc: kvm, pbonzini On 10/20/22 18:58, Sean Christopherson wrote: > On Thu, Oct 20, 2022, Michal Luczaj wrote: >> Speaking about SCHEDOP_poll, are XEN vmcalls considered trusted? > > I highly doubt they are trusted. Does it mean a CPL3 guest can vmcall SCHEDOP_poll? If so, is the use of kvm_mmu_gva_to_gpa_system() justified? >> I've noticed that kvm_xen_schedop_poll() fetches guest-provided >> sched_poll.ports without checking if the values are sane. Then, in >> wait_pending_event(), there's test_bit(ports[i], pending_bits) which >> (for some high ports[i] values) results in KASAN complaining about >> "use-after-free": >> >> [ 36.463417] ================================================================== >> [ 36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm] > > ... > >> I can't reproduce it under non-KASAN build, I'm not sure what's going on. > > KASAN is rightly complaining because, as you already pointed out, the high ports[i] > value will touch memory well beyond the shinfo->evtchn_pending array. Non-KASAN > builds don't have visible failures because the rogue access is only a read, and > the result of the test_bit() only affects whether or not KVM temporarily stalls > the vCPU. In other words, KVM is leaking host state to the guest, but there is > no memory corruption and no functional impact on the guest. OK, so such vCPU stall-or-not is a side channel leaking host memory bit by bit, right? I'm trying to understand what is actually being leaked here. Is it user space memory of process that is using KVM on the host? > I think this would be the way to fix this particular mess? 
> > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > index 93c628d3e3a9..5d09a47db732 100644 > --- a/arch/x86/kvm/xen.c > +++ b/arch/x86/kvm/xen.c > @@ -961,7 +961,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports, > struct kvm *kvm = vcpu->kvm; > struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache; > unsigned long *pending_bits; > + unsigned long nr_bits; > unsigned long flags; > + evtchn_port_t port; > bool ret = true; > int idx, i; > > @@ -974,13 +976,19 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports, > if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) { > struct shared_info *shinfo = gpc->khva; > pending_bits = (unsigned long *)&shinfo->evtchn_pending; > + nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE; > } else { > struct compat_shared_info *shinfo = gpc->khva; > pending_bits = (unsigned long *)&shinfo->evtchn_pending; > + nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE; > } > > for (i = 0; i < nr_ports; i++) { > - if (test_bit(ports[i], pending_bits)) { > + port = ports[i]; > + if (port >= nr_bits) > + continue; > + > + if (test_bit(array_index_nospec(port, nr_bits), pending_bits)) { > ret = true; > break; > } Great, that looks good and passes the test. Thanks, Michal ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix 2022-09-16 17:12 ` Sean Christopherson 2022-09-18 23:13 ` Michal Luczaj 2022-09-21 2:01 ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj ` (7 more replies) 2 siblings, 8 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Politely resending after two weeks :) Series marked v2, but there're no functional changes, I've only tweaked patch 4/8 commit message. I've also made sure there are no merge conflicts with 6.0 release. Please let me know if this needs any corrections and/or to wait. Thanks, Michal Michal Luczaj (8): KVM: x86: Add initializer for gfn_to_pfn_cache KVM: x86: Shorten gfn_to_pfn_cache function names KVM: x86: Remove unused argument in gpc_unmap_khva() KVM: x86: Store immutable gfn_to_pfn_cache properties KVM: x86: Clean up kvm_gpc_check() KVM: x86: Clean up hva_to_pfn_retry() KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() arch/x86/kvm/x86.c | 24 ++++++------ arch/x86/kvm/xen.c | 78 +++++++++++++++++-------------------- include/linux/kvm_host.h | 64 ++++++++++++++++--------------- include/linux/kvm_types.h | 2 + virt/kvm/pfncache.c | 81 +++++++++++++++++++++------------------ 5 files changed, 128 insertions(+), 121 deletions(-) -- 2.37.3 ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache 2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj ` (6 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Move the gfn_to_pfn_cache lock initialization to another helper and call the new helper during VM/vCPU creation. Rename "cache_init" and "cache_destroy" to activate+deactivate to avoid implying that the cache really is destroyed/freed. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 12 +++++---- arch/x86/kvm/xen.c | 57 +++++++++++++++++++++------------------- include/linux/kvm_host.h | 23 +++++++++++----- virt/kvm/pfncache.c | 21 ++++++++------- 4 files changed, 65 insertions(+), 48 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 43a6a7efc6ec..15032b7f0589 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2301,11 +2301,11 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time, /* we verify if the enable bit is set... 
*/ if (system_time & 1) { - kvm_gfn_to_pfn_cache_init(vcpu->kvm, &vcpu->arch.pv_time, vcpu, - KVM_HOST_USES_PFN, system_time & ~1ULL, - sizeof(struct pvclock_vcpu_time_info)); + kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu, + KVM_HOST_USES_PFN, system_time & ~1ULL, + sizeof(struct pvclock_vcpu_time_info)); } else { - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time); + kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time); } return; @@ -3374,7 +3374,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data) static void kvmclock_reset(struct kvm_vcpu *vcpu) { - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.pv_time); + kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time); vcpu->arch.time = 0; } @@ -11551,6 +11551,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.regs_avail = ~0; vcpu->arch.regs_dirty = ~0; + kvm_gpc_init(&vcpu->arch.pv_time); + if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu)) vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; else diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 280cb5dc7341..cecf8299b187 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -42,13 +42,13 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn) int idx = srcu_read_lock(&kvm->srcu); if (gfn == GPA_INVALID) { - kvm_gfn_to_pfn_cache_destroy(kvm, gpc); + kvm_gpc_deactivate(kvm, gpc); goto out; } do { - ret = kvm_gfn_to_pfn_cache_init(kvm, gpc, NULL, KVM_HOST_USES_PFN, - gpa, PAGE_SIZE); + ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa, + PAGE_SIZE); if (ret) goto out; @@ -554,15 +554,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) offsetof(struct compat_vcpu_info, time)); if (data->u.gpa == GPA_INVALID) { - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); + kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); r = 0; break; } - r = kvm_gfn_to_pfn_cache_init(vcpu->kvm, - 
&vcpu->arch.xen.vcpu_info_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct vcpu_info)); + r = kvm_gpc_activate(vcpu->kvm, + &vcpu->arch.xen.vcpu_info_cache, NULL, + KVM_HOST_USES_PFN, data->u.gpa, + sizeof(struct vcpu_info)); if (!r) kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); @@ -570,16 +570,16 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO: if (data->u.gpa == GPA_INVALID) { - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_deactivate(vcpu->kvm, + &vcpu->arch.xen.vcpu_time_info_cache); r = 0; break; } - r = kvm_gfn_to_pfn_cache_init(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct pvclock_vcpu_time_info)); + r = kvm_gpc_activate(vcpu->kvm, + &vcpu->arch.xen.vcpu_time_info_cache, + NULL, KVM_HOST_USES_PFN, data->u.gpa, + sizeof(struct pvclock_vcpu_time_info)); if (!r) kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); break; @@ -590,16 +590,15 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) break; } if (data->u.gpa == GPA_INVALID) { - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, - &vcpu->arch.xen.runstate_cache); + kvm_gpc_deactivate(vcpu->kvm, + &vcpu->arch.xen.runstate_cache); r = 0; break; } - r = kvm_gfn_to_pfn_cache_init(vcpu->kvm, - &vcpu->arch.xen.runstate_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct vcpu_runstate_info)); + r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache, + NULL, KVM_HOST_USES_PFN, data->u.gpa, + sizeof(struct vcpu_runstate_info)); break; case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT: @@ -1817,7 +1816,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu) { vcpu->arch.xen.vcpu_id = vcpu->vcpu_idx; vcpu->arch.xen.poll_evtchn = 0; + timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0); + + kvm_gpc_init(&vcpu->arch.xen.runstate_cache); + kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache); + 
kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache); } void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) @@ -1825,18 +1829,17 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) if (kvm_xen_timer_enabled(vcpu)) kvm_xen_stop_timer(vcpu); - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, - &vcpu->arch.xen.runstate_cache); - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, - &vcpu->arch.xen.vcpu_info_cache); - kvm_gfn_to_pfn_cache_destroy(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache); + kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); + kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache); + del_timer_sync(&vcpu->arch.xen.poll_timer); } void kvm_xen_init_vm(struct kvm *kvm) { idr_init(&kvm->arch.xen.evtchn_ports); + kvm_gpc_init(&kvm->arch.xen.shinfo_cache); } void kvm_xen_destroy_vm(struct kvm *kvm) @@ -1844,7 +1847,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm) struct evtchnfd *evtchnfd; int i; - kvm_gfn_to_pfn_cache_destroy(kvm, &kvm->arch.xen.shinfo_cache); + kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache); idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) { if (!evtchnfd->deliver.port.port) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f4519d3689e1..9fd67026d91a 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1241,8 +1241,17 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data, void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn); /** - * kvm_gfn_to_pfn_cache_init - prepare a cached kernel mapping and HPA for a - * given guest physical address. + * kvm_gpc_init - initialize gfn_to_pfn_cache. + * + * @gpc: struct gfn_to_pfn_cache object. + * + * This sets up a gfn_to_pfn_cache by initializing locks. + */ +void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); + +/** + * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest + * physical address. 
* * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. @@ -1266,9 +1275,9 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn); * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before * accessing the target page. */ -int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, - gpa_t gpa, unsigned long len); +int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, + struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, + gpa_t gpa, unsigned long len); /** * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache. @@ -1325,7 +1334,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); /** - * kvm_gfn_to_pfn_cache_destroy - destroy and unlink a gfn_to_pfn_cache. + * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. @@ -1333,7 +1342,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); * This removes a cache from the @kvm's list to be processed on MMU notifier * invocation. 
*/ -void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); +void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); void kvm_sigset_activate(struct kvm_vcpu *vcpu); void kvm_sigset_deactivate(struct kvm_vcpu *vcpu); diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 68ff41d39545..08f97cf97264 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -346,17 +346,20 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) } EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap); +void kvm_gpc_init(struct gfn_to_pfn_cache *gpc) +{ + rwlock_init(&gpc->lock); + mutex_init(&gpc->refresh_lock); +} +EXPORT_SYMBOL_GPL(kvm_gpc_init); -int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, - gpa_t gpa, unsigned long len) +int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, + struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, + gpa_t gpa, unsigned long len) { WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage); if (!gpc->active) { - rwlock_init(&gpc->lock); - mutex_init(&gpc->refresh_lock); - gpc->khva = NULL; gpc->pfn = KVM_PFN_ERR_FAULT; gpc->uhva = KVM_HVA_ERR_BAD; @@ -371,9 +374,9 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, } return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len); } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_init); +EXPORT_SYMBOL_GPL(kvm_gpc_activate); -void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) { if (gpc->active) { spin_lock(&kvm->gpc_lock); @@ -384,4 +387,4 @@ void kvm_gfn_to_pfn_cache_destroy(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) gpc->active = false; } } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_destroy); +EXPORT_SYMBOL_GPL(kvm_gpc_deactivate); -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names 2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj ` (5 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Formalize "gpc" as the acronym and use it in function names. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 8 ++++---- arch/x86/kvm/xen.c | 29 ++++++++++++++--------------- include/linux/kvm_host.h | 21 ++++++++++----------- virt/kvm/pfncache.c | 20 ++++++++++---------- 4 files changed, 38 insertions(+), 40 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 15032b7f0589..45136ce7185e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3020,12 +3020,12 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v, unsigned long flags; read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - offset + sizeof(*guest_hv_clock))) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + offset + sizeof(*guest_hv_clock))) { read_unlock_irqrestore(&gpc->lock, flags); - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, - offset + sizeof(*guest_hv_clock))) + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + offset + sizeof(*guest_hv_clock))) return; read_lock_irqsave(&gpc->lock, flags); diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index cecf8299b187..361f77dc7a3d 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -218,15 +218,14 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) user_len = sizeof(struct 
compat_vcpu_runstate_info); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - user_len)) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) { read_unlock_irqrestore(&gpc->lock, flags); /* When invoked from kvm_sched_out() we cannot sleep */ if (state == RUNSTATE_runnable) return; - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, user_len)) + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len)) return; read_lock_irqsave(&gpc->lock, flags); @@ -352,12 +351,12 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v) * little more honest about it. */ read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) return; read_lock_irqsave(&gpc->lock, flags); @@ -417,8 +416,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending)); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); /* @@ -432,8 +431,8 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) if (in_atomic() || !task_is_running(current)) return 1; - if (kvm_gfn_to_pfn_cache_refresh(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, + sizeof(struct vcpu_info))) { /* * If this failed, userspace has screwed up the * vcpu_info mapping. No interrupts for you. 
@@ -966,7 +965,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports, read_lock_irqsave(&gpc->lock, flags); idx = srcu_read_lock(&kvm->srcu); - if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; ret = false; @@ -1358,7 +1357,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) idx = srcu_read_lock(&kvm->srcu); read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) { @@ -1392,7 +1391,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) gpc = &vcpu->arch.xen.vcpu_info_cache; read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gfn_to_pfn_cache_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) { + if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) { /* * Could not access the vcpu_info. Set the bit in-kernel * and prod the vCPU to deliver it for itself. @@ -1490,7 +1489,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm) break; idx = srcu_read_lock(&kvm->srcu); - rc = kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE); + rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE); srcu_read_unlock(&kvm->srcu, idx); } while(!rc); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 9fd67026d91a..f687e56c24bc 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1271,16 +1271,15 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); * -EFAULT for an untranslatable guest physical address. * * This primes a gfn_to_pfn_cache and links it into the @kvm's list for - * invalidations to be processed. Callers are required to use - * kvm_gfn_to_pfn_cache_check() to ensure that the cache is valid before - * accessing the target page. + * invalidations to be processed. 
Callers are required to use kvm_gpc_check() + * to ensure that the cache is valid before accessing the target page. */ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, gpa_t gpa, unsigned long len); /** - * kvm_gfn_to_pfn_cache_check - check validity of a gfn_to_pfn_cache. + * kvm_gpc_check - check validity of a gfn_to_pfn_cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. @@ -1297,11 +1296,11 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, * Callers in IN_GUEST_MODE may do so without locking, although they should * still hold a read lock on kvm->scru for the memslot checks. */ -bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len); +bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len); /** - * kvm_gfn_to_pfn_cache_refresh - update a previously initialized cache. + * kvm_gpc_refresh - update a previously initialized cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. @@ -1318,11 +1317,11 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, * still lock and check the cache status, as this function does not return * with the lock still held to permit access. */ -int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len); +int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len); /** - * kvm_gfn_to_pfn_cache_unmap - temporarily unmap a gfn_to_pfn_cache. + * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache. * * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. 
@@ -1331,7 +1330,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, * but at least the mapping from GPA to userspace HVA will remain cached * and can be reused on a subsequent refresh. */ -void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); +void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); /** * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache. diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 08f97cf97264..cc65fab0dbef 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -76,8 +76,8 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start, } } -bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len) +bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len) { struct kvm_memslots *slots = kvm_memslots(kvm); @@ -93,7 +93,7 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, return true; } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_check); +EXPORT_SYMBOL_GPL(kvm_gpc_check); static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva) { @@ -235,8 +235,8 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) return -EFAULT; } -int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - gpa_t gpa, unsigned long len) +int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, + unsigned long len) { struct kvm_memslots *slots = kvm_memslots(kvm); unsigned long page_offset = gpa & ~PAGE_MASK; @@ -317,9 +317,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, return ret; } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_refresh); +EXPORT_SYMBOL_GPL(kvm_gpc_refresh); -void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) { void 
*old_khva; kvm_pfn_t old_pfn; @@ -344,7 +344,7 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) gpc_unmap_khva(kvm, old_pfn, old_khva); } -EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap); +EXPORT_SYMBOL_GPL(kvm_gpc_unmap); void kvm_gpc_init(struct gfn_to_pfn_cache *gpc) { @@ -372,7 +372,7 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, list_add(&gpc->list, &kvm->gpc_list); spin_unlock(&kvm->gpc_lock); } - return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len); + return kvm_gpc_refresh(kvm, gpc, gpa, len); } EXPORT_SYMBOL_GPL(kvm_gpc_activate); @@ -383,7 +383,7 @@ void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) list_del(&gpc->list); spin_unlock(&kvm->gpc_lock); - kvm_gfn_to_pfn_cache_unmap(kvm, gpc); + kvm_gpc_unmap(kvm, gpc); gpc->active = false; } } -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() 2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj ` (4 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj First argument is never used, remove it. Signed-off-by: Michal Luczaj <mhal@rbox.co> --- virt/kvm/pfncache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index cc65fab0dbef..76f1b669cf28 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -95,7 +95,7 @@ bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, } EXPORT_SYMBOL_GPL(kvm_gpc_check); -static void gpc_unmap_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva) +static void gpc_unmap_khva(kvm_pfn_t pfn, void *khva) { /* Unmap the old pfn/page if it was mapped before. */ if (!is_error_noslot_pfn(pfn) && khva) { @@ -174,7 +174,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) * the existing mapping and didn't create a new one. 
*/ if (new_khva != old_khva) - gpc_unmap_khva(kvm, new_pfn, new_khva); + gpc_unmap_khva(new_pfn, new_khva); kvm_release_pfn_clean(new_pfn); @@ -313,7 +313,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, mutex_unlock(&gpc->refresh_lock); if (old_pfn != new_pfn) - gpc_unmap_khva(kvm, old_pfn, old_khva); + gpc_unmap_khva(old_pfn, old_khva); return ret; } @@ -342,7 +342,7 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) write_unlock_irq(&gpc->lock); mutex_unlock(&gpc->refresh_lock); - gpc_unmap_khva(kvm, old_pfn, old_khva); + gpc_unmap_khva(old_pfn, old_khva); } EXPORT_SYMBOL_GPL(kvm_gpc_unmap); -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties 2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (2 preceding siblings ...) 2022-10-05 12:30 ` [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj ` (3 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Move the assignment of immutable properties @kvm, @vcpu, @usage, @len to the initializer. Make _activate() and _deactivate() use stored values. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 16 ++++++------- arch/x86/kvm/xen.c | 50 ++++++++++++++++++--------------------- include/linux/kvm_host.h | 40 +++++++++++++++---------------- include/linux/kvm_types.h | 2 ++ virt/kvm/pfncache.c | 36 +++++++++++++++------------- 5 files changed, 72 insertions(+), 72 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 45136ce7185e..ed8e4f8c9cf0 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2300,13 +2300,10 @@ static void kvm_write_system_time(struct kvm_vcpu *vcpu, gpa_t system_time, kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu); /* we verify if the enable bit is set... 
*/ - if (system_time & 1) { - kvm_gpc_activate(vcpu->kvm, &vcpu->arch.pv_time, vcpu, - KVM_HOST_USES_PFN, system_time & ~1ULL, - sizeof(struct pvclock_vcpu_time_info)); - } else { - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time); - } + if (system_time & 1) + kvm_gpc_activate(&vcpu->arch.pv_time, system_time & ~1ULL); + else + kvm_gpc_deactivate(&vcpu->arch.pv_time); return; } @@ -3374,7 +3371,7 @@ static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data) static void kvmclock_reset(struct kvm_vcpu *vcpu) { - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.pv_time); + kvm_gpc_deactivate(&vcpu->arch.pv_time); vcpu->arch.time = 0; } @@ -11551,7 +11548,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.regs_avail = ~0; vcpu->arch.regs_dirty = ~0; - kvm_gpc_init(&vcpu->arch.pv_time); + kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN, + sizeof(struct pvclock_vcpu_time_info)); if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu)) vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 361f77dc7a3d..9b4b0e6e66e5 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -42,13 +42,12 @@ static int kvm_xen_shared_info_init(struct kvm *kvm, gfn_t gfn) int idx = srcu_read_lock(&kvm->srcu); if (gfn == GPA_INVALID) { - kvm_gpc_deactivate(kvm, gpc); + kvm_gpc_deactivate(gpc); goto out; } do { - ret = kvm_gpc_activate(kvm, gpc, NULL, KVM_HOST_USES_PFN, gpa, - PAGE_SIZE); + ret = kvm_gpc_activate(gpc, gpa); if (ret) goto out; @@ -553,15 +552,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) offsetof(struct compat_vcpu_info, time)); if (data->u.gpa == GPA_INVALID) { - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache); r = 0; break; } - r = kvm_gpc_activate(vcpu->kvm, - &vcpu->arch.xen.vcpu_info_cache, NULL, - KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct vcpu_info)); + r = 
kvm_gpc_activate(&vcpu->arch.xen.vcpu_info_cache, + data->u.gpa); if (!r) kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); @@ -569,16 +566,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) case KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO: if (data->u.gpa == GPA_INVALID) { - kvm_gpc_deactivate(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache); r = 0; break; } - r = kvm_gpc_activate(vcpu->kvm, - &vcpu->arch.xen.vcpu_time_info_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct pvclock_vcpu_time_info)); + r = kvm_gpc_activate(&vcpu->arch.xen.vcpu_time_info_cache, + data->u.gpa); if (!r) kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); break; @@ -589,15 +583,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data) break; } if (data->u.gpa == GPA_INVALID) { - kvm_gpc_deactivate(vcpu->kvm, - &vcpu->arch.xen.runstate_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache); r = 0; break; } - r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache, - NULL, KVM_HOST_USES_PFN, data->u.gpa, - sizeof(struct vcpu_runstate_info)); + r = kvm_gpc_activate(&vcpu->arch.xen.runstate_cache, + data->u.gpa); break; case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT: @@ -1818,9 +1810,12 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu) timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0); - kvm_gpc_init(&vcpu->arch.xen.runstate_cache); - kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache); - kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL, + KVM_HOST_USES_PFN, sizeof(struct vcpu_runstate_info)); + kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL, + KVM_HOST_USES_PFN, sizeof(struct vcpu_info)); + kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache, vcpu->kvm, NULL, + KVM_HOST_USES_PFN, sizeof(struct pvclock_vcpu_time_info)); } void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) @@ 
-1828,9 +1823,9 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) if (kvm_xen_timer_enabled(vcpu)) kvm_xen_stop_timer(vcpu); - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.runstate_cache); - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache); - kvm_gpc_deactivate(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.runstate_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_info_cache); + kvm_gpc_deactivate(&vcpu->arch.xen.vcpu_time_info_cache); del_timer_sync(&vcpu->arch.xen.poll_timer); } @@ -1838,7 +1833,8 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu) void kvm_xen_init_vm(struct kvm *kvm) { idr_init(&kvm->arch.xen.evtchn_ports); - kvm_gpc_init(&kvm->arch.xen.shinfo_cache); + kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL, KVM_HOST_USES_PFN, + PAGE_SIZE); } void kvm_xen_destroy_vm(struct kvm *kvm) @@ -1846,7 +1842,7 @@ void kvm_xen_destroy_vm(struct kvm *kvm) struct evtchnfd *evtchnfd; int i; - kvm_gpc_deactivate(kvm, &kvm->arch.xen.shinfo_cache); + kvm_gpc_deactivate(&kvm->arch.xen.shinfo_cache); idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) { if (!evtchnfd->deliver.port.port) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f687e56c24bc..024b8df5302c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1244,17 +1244,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn); * kvm_gpc_init - initialize gfn_to_pfn_cache. * * @gpc: struct gfn_to_pfn_cache object. - * - * This sets up a gfn_to_pfn_cache by initializing locks. - */ -void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); - -/** - * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest - * physical address. - * * @kvm: pointer to kvm instance. - * @gpc: struct gfn_to_pfn_cache object. * @vcpu: vCPU to be used for marking pages dirty and to be woken on * invalidation. 
* @usage: indicates if the resulting host physical PFN is used while @@ -1263,20 +1253,31 @@ void kvm_gpc_init(struct gfn_to_pfn_cache *gpc); * changes!---will also force @vcpu to exit the guest and * refresh the cache); and/or if the PFN used directly * by KVM (and thus needs a kernel virtual mapping). - * @gpa: guest physical address to map. * @len: sanity check; the range being access must fit a single page. * + * This sets up a gfn_to_pfn_cache by initializing locks and assigning the + * immutable attributes. + */ +void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm, + struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, + unsigned long len); + +/** + * kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest + * physical address. + * + * @gpc: struct gfn_to_pfn_cache object. + * @gpa: guest physical address to map. + * * @return: 0 for success. * -EINVAL for a mapping which would cross a page boundary. - * -EFAULT for an untranslatable guest physical address. + * -EFAULT for an untranslatable guest physical address. * - * This primes a gfn_to_pfn_cache and links it into the @kvm's list for + * This primes a gfn_to_pfn_cache and links it into the @gpc->kvm's list for * invalidations to be processed. Callers are required to use kvm_gpc_check() * to ensure that the cache is valid before accessing the target page. */ -int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, - gpa_t gpa, unsigned long len); +int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa); /** * kvm_gpc_check - check validity of a gfn_to_pfn_cache. @@ -1335,13 +1336,12 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); /** * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache. * - * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. * - * This removes a cache from the @kvm's list to be processed on MMU notifier - * invocation. 
+ * This removes a cache from the @gpc->kvm's list to be processed on MMU + * notifier invocation. */ -void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc); +void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc); void kvm_sigset_activate(struct kvm_vcpu *vcpu); void kvm_sigset_deactivate(struct kvm_vcpu *vcpu); diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 3ca3db020e0e..d66b276d29e0 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -67,12 +67,14 @@ struct gfn_to_pfn_cache { gpa_t gpa; unsigned long uhva; struct kvm_memory_slot *memslot; + struct kvm *kvm; struct kvm_vcpu *vcpu; struct list_head list; rwlock_t lock; struct mutex refresh_lock; void *khva; kvm_pfn_t pfn; + unsigned long len; enum pfn_cache_usage usage; bool active; bool valid; diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 76f1b669cf28..56ca0e9c6ed7 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -346,44 +346,48 @@ void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) } EXPORT_SYMBOL_GPL(kvm_gpc_unmap); -void kvm_gpc_init(struct gfn_to_pfn_cache *gpc) +void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm, + struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, + unsigned long len) { + WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage); + WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu); + rwlock_init(&gpc->lock); mutex_init(&gpc->refresh_lock); + + gpc->kvm = kvm; + gpc->vcpu = vcpu; + gpc->usage = usage; + gpc->len = len; } EXPORT_SYMBOL_GPL(kvm_gpc_init); -int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, - struct kvm_vcpu *vcpu, enum pfn_cache_usage usage, - gpa_t gpa, unsigned long len) +int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa) { - WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage); - if (!gpc->active) { gpc->khva = NULL; gpc->pfn = KVM_PFN_ERR_FAULT; gpc->uhva = KVM_HVA_ERR_BAD; - gpc->vcpu = vcpu; - 
gpc->usage = usage; gpc->valid = false; gpc->active = true; - spin_lock(&kvm->gpc_lock); - list_add(&gpc->list, &kvm->gpc_list); - spin_unlock(&kvm->gpc_lock); + spin_lock(&gpc->kvm->gpc_lock); + list_add(&gpc->list, &gpc->kvm->gpc_list); + spin_unlock(&gpc->kvm->gpc_lock); } - return kvm_gpc_refresh(kvm, gpc, gpa, len); + return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len); } EXPORT_SYMBOL_GPL(kvm_gpc_activate); -void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc) { if (gpc->active) { - spin_lock(&kvm->gpc_lock); + spin_lock(&gpc->kvm->gpc_lock); list_del(&gpc->list); - spin_unlock(&kvm->gpc_lock); + spin_unlock(&gpc->kvm->gpc_lock); - kvm_gpc_unmap(kvm, gpc); + kvm_gpc_unmap(gpc->kvm, gpc); gpc->active = false; } } -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check() 2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (3 preceding siblings ...) 2022-10-05 12:30 ` [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj ` (2 subsequent siblings) 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Make kvm_gpc_check() use kvm instance cached in gfn_to_pfn_cache. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- arch/x86/kvm/x86.c | 2 +- arch/x86/kvm/xen.c | 14 ++++++-------- include/linux/kvm_host.h | 4 +--- virt/kvm/pfncache.c | 5 ++--- 4 files changed, 10 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ed8e4f8c9cf0..273f1ed3b5ae 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3017,7 +3017,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v, unsigned long flags; read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, + while (!kvm_gpc_check(gpc, gpc->gpa, offset + sizeof(*guest_hv_clock))) { read_unlock_irqrestore(&gpc->lock, flags); diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index 9b4b0e6e66e5..84b95258773a 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -217,7 +217,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) user_len = sizeof(struct compat_vcpu_runstate_info); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, user_len)) { + while (!kvm_gpc_check(gpc, gpc->gpa, user_len)) { read_unlock_irqrestore(&gpc->lock, flags); /* When invoked from kvm_sched_out() we cannot sleep */ @@ -350,8 +350,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v) * little more honest 
about it. */ read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, @@ -415,8 +414,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v) sizeof_field(struct compat_vcpu_info, evtchn_upcall_pending)); read_lock_irqsave(&gpc->lock, flags); - while (!kvm_gpc_check(v->kvm, gpc, gpc->gpa, - sizeof(struct vcpu_info))) { + while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) { read_unlock_irqrestore(&gpc->lock, flags); /* @@ -957,7 +955,7 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports, read_lock_irqsave(&gpc->lock, flags); idx = srcu_read_lock(&kvm->srcu); - if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; ret = false; @@ -1349,7 +1347,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) idx = srcu_read_lock(&kvm->srcu); read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gpc_check(kvm, gpc, gpc->gpa, PAGE_SIZE)) + if (!kvm_gpc_check(gpc, gpc->gpa, PAGE_SIZE)) goto out_rcu; if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) { @@ -1383,7 +1381,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm) gpc = &vcpu->arch.xen.vcpu_info_cache; read_lock_irqsave(&gpc->lock, flags); - if (!kvm_gpc_check(kvm, gpc, gpc->gpa, sizeof(struct vcpu_info))) { + if (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) { /* * Could not access the vcpu_info. Set the bit in-kernel * and prod the vCPU to deliver it for itself. diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 024b8df5302c..0b66d4889ec3 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1282,7 +1282,6 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa); /** * kvm_gpc_check - check validity of a gfn_to_pfn_cache. 
* - * @kvm: pointer to kvm instance. * @gpc: struct gfn_to_pfn_cache object. * @gpa: current guest physical address to map. * @len: sanity check; the range being access must fit a single page. @@ -1297,8 +1296,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa); * Callers in IN_GUEST_MODE may do so without locking, although they should * still hold a read lock on kvm->scru for the memslot checks. */ -bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, - unsigned long len); +bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len); /** * kvm_gpc_refresh - update a previously initialized cache. diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index 56ca0e9c6ed7..eb91025d7242 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -76,10 +76,9 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start, } } -bool kvm_gpc_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, - unsigned long len) +bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len) { - struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memslots *slots = kvm_memslots(gpc->kvm); if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE) return false; -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry() 2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj ` (4 preceding siblings ...) 2022-10-05 12:30 ` [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj @ 2022-10-05 12:30 ` Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj 2022-10-05 12:30 ` [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj 7 siblings, 0 replies; 34+ messages in thread From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw) To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj Make hva_to_pfn_retry() use kvm instance cached in gfn_to_pfn_cache. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> --- virt/kvm/pfncache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index eb91025d7242..a2c95e393e34 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -135,7 +135,7 @@ static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_s return kvm->mmu_invalidate_seq != mmu_seq; } -static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) +static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc) { /* Note, the new page offset may be different than the old! */ void *old_khva = gpc->khva - offset_in_page(gpc->khva); @@ -155,7 +155,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) gpc->valid = false; do { - mmu_seq = kvm->mmu_invalidate_seq; + mmu_seq = gpc->kvm->mmu_invalidate_seq; smp_rmb(); write_unlock_irq(&gpc->lock); @@ -213,7 +213,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) * attempting to refresh. 
*/ WARN_ON_ONCE(gpc->valid); - } while (mmu_notifier_retry_cache(kvm, mmu_seq)); + } while (mmu_notifier_retry_cache(gpc->kvm, mmu_seq)); gpc->valid = true; gpc->pfn = new_pfn; @@ -285,7 +285,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa, * drop the lock and do the HVA to PFN lookup again. */ if (!gpc->valid || old_uhva != gpc->uhva) { - ret = hva_to_pfn_retry(kvm, gpc); + ret = hva_to_pfn_retry(gpc); } else { /* If the HVA→PFN mapping was already valid, don't unmap it. */ old_pfn = KVM_PFN_ERR_FAULT; -- 2.37.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap()
  2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  ` (5 preceding siblings ...)
  2022-10-05 12:30 ` [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
@ 2022-10-05 12:30 ` Michal Luczaj
  2022-10-05 12:30 ` [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

Make kvm_gpc_refresh() use the kvm instance cached in gfn_to_pfn_cache.
The first argument of kvm_gpc_unmap() then becomes unneeded; remove it
from the function definition.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 arch/x86/kvm/x86.c       |  2 +-
 arch/x86/kvm/xen.c       | 10 ++++------
 include/linux/kvm_host.h | 10 ++++------
 virt/kvm/pfncache.c      | 11 +++++------
 4 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 273f1ed3b5ae..d24d1731d2bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3021,7 +3021,7 @@ static void kvm_setup_guest_pvclock(struct kvm_vcpu *v,
 			   offset + sizeof(*guest_hv_clock))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
+		if (kvm_gpc_refresh(gpc, gpc->gpa,
 				    offset + sizeof(*guest_hv_clock)))
 			return;
 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 84b95258773a..bcaaa83290fb 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -224,7 +224,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
 	if (state == RUNSTATE_runnable)
 		return;
 
-	if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa, user_len))
+	if (kvm_gpc_refresh(gpc, gpc->gpa, user_len))
 		return;
 
 	read_lock_irqsave(&gpc->lock, flags);
@@ -353,8 +353,7 @@ void kvm_xen_inject_pending_events(struct kvm_vcpu *v)
 	while (!kvm_gpc_check(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 		read_unlock_irqrestore(&gpc->lock, flags);
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
-				    sizeof(struct vcpu_info)))
+		if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info)))
 			return;
 
 		read_lock_irqsave(&gpc->lock, flags);
@@ -428,8 +427,7 @@ int __kvm_xen_has_interrupt(struct kvm_vcpu *v)
 		if (in_atomic() || !task_is_running(current))
 			return 1;
 
-		if (kvm_gpc_refresh(v->kvm, gpc, gpc->gpa,
-				    sizeof(struct vcpu_info))) {
+		if (kvm_gpc_refresh(gpc, gpc->gpa, sizeof(struct vcpu_info))) {
 			/*
 			 * If this failed, userspace has screwed up the
 			 * vcpu_info mapping. No interrupts for you.
@@ -1479,7 +1477,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 			break;
 
 		idx = srcu_read_lock(&kvm->srcu);
-		rc = kvm_gpc_refresh(kvm, gpc, gpc->gpa, PAGE_SIZE);
+		rc = kvm_gpc_refresh(gpc, gpc->gpa, PAGE_SIZE);
 		srcu_read_unlock(&kvm->srcu, idx);
 	} while(!rc);
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0b66d4889ec3..efead11bec84 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1301,35 +1301,33 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len);
 /**
  * kvm_gpc_refresh - update a previously initialized cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  * @gpa:	   updated guest physical address to map.
  * @len:	   sanity check; the range being access must fit a single page.
  *
  * @return:	   0 for success.
  *		   -EINVAL for a mapping which would cross a page boundary.
- *                 -EFAULT for an untranslatable guest physical address.
+ *		   -EFAULT for an untranslatable guest physical address.
  *
  * This will attempt to refresh a gfn_to_pfn_cache. Note that a successful
- * returm from this function does not mean the page can be immediately
+ * return from this function does not mean the page can be immediately
  * accessed because it may have raced with an invalidation. Callers must
 * still lock and check the cache status, as this function does not return
 * with the lock still held to permit access.
 */
-int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
+int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 		    unsigned long len);
 
 /**
  * kvm_gpc_unmap - temporarily unmap a gfn_to_pfn_cache.
  *
- * @kvm:	   pointer to kvm instance.
  * @gpc:	   struct gfn_to_pfn_cache object.
  *
  * This unmaps the referenced page. The cache is left in the invalid state
 * but at least the mapping from GPA to userspace HVA will remain cached
 * and can be reused on a subsequent refresh.
 */
-void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc);
+void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc);
 
 /**
  * kvm_gpc_deactivate - deactivate and unlink a gfn_to_pfn_cache.
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index a2c95e393e34..45b9b96c0ea3 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -234,10 +234,9 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 	return -EFAULT;
 }
 
-int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
-		    unsigned long len)
+int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned long len)
 {
-	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memslots *slots = kvm_memslots(gpc->kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
 	kvm_pfn_t old_pfn, new_pfn;
 	unsigned long old_uhva;
@@ -318,7 +317,7 @@ int kvm_gpc_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_refresh);
 
-void kvm_gpc_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+void kvm_gpc_unmap(struct gfn_to_pfn_cache *gpc)
 {
 	void *old_khva;
 	kvm_pfn_t old_pfn;
@@ -375,7 +374,7 @@ int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
 		spin_unlock(&gpc->kvm->gpc_lock);
 	}
-	return kvm_gpc_refresh(gpc->kvm, gpc, gpa, gpc->len);
+	return kvm_gpc_refresh(gpc, gpa, gpc->len);
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 
@@ -386,7 +385,7 @@ void kvm_gpc_deactivate(struct gfn_to_pfn_cache *gpc)
 		list_del(&gpc->list);
 		spin_unlock(&gpc->kvm->gpc_lock);
 
-		kvm_gpc_unmap(gpc->kvm, gpc);
+		kvm_gpc_unmap(gpc);
 		gpc->active = false;
 	}
 }
--
2.37.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread
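The cleanup in this patch is an instance of a common refactoring: store an owner back-pointer in the cache object once at init time, so every later operation can reach the owner through the object instead of threading it through each call. A minimal sketch of the before/after shape, with `vm` and `cache` as invented stand-ins for `struct kvm` and `struct gfn_to_pfn_cache`:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins: a "vm" owning a cache, as in gfn_to_pfn_cache. */
struct vm {
	int nr_refreshes;
};

struct cache {
	struct vm *vm;		/* back-pointer stored once at init time */
	unsigned long gpa;
};

/*
 * Old style forced every call site to pass the owner explicitly:
 *     int cache_refresh(struct vm *vm, struct cache *c, unsigned long gpa);
 * New style: the owner comes from the cache itself.
 */
static void cache_init(struct cache *c, struct vm *vm)
{
	c->vm = vm;
	c->gpa = 0;
}

static int cache_refresh(struct cache *c, unsigned long gpa)
{
	c->gpa = gpa;
	c->vm->nr_refreshes++;	/* owner reached via c->vm, no extra arg */
	return 0;
}
```

Besides shrinking signatures, this removes a class of bugs where a caller could pass a kvm instance that does not match the one the cache was initialized with.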
* [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast()
  2022-10-05 12:30 ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
  ` (6 preceding siblings ...)
  2022-10-05 12:30 ` [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
@ 2022-10-05 12:30 ` Michal Luczaj
  7 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-10-05 12:30 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, Michal Luczaj

There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate()
resulting in a near-NULL pointer write.

1. Deactivate shinfo cache:

kvm_xen_hvm_set_attr
case KVM_XEN_ATTR_TYPE_SHARED_INFO
 kvm_gpc_deactivate
  kvm_gpc_unmap
   gpc->valid = false
   gpc->khva = NULL
  gpc->active = false

Result: active = false, valid = false

2. Cause cache refresh:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    kvm_gpc_check
     return -EWOULDBLOCK because !gpc->valid
   kvm_xen_set_evtchn_fast
    return -EWOULDBLOCK
   kvm_gpc_refresh
    hva_to_pfn_retry
     gpc->valid = true
     gpc->khva = ~0

Result: active = false, valid = true

3. Race ioctl KVM_XEN_HVM_EVTCHN_SEND against ioctl
KVM_XEN_ATTR_TYPE_SHARED_INFO:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    read_lock gpc->lock
                               kvm_xen_hvm_set_attr
                               case KVM_XEN_ATTR_TYPE_SHARED_INFO
                                mutex_lock kvm->lock
                                kvm_xen_shared_info_init
                                 kvm_gpc_activate
                                  gpc->khva = NULL
    kvm_gpc_check
    [ Check passes because gpc->valid is
      still true, even though gpc->khva
      is already NULL. ]
    shinfo = gpc->khva
    pending_bits = shinfo->evtchn_pending
    CRASH: test_and_set_bit(..., pending_bits)

Protect the cache-property writes in kvm_gpc_activate() by taking
gpc->lock for write.
Signed-off-by: Michal Luczaj <mhal@rbox.co> --- Attaching more details: [ 86.127703] BUG: kernel NULL pointer dereference, address: 0000000000000800 [ 86.127751] #PF: supervisor write access in kernel mode [ 86.127778] #PF: error_code(0x0002) - not-present page [ 86.127801] PGD 105792067 P4D 105792067 PUD 105793067 PMD 0 [ 86.127826] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 86.127850] CPU: 0 PID: 945 Comm: xen_shinfo_test Not tainted 6.0.0-rc5-test+ #31 [ 86.127874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014 [ 86.127898] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm [ 86.127960] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06 All code ======== 0: 5a pop %rdx 1: 84 c0 test %al,%al 3: 0f 84 01 01 00 00 je 0x10a 9: 80 bb b4 9f 00 00 00 cmpb $0x0,0x9fb4(%rbx) 10: 8b 45 00 mov 0x0(%rbp),%eax 13: 48 8b 93 c8 a0 00 00 mov 0xa0c8(%rbx),%rdx 1a: 75 4d jne 0x69 1c: 41 89 c0 mov %eax,%r8d 1f: 48 8d b2 80 08 00 00 lea 0x880(%rdx),%rsi 26: 41 c1 e8 05 shr $0x5,%r8d 2a:* f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) <-- trapping instruction 31: 00 00 33: 0f 82 19 01 00 00 jb 0x152 39: 8b 45 00 mov 0x0(%rbp),%eax 3c: 48 0f a3 06 bt %rax,(%rsi) Code starting with the faulting instruction =========================================== 0: f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) 7: 00 00 9: 0f 82 19 01 00 00 jb 0x128 f: 8b 45 00 mov 0x0(%rbp),%eax 12: 48 0f a3 06 bt %rax,(%rsi) [ 86.127982] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046 [ 86.128001] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001 [ 86.128021] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66 [ 86.128040] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97 [ 86.128060] 
R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001 [ 86.128079] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830 [ 86.128098] FS: 00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 [ 86.128118] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.128138] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0 [ 86.128158] PKRU: 55555554 [ 86.128177] Call Trace: [ 86.128196] <TASK> [ 86.128215] kvm_xen_hvm_evtchn_send (arch/x86/kvm/xen.c:1432 arch/x86/kvm/xen.c:1562) kvm [ 86.128256] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm [ 86.128294] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 86.128315] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 86.128335] ? kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 86.128368] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 86.128401] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 86.128422] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688) [ 86.128442] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856) [ 86.128462] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 86.128482] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128501] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128520] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 86.128539] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128558] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128577] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128596] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128615] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128634] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128653] ? 
lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 86.128673] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 86.128692] RIP: 0033:0x7f71d6152c6b [ 86.128712] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d b5 b1 1b 00 mov 0x1bb1b5(%rip),%rcx # 0x1bb1bf a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 10 00 00 00 mov $0x10,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb1bf 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb195 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 86.128735] RSP: 002b:00007fff7c716c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 86.128755] RAX: ffffffffffffffda RBX: 00007f71d61116c0 RCX: 00007f71d6152c6b [ 86.128775] RDX: 00007fff7c716f00 RSI: 00000000400caed0 RDI: 0000000000000004 [ 86.128794] RBP: 0000000001f2b2a0 R08: 0000000000417343 R09: 00007f71d62cc341 [ 86.128814] R10: 00007fff7c74c258 R11: 0000000000000246 R12: 00007f71d6329000 [ 86.128833] R13: 00000000632870b9 R14: 0000000000000001 R15: 00007f71d632a020 [ 86.128854] </TASK> [ 86.128873] Modules linked in: kvm_intel 9p fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat 
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr sunrpc intel_rapl_msr intel_rapl_common kvm irqbypass rapl pcspkr 9pnet_virtio i2c_piix4 9pnet drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_console serio_raw ata_generic virtio_blk pata_acpi qemu_fw_cfg ipmi_devintf ipmi_msghandler fuse [ 86.128893] Unloaded tainted modules: kvm_intel():1 [last unloaded: kvm_intel] [ 86.128944] CR2: 0000000000000800 [ 86.128962] ---[ end trace 0000000000000000 ]--- [ 86.128982] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm [ 86.129022] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06 All code ======== 0: 5a pop %rdx 1: 84 c0 test %al,%al 3: 0f 84 01 01 00 00 je 0x10a 9: 80 bb b4 9f 00 00 00 cmpb $0x0,0x9fb4(%rbx) 10: 8b 45 00 mov 0x0(%rbp),%eax 13: 48 8b 93 c8 a0 00 00 mov 0xa0c8(%rbx),%rdx 1a: 75 4d jne 0x69 1c: 41 89 c0 mov %eax,%r8d 1f: 48 8d b2 80 08 00 00 lea 0x880(%rdx),%rsi 26: 41 c1 e8 05 shr $0x5,%r8d 2a:* f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) <-- trapping instruction 31: 00 00 33: 0f 82 19 01 00 00 jb 0x152 39: 8b 45 00 mov 0x0(%rbp),%eax 3c: 48 0f a3 06 bt %rax,(%rsi) Code starting with the faulting instruction =========================================== 0: f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) 7: 00 00 9: 0f 82 19 01 00 00 jb 0x128 f: 8b 45 00 mov 0x0(%rbp),%eax 12: 48 0f a3 06 bt %rax,(%rsi) [ 86.129044] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046 [ 86.129064] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001 [ 86.129083] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66 [ 86.129103] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97 [ 86.129122] R10: ffff88810c150000 R11: 
0000000076cc00c9 R12: 0000000000000001 [ 86.129142] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830 [ 86.129161] FS: 00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 [ 86.129181] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.129201] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0 [ 86.129221] PKRU: 55555554 [ 86.129240] note: xen_shinfo_test[945] exited with preempt_count 1 [ 151.131754] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 151.131793] rcu: 1-...0: (13 ticks this GP) idle=785c/1/0x4000000000000000 softirq=4758/4760 fqs=16038 [ 151.131816] (detected by 2, t=65002 jiffies, g=6449, q=1299 ncpus=4) [ 151.131837] Sending NMI from CPU 2 to CPUs 1: [ 151.131862] NMI backtrace for cpu 1 [ 151.131864] CPU: 1 PID: 949 Comm: xen_shinfo_test Tainted: G D 6.0.0-rc5-test+ #31 [ 151.131866] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014 [ 151.131866] RIP: 0010:queued_write_lock_slowpath (kernel/locking/qrwlock.c:85) [ 151.131870] Code: ff 90 48 89 df 5b 5d e9 86 fd ff ff f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 8b 03 <3d> 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 48 89 ef All code ======== 0: ff 90 48 89 df 5b call *0x5bdf8948(%rax) 6: 5d pop %rbp 7: e9 86 fd ff ff jmp 0xfffffffffffffd92 c: f0 81 0b 00 01 00 00 lock orl $0x100,(%rbx) 13: ba ff 00 00 00 mov $0xff,%edx 18: b9 00 01 00 00 mov $0x100,%ecx 1d: 8b 03 mov (%rbx),%eax 1f: 3d 00 01 00 00 cmp $0x100,%eax 24: 74 0b je 0x31 26: f3 90 pause 28: 8b 03 mov (%rbx),%eax 2a:* 3d 00 01 00 00 cmp $0x100,%eax <-- trapping instruction 2f: 75 f5 jne 0x26 31: 89 c8 mov %ecx,%eax 33: f0 0f b1 13 lock cmpxchg %edx,(%rbx) 37: 74 c0 je 0xfffffffffffffff9 39: eb e2 jmp 0x1d 3b: 89 c6 mov %eax,%esi 3d: 48 89 ef mov %rbp,%rdi Code starting with the faulting instruction =========================================== 0: 3d 00 01 00 00 cmp $0x100,%eax 5: 75 f5 jne 
0xfffffffffffffffc 7: 89 c8 mov %ecx,%eax 9: f0 0f b1 13 lock cmpxchg %edx,(%rbx) d: 74 c0 je 0xffffffffffffffcf f: eb e2 jmp 0xfffffffffffffff3 11: 89 c6 mov %eax,%esi 13: 48 89 ef mov %rbp,%rdi [ 151.131871] RSP: 0018:ffffc900011f7b98 EFLAGS: 00000006 [ 151.131872] RAX: 0000000000000300 RBX: ffffc90001372ff8 RCX: 0000000000000100 [ 151.131872] RDX: 00000000000000ff RSI: ffffffff827ea0a3 RDI: ffffffff82886a66 [ 151.131873] RBP: ffffc90001372ffc R08: ffff888107864160 R09: 00000000777cbcf6 [ 151.131873] R10: ffff888107863300 R11: 000000006777cbcf R12: ffffc90001372ff8 [ 151.131874] R13: ffffc90001369000 R14: ffffc90001372fb8 R15: ffffc90001369170 [ 151.131874] FS: 00007f71d5f09640(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000 [ 151.131875] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.131876] CR2: 00007f71d5f08ef8 CR3: 0000000104774002 CR4: 0000000000772ee0 [ 151.131877] PKRU: 55555554 [ 151.131878] Call Trace: [ 151.131879] <TASK> [ 151.131880] do_raw_write_lock (./include/asm-generic/qrwlock.h:101 kernel/locking/spinlock_debug.c:210) [ 151.131883] kvm_gpc_refresh (arch/x86/kvm/../../../virt/kvm/pfncache.c:262) kvm [ 151.131907] ? kvm_gpc_activate (./include/linux/spinlock.h:390 arch/x86/kvm/../../../virt/kvm/pfncache.c:375) kvm [ 151.131925] kvm_xen_hvm_set_attr (arch/x86/kvm/xen.c:51 arch/x86/kvm/xen.c:464) kvm [ 151.131950] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm [ 151.131971] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 151.131973] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 151.131991] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 151.131993] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688) [ 151.131995] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856) [ 151.131997] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 151.131998] ? 
lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 151.132000] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132000] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132002] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132003] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132003] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132004] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132005] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132006] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132007] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132008] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 151.132009] RIP: 0033:0x7f71d6152c6b [ 151.132011] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d b5 b1 1b 00 mov 0x1bb1b5(%rip),%rcx # 0x1bb1bf a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 10 00 00 00 mov $0x10,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb1bf 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb195 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 151.132011] RSP: 002b:00007f71d5f08da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 151.132012] RAX: ffffffffffffffda RBX: 0000000001f2b2a0 RCX: 
00007f71d6152c6b
[  151.132013] RDX: 00007f71d5f08db0 RSI: 000000004048aec9 RDI: 0000000000000004
[  151.132013] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fff7c716b3f
[  151.132013] R10: 00007f71d611c948 R11: 0000000000000246 R12: 00007f71d5f09640
[  151.132014] R13: 0000000000000004 R14: 00007f71d61b3550 R15: 0000000000000000
[  151.132016]  </TASK>

 virt/kvm/pfncache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 45b9b96c0ea3..e987669c3506 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -364,11 +364,13 @@ EXPORT_SYMBOL_GPL(kvm_gpc_init);
 int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 {
 	if (!gpc->active) {
+		write_lock_irq(&gpc->lock);
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
 		gpc->valid = false;
 		gpc->active = true;
+		write_unlock_irq(&gpc->lock);
 
 		spin_lock(&gpc->kvm->gpc_lock);
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
--
2.37.3

^ permalink raw reply related	[flat|nested] 34+ messages in thread
* [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races
  2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
  ` (2 preceding siblings ...)
  2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
@ 2022-09-16  0:54 ` Michal Luczaj
  3 siblings, 0 replies; 34+ messages in thread
From: Michal Luczaj @ 2022-09-16  0:54 UTC (permalink / raw)
  To: kvm; +Cc: seanjc, pbonzini, shuah, Michal Luczaj

Tests for races between shinfo_cache initialization/destruction (causing
lock reinitialization) and hypercall/ioctl processing (acquiring
uninitialized lock, holding soon-to-be-corrupted lock).

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 .../selftests/kvm/x86_64/xen_shinfo_test.c    | 100 ++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 8a5cb800f50e..8e251b2bfa62 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -15,9 +15,13 @@
 #include <time.h>
 #include <sched.h>
 #include <signal.h>
+#include <pthread.h>
 
 #include <sys/eventfd.h>
 
+/* Defined in include/linux/kvm_types.h */
+#define GPA_INVALID		(~(ulong)0)
+
 #define SHINFO_REGION_GVA	0xc0000000ULL
 #define SHINFO_REGION_GPA	0xc0000000ULL
 #define SHINFO_REGION_SLOT	10
@@ -44,6 +48,8 @@
 
 #define MIN_STEAL_TIME		50000
 
+#define SHINFO_RACE_TIMEOUT	2	/* seconds */
+
 #define __HYPERVISOR_set_timer_op	15
 #define __HYPERVISOR_sched_op		29
 #define __HYPERVISOR_event_channel_op	32
@@ -325,6 +331,32 @@ static void guest_code(void)
 	guest_wait_for_irq();
 
 	GUEST_SYNC(21);
+	/* Racing host ioctls */
+
+	guest_wait_for_irq();
+
+	GUEST_SYNC(22);
+	/* Racing vmcall against host ioctl */
+
+	ports[0] = 0;
+
+	p = (struct sched_poll) {
+		.ports = ports,
+		.nr_ports = 1,
+		.timeout = 0
+	};
+
+	do {
+		asm volatile("vmcall"
+			     : "=a" (rax)
+			     : "a" (__HYPERVISOR_sched_op),
+			       "D" (SCHEDOP_poll),
+			       "S" (&p)
+			     : "memory");
+	} while (!guest_saw_irq);
+	guest_saw_irq = false;
+
+	GUEST_SYNC(23);
 }
 
 static int cmp_timespec(struct timespec *a, struct timespec *b)
@@ -352,11 +384,36 @@ static void handle_alrm(int sig)
 	TEST_FAIL("IRQ delivery timed out");
 }
 
+static void *juggle_shinfo_state(void *arg)
+{
+	struct kvm_vm *vm = (struct kvm_vm *)arg;
+
+	struct kvm_xen_hvm_attr cache_init = {
+		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+		.u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE
+	};
+
+	struct kvm_xen_hvm_attr cache_destroy = {
+		.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+		.u.shared_info.gfn = GPA_INVALID
+	};
+
+	for (;;) {
+		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_init);
+		__vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &cache_destroy);
+		pthread_testcancel();
+	};
+
+	return NULL;
+}
+
 int main(int argc, char *argv[])
 {
 	struct timespec min_ts, max_ts, vm_ts;
 	struct kvm_vm *vm;
+	pthread_t thread;
 	bool verbose;
+	int ret;
 
 	verbose = argc > 1 && (!strncmp(argv[1], "-v", 3) ||
 			       !strncmp(argv[1], "--verbose", 10));
@@ -785,6 +842,49 @@ int main(int argc, char *argv[])
 		case 21:
 			TEST_ASSERT(!evtchn_irq_expected,
 				    "Expected event channel IRQ but it didn't happen");
+			alarm(0);
+
+			if (verbose)
+				printf("Testing shinfo lock corruption (KVM_XEN_HVM_EVTCHN_SEND)\n");
+
+			ret = pthread_create(&thread, NULL, &juggle_shinfo_state, (void *)vm);
+			TEST_ASSERT(ret == 0, "pthread_create() failed: %s", strerror(ret));
+
+			struct kvm_irq_routing_xen_evtchn uxe = {
+				.port = 1,
+				.vcpu = vcpu->id,
+				.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL
+			};
+
+			evtchn_irq_expected = true;
+			for (time_t t = time(NULL) + SHINFO_RACE_TIMEOUT; time(NULL) < t;)
+				__vm_ioctl(vm, KVM_XEN_HVM_EVTCHN_SEND, &uxe);
+			break;
+
+		case 22:
+			TEST_ASSERT(!evtchn_irq_expected,
+				    "Expected event channel IRQ but it didn't happen");
+
+			if (verbose)
+				printf("Testing shinfo lock corruption (SCHEDOP_poll)\n");
+
+			shinfo->evtchn_pending[0] = 1;
+
+			evtchn_irq_expected = true;
+			tmr.u.timer.expires_ns = rs->state_entry_time +
+						 SHINFO_RACE_TIMEOUT * 1000000000ULL;
+			vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &tmr);
+			break;
+
+		case 23:
+			TEST_ASSERT(!evtchn_irq_expected,
+				    "Expected event channel IRQ but it didn't happen");
+
+			ret = pthread_cancel(thread);
+			TEST_ASSERT(ret == 0, "pthread_cancel() failed: %s", strerror(ret));
+
+			ret = pthread_join(thread, 0);
+			TEST_ASSERT(ret == 0, "pthread_join() failed: %s", strerror(ret));
 			goto done;
 
 		case 0x20:
--
2.37.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread
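The selftest's juggle_shinfo_state() worker relies on a small pthreads idiom: a tight loop that hammers shared state and polls for cancellation between iterations with pthread_testcancel(), so the main thread can stop it deterministically with pthread_cancel() plus pthread_join(). A self-contained sketch of just that pattern, with the two KVM ioctls replaced by an invented atomic counter:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stands in for the ioctl-flipping work done by juggle_shinfo_state(). */
static atomic_long iterations;

static void *juggler(void *arg)
{
	(void)arg;
	for (;;) {
		atomic_fetch_add(&iterations, 1);	/* "the two ioctls" */
		pthread_testcancel();	/* explicit cancellation point */
	}
	return NULL;	/* not reached; thread exits via cancellation */
}

/* Let the worker race for a while, then stop it cleanly. */
static long run_juggler_briefly(void)
{
	pthread_t t;

	assert(pthread_create(&t, NULL, juggler, NULL) == 0);
	while (atomic_load(&iterations) < 1000)
		sched_yield();	/* give the worker CPU time */
	assert(pthread_cancel(t) == 0);
	assert(pthread_join(t, NULL) == 0);
	return atomic_load(&iterations);
}
```

Using an explicit cancellation point matters here: the worker body itself (ioctls in the selftest, an atomic add in this sketch) should not be interrupted mid-operation, so cancellation is only honored at the pthread_testcancel() call between iterations.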
end of thread, other threads:[~2022-10-21  2:40 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-16  0:54 [RFC PATCH 0/4] KVM: x86/xen: shinfo cache lock corruption Michal Luczaj
2022-09-16  0:54 ` [RFC PATCH 1/4] KVM: x86/xen: Ensure kvm_xen_set_evtchn_fast() can use shinfo_cache Michal Luczaj
2022-09-16  0:54 ` [RFC PATCH 2/4] KVM: x86/xen: Ensure kvm_xen_schedop_poll() " Michal Luczaj
2022-09-16  0:54 ` [RFC PATCH 3/4] KVM: x86/xen: Disallow gpc locks reinitialization Michal Luczaj
2022-09-16 17:12   ` Sean Christopherson
2022-09-18 23:13     ` Michal Luczaj
2022-09-21  2:01       ` [PATCH 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
2022-09-21  2:01         ` [PATCH 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
2022-10-10 23:38           ` Sean Christopherson
2022-09-21  2:01         ` [PATCH 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
2022-09-21  2:01         ` [PATCH 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj
2022-09-21  2:01         ` [PATCH 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
2022-10-10 23:42           ` Sean Christopherson
2022-10-11  0:37             ` Sean Christopherson
2022-09-21  2:01         ` [PATCH 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj
2022-09-21  2:01         ` [PATCH 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
2022-09-21  2:01         ` [PATCH 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
2022-09-21  2:01         ` [PATCH 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
2022-10-10 23:28           ` Sean Christopherson
2022-10-13  0:22             ` Sean Christopherson
2022-10-13 20:28               ` Sean Christopherson
2022-10-20 15:59                 ` Michal Luczaj
2022-10-20 16:58                   ` Sean Christopherson
2022-10-21  2:39                     ` Michal Luczaj
2022-10-05 12:30       ` [PATCH v2 0/8] KVM: x86: gfn_to_pfn_cache cleanups and a fix Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 1/8] KVM: x86: Add initializer for gfn_to_pfn_cache Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 2/8] KVM: x86: Shorten gfn_to_pfn_cache function names Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 3/8] KVM: x86: Remove unused argument in gpc_unmap_khva() Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 4/8] KVM: x86: Store immutable gfn_to_pfn_cache properties Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 5/8] KVM: x86: Clean up kvm_gpc_check() Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 6/8] KVM: x86: Clean up hva_to_pfn_retry() Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 7/8] KVM: x86: Clean up kvm_gpc_refresh(), kvm_gpc_unmap() Michal Luczaj
2022-10-05 12:30         ` [PATCH v2 8/8] KVM: x86: Fix NULL pointer dereference in kvm_xen_set_evtchn_fast() Michal Luczaj
2022-09-16  0:54 ` [RFC PATCH 4/4] KVM: x86/xen: Test shinfo_cache lock races Michal Luczaj