On Fri, 2023-01-13 at 10:33 +0000, David Woodhouse wrote:
>
> So everything seems to be working as it should... *except* for the fact
> that I don't quite understand why xen_shinfo_test didn't trigger the
> warning. Michal, I guess you already worked that out when you came up
> with your deadlock-test instead... is there something we should add to
> xen_shinfo_test that would mean it *would* have triggered?

Got it. It only happens when kvm_xen_set_evtchn() takes the slow path
because kvm_xen_set_evtchn_fast() fails. Not utterly sure why that works
in your deadlock_test, but I can make it happen in xen_shinfo_test just
by invalidating the GPC through a memslot change:

--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -29,6 +29,9 @@
 #define DUMMY_REGION_GPA	(SHINFO_REGION_GPA + (3 * PAGE_SIZE))
 #define DUMMY_REGION_SLOT	11
 
+#define DUMMY_REGION_GPA_2	(SHINFO_REGION_GPA + (4 * PAGE_SIZE))
+#define DUMMY_REGION_SLOT_2	12
+
 #define SHINFO_ADDR	(SHINFO_REGION_GPA)
 #define VCPU_INFO_ADDR	(SHINFO_REGION_GPA + 0x40)
 #define PVTIME_ADDR	(SHINFO_REGION_GPA + PAGE_SIZE)
@@ -765,6 +768,8 @@ int main(int argc, char *argv[])
 			if (verbose)
 				printf("Testing guest EVTCHNOP_send direct to evtchn\n");
 
+			vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
+						    DUMMY_REGION_GPA_2, DUMMY_REGION_SLOT_2, 1, 0);
 			evtchn_irq_expected = true;
 			alarm(1);
 			break;

I did also need the trick below. I'll send patches properly, keeping the
fast path test and *adding* the slow one, instead of just changing it as
above.

I also validated that if I put back the evtchn_reset deadlock fix, and
the separate xen_lock which is currently at the tip of kvm/master, it
all works without complaints (or deadlocks).
> I even tried this:
>
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1173,6 +1173,16 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
>         if (init_srcu_struct(&kvm->irq_srcu))
>                 goto out_err_no_irq_srcu;
>  
> +#ifdef CONFIG_LOCKDEP
> +       /*
> +        * Ensure lockdep knows that it's not permitted to lock kvm->lock
> +        * from a SRCU read section on kvm->srcu.
> +        */
> +       mutex_lock(&kvm->lock);
> +       synchronize_srcu(&kvm->srcu);
> +       mutex_unlock(&kvm->lock);
> +#endif
> +
>         refcount_set(&kvm->users_count, 1);
>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>                 for (j = 0; j < 2; j++) {

[ 91.866348] ======================================================
[ 91.866349] WARNING: possible circular locking dependency detected
[ 91.866351] 6.2.0-rc3+ #1025 Tainted: G           OE
[ 91.866353] ------------------------------------------------------
[ 91.866354] xen_shinfo_test/2938 is trying to acquire lock:
[ 91.866356] ffffc9000179e3c0 (&kvm->lock){+.+.}-{3:3}, at: kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.866453] but task is already holding lock:
[ 91.866454] ffffc900017a7868 (&kvm->srcu){.+.+}-{0:0}, at: vcpu_enter_guest.constprop.0+0xa89/0x1270 [kvm]
[ 91.866527] which lock already depends on the new lock.
[ 91.866528] the existing dependency chain (in reverse order) is:
[ 91.866529] -> #1 (&kvm->srcu){.+.+}-{0:0}:
[ 91.866532]        __lock_acquire+0x4b4/0x940
[ 91.866537]        lock_sync+0x99/0x110
[ 91.866540]        __synchronize_srcu+0x4d/0x170
[ 91.866543]        kvm_create_vm+0x271/0x6e0 [kvm]
[ 91.866621]        kvm_dev_ioctl+0x102/0x1c0 [kvm]
[ 91.866694]        __x64_sys_ioctl+0x8a/0xc0
[ 91.866697]        do_syscall_64+0x3b/0x90
[ 91.866701]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 91.866707] -> #0 (&kvm->lock){+.+.}-{3:3}:
[ 91.866710]        check_prev_add+0x8f/0xc20
[ 91.866712]        validate_chain+0x3ba/0x450
[ 91.866714]        __lock_acquire+0x4b4/0x940
[ 91.866716]        lock_acquire.part.0+0xa8/0x210
[ 91.866717]        __mutex_lock+0x94/0x920
[ 91.866721]        kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.866790]        kvm_xen_hcall_evtchn_send.constprop.0+0x138/0x1c0 [kvm]
[ 91.866869]        kvm_xen_hypercall+0x475/0x5a0 [kvm]
[ 91.866938]        vmx_handle_exit+0xe/0x50 [kvm_intel]
[ 91.866955]        vcpu_enter_guest.constprop.0+0xb08/0x1270 [kvm]
[ 91.867034]        vcpu_run+0x1bd/0x450 [kvm]
[ 91.867100]        kvm_arch_vcpu_ioctl_run+0x1df/0x670 [kvm]
[ 91.867167]        kvm_vcpu_ioctl+0x279/0x700 [kvm]
[ 91.867229]        __x64_sys_ioctl+0x8a/0xc0
[ 91.867231]        do_syscall_64+0x3b/0x90
[ 91.867235]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 91.867238] other info that might help us debug this:
[ 91.867239] Possible unsafe locking scenario:
[ 91.867240]        CPU0                    CPU1
[ 91.867241]        ----                    ----
[ 91.867242]   lock(&kvm->srcu);
[ 91.867244]                               lock(&kvm->lock);
[ 91.867246]                               lock(&kvm->srcu);
[ 91.867248]   lock(&kvm->lock);
[ 91.867249] *** DEADLOCK ***
[ 91.867250] 2 locks held by xen_shinfo_test/2938:
[ 91.867252] #0: ffff88815a8800b0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x77/0x700 [kvm]
[ 91.867318] #1: ffffc900017a7868 (&kvm->srcu){.+.+}-{0:0}, at: vcpu_enter_guest.constprop.0+0xa89/0x1270 [kvm]
[ 91.867387] stack backtrace:
[ 91.867389] CPU: 26 PID: 2938 Comm: xen_shinfo_test Tainted: G           OE 6.2.0-rc3+ #1025
[ 91.867392] Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 91.867394] Call Trace:
[ 91.867395]  <TASK>
[ 91.867398]  dump_stack_lvl+0x56/0x73
[ 91.867403]  check_noncircular+0x102/0x120
[ 91.867409]  check_prev_add+0x8f/0xc20
[ 91.867411]  ? add_chain_cache+0x10b/0x2d0
[ 91.867415]  validate_chain+0x3ba/0x450
[ 91.867418]  __lock_acquire+0x4b4/0x940
[ 91.867421]  lock_acquire.part.0+0xa8/0x210
[ 91.867424]  ? kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.867494]  ? rcu_read_lock_sched_held+0x43/0x70
[ 91.867498]  ? lock_acquire+0x102/0x140
[ 91.867501]  __mutex_lock+0x94/0x920
[ 91.867505]  ? kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.867574]  ? find_held_lock+0x2b/0x80
[ 91.867578]  ? kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.867647]  ? __lock_release+0x5f/0x170
[ 91.867652]  ? kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.867721]  kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
[ 91.867791]  kvm_xen_hcall_evtchn_send.constprop.0+0x138/0x1c0 [kvm]
[ 91.867875]  kvm_xen_hypercall+0x475/0x5a0 [kvm]
[ 91.867947]  ? rcu_read_lock_sched_held+0x43/0x70
[ 91.867952]  vmx_handle_exit+0xe/0x50 [kvm_intel]
[ 91.867966]  vcpu_enter_guest.constprop.0+0xb08/0x1270 [kvm]
[ 91.868046]  ? lock_acquire.part.0+0xa8/0x210
[ 91.868050]  ? vcpu_run+0x1bd/0x450 [kvm]
[ 91.868117]  ? lock_acquire+0x102/0x140
[ 91.868121]  vcpu_run+0x1bd/0x450 [kvm]
[ 91.868189]  kvm_arch_vcpu_ioctl_run+0x1df/0x670 [kvm]
[ 91.868257]  kvm_vcpu_ioctl+0x279/0x700 [kvm]
[ 91.868322]  ? get_cpu_entry_area+0xb/0x30
[ 91.868327]  ? _raw_spin_unlock_irq+0x34/0x50
[ 91.868330]  ? do_setitimer+0x190/0x1e0
[ 91.868335]  __x64_sys_ioctl+0x8a/0xc0
[ 91.868338]  do_syscall_64+0x3b/0x90
[ 91.868341]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 91.868345] RIP: 0033:0x7f313103fd1b
[ 91.868348] Code: 73 01 c3 48 8b 0d 05 a1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 a0 1b 00 f7 d8 64 89 01 48
[ 91.868350] RSP: 002b:00007ffcdc02dba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 91.868353] RAX: ffffffffffffffda RBX: 00007f31313d2000 RCX: 00007f313103fd1b
[ 91.868355] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000007
[ 91.868356] RBP: 00007f313139a6c0 R08: 000000000065a2f0 R09: 0000000000000000
[ 91.868357] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000065c800
[ 91.868359] R13: 000000000000000a R14: 00007f31313d0ff1 R15: 000000000065a2a0
[ 91.868363]  </TASK>