* [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
@ 2022-04-23 10:56 syzbot
  2022-04-26 16:20 ` Maxim Levitsky
  0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2022-04-23 10:56 UTC
  To: bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel, mingo,
	pbonzini, seanjc, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

Hello,

syzbot found the following issue on:

HEAD commit:    59f0c2447e25 Merge tag 'net-5.18-rc4' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15a61430f00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6bc13fa21dd76a9b
dashboard link: https://syzkaller.appspot.com/bug?extid=a8ad3ee1525a0c4b40ec
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=134363d0f00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11ed3e34f00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a8ad3ee1525a0c4b40ec@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 1 PID: 3597 at arch/x86/kvm/mmu/tdp_mmu.c:57 kvm_mmu_uninit_tdp_mmu+0xf8/0x130 arch/x86/kvm/mmu/tdp_mmu.c:57
Modules linked in:
CPU: 1 PID: 3597 Comm: syz-executor294 Not tainted 5.18.0-rc3-syzkaller-00060-g59f0c2447e25 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:kvm_mmu_uninit_tdp_mmu+0xf8/0x130 arch/x86/kvm/mmu/tdp_mmu.c:57
Code: 83 d8 a0 00 00 48 39 c5 75 24 e8 e3 4d 5a 00 e8 9e e0 45 00 5b 5d e9 d7 4d 5a 00 e8 b2 42 a5 00 e9 3d ff ff ff e8 c8 4d 5a 00 <0f> 0b eb ad e8 bf 4d 5a 00 0f 0b eb d3 e8 c6 42 a5 00 e9 64 ff ff
RSP: 0018:ffffc90002e37c08 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffc90002cda000 RCX: 0000000000000000
RDX: ffff888023f1e180 RSI: ffffffff811e1688 RDI: 0000000000000001
RBP: ffffc90002ce40e8 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff817ead48 R11: 0000000000000000 R12: ffffc90002cda000
R13: ffffc90002e37c50 R14: 0000000000000003 R15: ffffc90002cdb240
FS:  0000000000000000(0000) GS:ffff88802cb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560ac4d0cd68 CR3: 000000000ba8e000 CR4: 0000000000152ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 kvm_arch_destroy_vm+0x350/0x470 arch/x86/kvm/x86.c:11860
 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1230 [inline]
 kvm_put_kvm+0x4fa/0xb70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1264
 kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1287
 __fput+0x277/0x9d0 fs/file_table.c:317
 task_work_run+0xdd/0x1a0 kernel/task_work.c:164
 exit_task_work include/linux/task_work.h:37 [inline]
 do_exit+0xaff/0x2a00 kernel/exit.c:795
 do_group_exit+0xd2/0x2f0 kernel/exit.c:925
 __do_sys_exit_group kernel/exit.c:936 [inline]
 __se_sys_exit_group kernel/exit.c:934 [inline]
 __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:934
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f0327505409
Code: Unable to access opcode bytes at RIP 0x7f03275053df.
RSP: 002b:00007ffc4a0be998 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f0327578350 RCX: 00007f0327505409
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 68742f636f72702f
R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0327578350
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-23 10:56 [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2) syzbot
@ 2022-04-26 16:20 ` Maxim Levitsky
  2022-04-28  7:25   ` Maxim Levitsky
  2022-04-28 15:32   ` Sean Christopherson
  0 siblings, 2 replies; 9+ messages in thread
From: Maxim Levitsky @ 2022-04-26 16:20 UTC
  To: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, pbonzini, seanjc, syzkaller-bugs, tglx, vkuznets,
	wanpengli, x86

On Sat, 2022-04-23 at 03:56 -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    59f0c2447e25 Merge tag 'net-5.18-rc4' of git://git.kernel...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15a61430f00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6bc13fa21dd76a9b
> dashboard link: https://syzkaller.appspot.com/bug?extid=a8ad3ee1525a0c4b40ec
> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=134363d0f00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11ed3e34f00000
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a8ad3ee1525a0c4b40ec@syzkaller.appspotmail.com
> 
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 3597 at arch/x86/kvm/mmu/tdp_mmu.c:57 kvm_mmu_uninit_tdp_mmu+0xf8/0x130 arch/x86/kvm/mmu/tdp_mmu.c:57
> Modules linked in:
> CPU: 1 PID: 3597 Comm: syz-executor294 Not tainted 5.18.0-rc3-syzkaller-00060-g59f0c2447e25 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> RIP: 0010:kvm_mmu_uninit_tdp_mmu+0xf8/0x130 arch/x86/kvm/mmu/tdp_mmu.c:57
> Code: 83 d8 a0 00 00 48 39 c5 75 24 e8 e3 4d 5a 00 e8 9e e0 45 00 5b 5d e9 d7 4d 5a 00 e8 b2 42 a5 00 e9 3d ff ff ff e8 c8 4d 5a 00 <0f> 0b eb ad e8 bf 4d 5a 00 0f 0b eb d3 e8 c6 42 a5 00 e9 64 ff ff
> RSP: 0018:ffffc90002e37c08 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffffc90002cda000 RCX: 0000000000000000
> RDX: ffff888023f1e180 RSI: ffffffff811e1688 RDI: 0000000000000001
> RBP: ffffc90002ce40e8 R08: 0000000000000001 R09: 0000000000000001
> R10: ffffffff817ead48 R11: 0000000000000000 R12: ffffc90002cda000
> R13: ffffc90002e37c50 R14: 0000000000000003 R15: ffffc90002cdb240
> FS:  0000000000000000(0000) GS:ffff88802cb00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000560ac4d0cd68 CR3: 000000000ba8e000 CR4: 0000000000152ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  kvm_arch_destroy_vm+0x350/0x470 arch/x86/kvm/x86.c:11860
>  kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1230 [inline]
>  kvm_put_kvm+0x4fa/0xb70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1264
>  kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1287
>  __fput+0x277/0x9d0 fs/file_table.c:317
>  task_work_run+0xdd/0x1a0 kernel/task_work.c:164
>  exit_task_work include/linux/task_work.h:37 [inline]
>  do_exit+0xaff/0x2a00 kernel/exit.c:795
>  do_group_exit+0xd2/0x2f0 kernel/exit.c:925
>  __do_sys_exit_group kernel/exit.c:936 [inline]
>  __se_sys_exit_group kernel/exit.c:934 [inline]
>  __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:934
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f0327505409
> Code: Unable to access opcode bytes at RIP 0x7f03275053df.
> RSP: 002b:00007ffc4a0be998 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> RAX: ffffffffffffffda RBX: 00007f0327578350 RCX: 00007f0327505409
> RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 68742f636f72702f
> R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0327578350
> R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
>  </TASK>
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this issue, for details see:
> https://goo.gl/tpsmEJ#testing-patches
> 

I can reproduce this in a VM by running my ipi_stress test, which I use to test AVIC,
and interrupting it with CTRL+C. The guest kernel does not have any of the relevant patches applied.


[  304.367317] WARNING: CPU: 1 PID: 5460 at arch/x86/kvm/mmu/tdp_mmu.c:57 kvm_mmu_uninit_tdp_mmu+0x55/0x60 [kvm]
[  304.368751] Modules linked in: kvm_amd(O) ccp kvm(O) irqbypass xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT uinput snd_seq_dummy snd_hrtimer ip6table_mangle ip6table_nat ip6table_filter
ip6_tables iptable_mangle iptable_nat nf_nat bridge rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace rfkill sunrpc vfat fat snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg
snd_hda_codec snd_hwdep snd_hda_core rng_core snd_seq snd_seq_device input_leds snd_pcm joydev snd_timer snd lpc_ich mfd_core virtio_input efi_pstore pcspkr rtc_cmos button sch_fq_codel ext4 mbcache
jbd2 hid_generic usbhid hid virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_blk virtio_net drm net_failover virtio_console failover
i2c_core crc32_pclmul crc32c_intel xhci_pci xhci_hcd virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring dm_mirror dm_region_hash dm_log fuse ipv6 autofs4 [last unloaded: ccp]
[  304.383703] CPU: 1 PID: 5460 Comm: CPU 6/KVM Tainted: G        W  O      5.18.0-rc4.unstable #5
[  304.384804] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[  304.385659] RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x55/0x60 [kvm]
[  304.386302] Code: 8d 83 28 a4 00 00 48 39 c2 75 1e 48 8b 83 18 a4 00 00 48 81 c3 18 a4 00 00 48 39 d8 75 11 e8 52 d5 8d e0 48 8b 5d f8 c9 c3 90 <0f> 0b 90 eb dc 90 0f 0b 90 eb e9 0f 1f 44 00 00 55 8b 05 a0 c3 d6
[  304.388246] RSP: 0018:ffffc90003503bf0 EFLAGS: 00010293
[  304.388807] RAX: ffffc900034cb428 RBX: ffffc900034c1000 RCX: 0000000000000000
[  304.389561] RDX: ffff888120446c80 RSI: ffffffff8115e949 RDI: ffffffff81a6ee08
[  304.390315] RBP: ffffc90003503bf8 R08: 0000000000000000 R09: 0000000000000000
[  304.391078] R10: 0000000000000394 R11: 0000000000000386 R12: ffffc900034c1000
[  304.391818] R13: ffffc900034c1000 R14: ffff888113d8b728 R15: dead000000000100
[  304.392573] FS:  0000000000000000(0000) GS:ffff88846ce40000(0000) knlGS:0000000000000000
[  304.393430] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  304.394050] CR2: 0000000000000000 CR3: 0000000002c21000 CR4: 0000000000350ee0
[  304.394805] Call Trace:
[  304.395086]  <TASK>
[  304.395315]  kvm_mmu_uninit_vm+0x22/0x30 [kvm]
[  304.395821]  kvm_arch_destroy_vm+0x135/0x1c0 [kvm]
[  304.396365]  kvm_destroy_vm+0x19d/0x310 [kvm]
[  304.396858]  kvm_put_kvm+0x26/0x40 [kvm]
[  304.397390]  kvm_vm_release+0x22/0x30 [kvm]
[  304.397865]  __fput+0xa5/0x270
[  304.398207]  ____fput+0xe/0x10
[  304.398532]  task_work_run+0x61/0xb0
[  304.398913]  do_exit+0x3a9/0xb60
[  304.399286]  do_group_exit+0x3b/0xc0
[  304.399676]  get_signal+0xd9c/0xde0
[  304.400062]  arch_do_signal_or_restart+0x37/0x790
[  304.400576]  ? do_futex+0x8a/0x150
[  304.400950]  exit_to_user_mode_prepare+0x152/0x240
[  304.401603]  syscall_exit_to_user_mode+0x1f/0x60
[  304.402230]  do_syscall_64+0x44/0x80
[  304.402690]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  304.408493] RIP: 0033:0x7f593c746a8a
[  304.408887] Code: Unable to access opcode bytes at RIP 0x7f593c746a60.
[  304.409589] RSP: 002b:00007f59348ed5a0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[  304.410407] RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f593c746a8a
[  304.411365] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 000055f11c7f4fe8
[  304.411366] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffff
[  304.411367] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  304.411368] R13: 000055f11c7f4fe8 R14: 0000000000000000 R15: 0000000000000000
[  304.411371]  </TASK>
[  304.414745] irq event stamp: 0
[  304.415086] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[  304.415762] hardirqs last disabled at (0): [<ffffffff811339cb>] copy_process+0x94b/0x1ec0
[  304.416628] softirqs last  enabled at (0): [<ffffffff811339cb>] copy_process+0x94b/0x1ec0
[  304.417510] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  304.418186] ---[ end trace 0000000000000000 ]---
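
The "Tainted:" field differs between the traces in this thread ("Not tainted" in the syzbot report, "G        W  O" here). As a reading aid, here is a minimal Python sketch that decodes that field. The helper name decode_taint and the table below are illustrative assumptions; the table covers only a subset of the kernel's taint letters.

```python
import re

# Subset of the kernel's taint letters, enough for the traces in this thread.
# Illustrative only; the kernel documentation has the full table.
TAINT_FLAGS = {
    "G": "no proprietary module loaded",
    "B": "bad page referenced or unexpected page flags",
    "W": "a kernel warning was issued earlier",
    "O": "out-of-tree module loaded",
}


def decode_taint(log_line):
    """Return (letter, meaning) pairs for the 'Tainted:' field of a log line.

    Returns an empty list when the line says 'Not tainted' or has no field.
    """
    match = re.search(r"Tainted: ([A-Z ]+)", log_line)
    if match is None:
        return []
    letters = [c for c in match.group(1) if c != " "]
    return [(c, TAINT_FLAGS.get(c, "not in this sketch's table")) for c in letters]


if __name__ == "__main__":
    line = ("[  304.383703] CPU: 1 PID: 5460 Comm: CPU 6/KVM "
            "Tainted: G        W  O      5.18.0-rc4.unstable #5")
    for letter, meaning in decode_taint(line):
        print(letter, "-", meaning)
```

The "O" here matches the "(O)" markers next to kvm and kvm_amd in the module list.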


Best regards,
	Maxim Levitsky


* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-26 16:20 ` Maxim Levitsky
@ 2022-04-28  7:25   ` Maxim Levitsky
  2022-04-28 15:32   ` Sean Christopherson
  1 sibling, 0 replies; 9+ messages in thread
From: Maxim Levitsky @ 2022-04-28  7:25 UTC
  To: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, pbonzini, seanjc, syzkaller-bugs, tglx, vkuznets,
	wanpengli, x86

On Tue, 2022-04-26 at 19:20 +0300, Maxim Levitsky wrote:
> On Sat, 2022-04-23 at 03:56 -0700, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    59f0c2447e25 Merge tag 'net-5.18-rc4' of git://git.kernel...
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15a61430f00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=6bc13fa21dd76a9b
> > dashboard link: https://syzkaller.appspot.com/bug?extid=a8ad3ee1525a0c4b40ec
> > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=134363d0f00000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11ed3e34f00000
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+a8ad3ee1525a0c4b40ec@syzkaller.appspotmail.com
> > 
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 3597 at arch/x86/kvm/mmu/tdp_mmu.c:57 kvm_mmu_uninit_tdp_mmu+0xf8/0x130 arch/x86/kvm/mmu/tdp_mmu.c:57
> > Modules linked in:
> > CPU: 1 PID: 3597 Comm: syz-executor294 Not tainted 5.18.0-rc3-syzkaller-00060-g59f0c2447e25 #0
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> > RIP: 0010:kvm_mmu_uninit_tdp_mmu+0xf8/0x130 arch/x86/kvm/mmu/tdp_mmu.c:57
> > Code: 83 d8 a0 00 00 48 39 c5 75 24 e8 e3 4d 5a 00 e8 9e e0 45 00 5b 5d e9 d7 4d 5a 00 e8 b2 42 a5 00 e9 3d ff ff ff e8 c8 4d 5a 00 <0f> 0b eb ad e8 bf 4d 5a 00 0f 0b eb d3 e8 c6 42 a5 00 e9 64 ff ff
> > RSP: 0018:ffffc90002e37c08 EFLAGS: 00010293
> > RAX: 0000000000000000 RBX: ffffc90002cda000 RCX: 0000000000000000
> > RDX: ffff888023f1e180 RSI: ffffffff811e1688 RDI: 0000000000000001
> > RBP: ffffc90002ce40e8 R08: 0000000000000001 R09: 0000000000000001
> > R10: ffffffff817ead48 R11: 0000000000000000 R12: ffffc90002cda000
> > R13: ffffc90002e37c50 R14: 0000000000000003 R15: ffffc90002cdb240
> > FS:  0000000000000000(0000) GS:ffff88802cb00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000560ac4d0cd68 CR3: 000000000ba8e000 CR4: 0000000000152ee0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  kvm_arch_destroy_vm+0x350/0x470 arch/x86/kvm/x86.c:11860
> >  kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1230 [inline]
> >  kvm_put_kvm+0x4fa/0xb70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1264
> >  kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1287
> >  __fput+0x277/0x9d0 fs/file_table.c:317
> >  task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> >  exit_task_work include/linux/task_work.h:37 [inline]
> >  do_exit+0xaff/0x2a00 kernel/exit.c:795
> >  do_group_exit+0xd2/0x2f0 kernel/exit.c:925
> >  __do_sys_exit_group kernel/exit.c:936 [inline]
> >  __se_sys_exit_group kernel/exit.c:934 [inline]
> >  __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:934
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7f0327505409
> > Code: Unable to access opcode bytes at RIP 0x7f03275053df.
> > RSP: 002b:00007ffc4a0be998 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > RAX: ffffffffffffffda RBX: 00007f0327578350 RCX: 00007f0327505409
> > RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
> > RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 68742f636f72702f
> > R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0327578350
> > R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
> >  </TASK>
> > 
> > 
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > 
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > syzbot can test patches for this issue, for details see:
> > https://goo.gl/tpsmEJ#testing-patches
> > 
> 
> I can reproduce this in a VM by running my ipi_stress test, which I use to test AVIC,
> and interrupting it with CTRL+C. The guest kernel does not have any of the relevant patches applied.

Update: I can reproduce it in a guest running the mainline 5.18-rc1 and 5.17.0 kernels,
even without AVIC enabled in the guest.

[  187.567377] ------------[ cut here ]------------
[  187.568245] WARNING: CPU: 1 PID: 3835 at arch/x86/kvm/mmu/tdp_mmu.c:46 kvm_mmu_uninit_tdp_mmu+0x47/0x50 [kvm]
[  187.570145] Modules linked in: kvm_amd ccp kvm irqbypass xt_CHECKSUM xt_MASQUERADE xt_conntrack uinput ipt_REJECT snd_seq_dummy snd_hrtimer ip6table_mangle ip6table_nat ip6table_filter ip6_tables
iptable_mangle iptable_nat nf_nat bridge rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace rfkill sunrpc vfat fat snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec
snd_hwdep snd_hda_core rng_core snd_seq input_leds snd_seq_device joydev snd_pcm snd_timer lpc_ich mfd_core rtc_cmos snd pcspkr virtio_input efi_pstore button sch_fq_codel ext4 mbcache jbd2
hid_generic usbhid hid virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm i2c_core virtio_net net_failover failover virtio_console
virtio_blk crc32_pclmul xhci_pci crc32c_intel xhci_hcd virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring dm_mirror dm_region_hash dm_log fuse ipv6 autofs4 [last unloaded: ccp]
[  187.584353] CPU: 1 PID: 3835 Comm: CPU 0/KVM Tainted: G    B             5.17.0-BISECT #5
[  187.585784] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[  187.587110] RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x47/0x50 [kvm]
[  187.588108] Code: 48 89 e5 48 39 c2 75 21 48 8b 87 40 a3 00 00 48 81 c7 40 a3 00 00 48 39 f8 75 08 e8 f3 c9 c6 e0 5d c3 c3 90 0f 0b 90 eb f2 90 <0f> 0b 90 eb d9 0f 1f 40 00 0f 1f 44 00 00 55 8b 05 50 d8 0f e2 48
[  187.591236] RSP: 0018:ffffc900025cfd58 EFLAGS: 00010283
[  187.592138] RAX: ffffc90002643350 RBX: ffffc9000263a240 RCX: 0000000000000000
[  187.593409] RDX: ffff88810c5e8f18 RSI: ffffffff81a5a96c RDI: ffffc90002639000
[  187.594616] RBP: ffffc900025cfd58 R08: 0000000000000000 R09: 0000000000000000
[  187.595801] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90002639000
[  187.596996] R13: ffffc90002639000 R14: ffff888104b7a3b0 R15: dead000000000100
[  187.598213] FS:  0000000000000000(0000) GS:ffff888468840000(0000) knlGS:0000000000000000
[  187.599587] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  187.600571] CR2: 0000000000000000 CR3: 0000000108e67000 CR4: 0000000000350ee0
[  187.601789] Call Trace:
[  187.602224]  <TASK>
[  187.602610]  kvm_mmu_uninit_vm+0x22/0x30 [kvm]
[  187.603424]  kvm_arch_destroy_vm+0x135/0x1c0 [kvm]
[  187.604267]  kvm_destroy_vm+0x1a4/0x330 [kvm]
[  187.605058]  kvm_put_kvm+0x26/0x40 [kvm]
[  187.605771]  kvm_vm_release+0x22/0x30 [kvm]
[  187.606521]  __fput+0xa9/0x270
[  187.607051]  ____fput+0xe/0x10
[  187.607591]  task_work_run+0x61/0xa0
[  187.608205]  do_exit+0x3ae/0xc00
[  187.608848]  do_group_exit+0x3b/0xc0
[  187.609472]  __x64_sys_exit_group+0x18/0x20
[  187.610179]  do_syscall_64+0x36/0x80
[  187.610808]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  187.611668] RIP: 0033:0x7fda52300021
[  187.612281] Code: Unable to access opcode bytes at RIP 0x7fda522ffff7.
[  187.613394] RSP: 002b:00007fda4d5c4348 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[  187.614680] RAX: ffffffffffffffda RBX: 00007fda523f8470 RCX: 00007fda52300021
[  187.615892] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
[  187.617098] RBP: 0000000000000001 R08: fffffffffffffb68 R09: 0000000000000000
[  187.618308] R10: 0000000000000020 R11: 0000000000000246 R12: 00007fda523f8470
[  187.619524] R13: 000000000000000f R14: 00007fda523f8948 R15: 0000000000000000
[  187.620744]  </TASK>
[  187.621128] irq event stamp: 0
[  187.621662] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[  187.622732] hardirqs last disabled at (0): [<ffffffff81131499>] copy_process+0xaa9/0x2060
[  187.624116] softirqs last  enabled at (0): [<ffffffff81131499>] copy_process+0xaa9/0x2060
[  187.625575] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  187.626635] ---[ end trace 0000000000000000 ]---
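
The traces in this thread all warn in kvm_mmu_uninit_tdp_mmu, but at tdp_mmu.c:57 on the 5.18-rc kernels and at tdp_mmu.c:46 on 5.17.0. A minimal Python sketch for pulling those fields out of the WARNING lines so they can be compared side by side is shown below. The names WarnSite and parse_warning are hypothetical helpers, not part of any kernel tooling. Whether the same check fires in each case would still need to be confirmed against the respective source trees.

```python
import re
from typing import NamedTuple, Optional


class WarnSite(NamedTuple):
    cpu: int
    pid: int
    path: str
    line: int
    function: str
    offset: int  # offset of the faulting instruction within the function
    size: int    # total size of the function


# Matches e.g. "WARNING: CPU: 1 PID: 3835 at a/b.c:46 func+0x47/0x50 [kvm]"
_WARN_RE = re.compile(
    r"WARNING: CPU: (\d+) PID: (\d+) at (\S+):(\d+) "
    r"([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
)


def parse_warning(log_line: str) -> Optional[WarnSite]:
    """Parse a kernel WARNING header line; return None if it does not match."""
    m = _WARN_RE.search(log_line)
    if m is None:
        return None
    cpu, pid, path, line, func, off, size = m.groups()
    return WarnSite(int(cpu), int(pid), path, int(line), func,
                    int(off, 16), int(size, 16))


if __name__ == "__main__":
    lines = [
        "[  304.367317] WARNING: CPU: 1 PID: 5460 at arch/x86/kvm/mmu/tdp_mmu.c:57 "
        "kvm_mmu_uninit_tdp_mmu+0x55/0x60 [kvm]",
        "[  187.568245] WARNING: CPU: 1 PID: 3835 at arch/x86/kvm/mmu/tdp_mmu.c:46 "
        "kvm_mmu_uninit_tdp_mmu+0x47/0x50 [kvm]",
    ]
    for text in lines:
        site = parse_warning(text)
        print(site.function, site.path, site.line)
```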


Reloading the kvm module also shows these warnings:



[  344.161810] =============================================================================
[  344.164789] BUG kvm_mmu_page_header (Tainted: G    B   W        ): Objects remaining in kvm_mmu_page_header on __kmem_cache_shutdown()
[  344.169042] -----------------------------------------------------------------------------
[  344.169042] 
[  344.172392] Slab 0x0000000054960c26 objects=22 used=1 fp=0x00000000b117b6cb flags=0x7fff00000000200(slab|node=0|zone=1|lastcpupid=0x3fff)
[  344.176757] CPU: 3 PID: 4037 Comm: modprobe Tainted: G    B   W         5.17.0-BISECT #5
[  344.179857] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[  344.184111] Call Trace:
[  344.185101]  <TASK>
[  344.185884]  dump_stack_lvl+0x49/0x5e
[  344.187203]  dump_stack+0x10/0x12
[  344.188424]  slab_err+0x95/0xc9
[  344.189585]  __kmem_cache_shutdown.cold+0x38/0x168
[  344.191324]  kmem_cache_destroy+0x55/0x140
[  344.192799]  kvm_mmu_module_exit+0x21/0x40 [kvm]
[  344.194781]  kvm_arch_exit+0x5c/0xb0 [kvm]
[  344.196996]  kvm_exit+0x9a/0xb0 [kvm]
[  344.198421]  svm_exit+0x9/0xe77 [kvm_amd]
[  344.199862]  __do_sys_delete_module.constprop.0+0x188/0x270
[  344.201919]  ? syscall_enter_from_user_mode+0x8d/0x140
[  344.203853]  ? trace_hardirqs_on+0x2a/0xf0
[  344.205333]  __x64_sys_delete_module+0x12/0x20
[  344.206983]  do_syscall_64+0x36/0x80
[  344.208398]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  344.210178] RIP: 0033:0x7f8377ead00b
[  344.211559] Code: 73 01 c3 48 8b 0d 6d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 1e 0c 00 f7 d8 64 89 01 48
[  344.218181] RSP: 002b:00007ffe007de3a8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  344.220769] RAX: ffffffffffffffda RBX: 000056414301cdb0 RCX: 00007f8377ead00b
[  344.223314] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000056414301ce18
[  344.225842] RBP: 000056414301cdb0 R08: 0000000000000000 R09: 0000000000000000
[  344.228519] R10: 00007f8377f20ac0 R11: 0000000000000206 R12: 000056414301ce18
[  344.230988] R13: 0000000000000000 R14: 000056414301ce18 R15: 00007ffe007e06b8
[  344.233529]  </TASK>
[  344.234291] Object 0x00000000a24c1b47 @offset=3864
[  344.236078] ------------[ cut here ]------------
[  344.237842] kmem_cache_destroy kvm_mmu_page_header: Slab cache still has objects when called from kvm_mmu_module_exit+0x21/0x40 [kvm]
[  344.237923] WARNING: CPU: 3 PID: 4037 at mm/slab_common.c:502 kmem_cache_destroy+0x133/0x140
[  344.246110] Modules linked in: kvm_amd(-) ccp kvm irqbypass xt_CHECKSUM xt_MASQUERADE xt_conntrack uinput ipt_REJECT snd_seq_dummy snd_hrtimer ip6table_mangle ip6table_nat ip6table_filter
ip6_tables iptable_mangle iptable_nat nf_nat bridge rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace rfkill sunrpc vfat fat snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg
snd_hda_codec snd_hwdep snd_hda_core rng_core snd_seq input_leds snd_seq_device joydev snd_pcm snd_timer lpc_ich mfd_core rtc_cmos snd pcspkr virtio_input efi_pstore button sch_fq_codel ext4 mbcache
jbd2 hid_generic usbhid hid virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm i2c_core virtio_net net_failover failover virtio_console
virtio_blk crc32_pclmul xhci_pci crc32c_intel xhci_hcd virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring dm_mirror dm_region_hash dm_log fuse ipv6 autofs4 [last unloaded: ccp]
[  344.263589] CPU: 3 PID: 4037 Comm: modprobe Tainted: G    B   W         5.17.0-BISECT #5
[  344.265177] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[  344.266563] RIP: 0010:kmem_cache_destroy+0x133/0x140
[  344.267461] Code: e7 e8 a1 ba 06 00 e9 21 ff ff ff c3 90 49 8b 54 24 58 48 8b 4d 08 48 c7 c6 00 c8 03 82 48 c7 c7 20 b4 2e 82 e8 33 d1 69 00 90 <0f> 0b 90 90 e9 f9 fe ff ff 0f 1f 40 00 55 48 89 e5 41 57 49 89 ff
[  344.270738] RSP: 0018:ffffc90002973e68 EFLAGS: 00010286
[  344.271672] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000000
[  344.272940] RDX: 0000000000000001 RSI: ffffffff822d34b8 RDI: 00000000ffffffff
[  344.274180] RBP: ffffc90002973e80 R08: ffff8884687fffe8 R09: 0000000000013ffb
[  344.275479] R10: ffff888468640000 R11: 3fffffffffffffff R12: ffff88810d2e6600
[  344.276877] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  344.278102] FS:  00007f8377d80b80(0000) GS:ffff8884688c0000(0000) knlGS:0000000000000000
[  344.279637] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  344.280684] CR2: 0000565399780ff8 CR3: 0000000111f88000 CR4: 0000000000350ee0
[  344.282021] Call Trace:
[  344.282455]  <TASK>
[  344.282948]  kvm_mmu_module_exit+0x21/0x40 [kvm]
[  344.283794]  kvm_arch_exit+0x5c/0xb0 [kvm]
[  344.284577]  kvm_exit+0x9a/0xb0 [kvm]
[  344.285403]  svm_exit+0x9/0xe77 [kvm_amd]
[  344.286126]  __do_sys_delete_module.constprop.0+0x188/0x270
[  344.287090]  ? syscall_enter_from_user_mode+0x8d/0x140
[  344.288022]  ? trace_hardirqs_on+0x2a/0xf0
[  344.288746]  __x64_sys_delete_module+0x12/0x20
[  344.289612]  do_syscall_64+0x36/0x80
[  344.290218]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  344.291153] RIP: 0033:0x7f8377ead00b
[  344.291876] Code: 73 01 c3 48 8b 0d 6d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 1e 0c 00 f7 d8 64 89 01 48
[  344.295115] RSP: 002b:00007ffe007de3a8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  344.296601] RAX: ffffffffffffffda RBX: 000056414301cdb0 RCX: 00007f8377ead00b
[  344.297899] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000056414301ce18
[  344.299145] RBP: 000056414301cdb0 R08: 0000000000000000 R09: 0000000000000000
[  344.300490] R10: 00007f8377f20ac0 R11: 0000000000000206 R12: 000056414301ce18
[  344.301805] R13: 0000000000000000 R14: 000056414301ce18 R15: 00007ffe007e06b8
[  344.303042]  </TASK>
[  344.303459] irq event stamp: 0
[  344.304104] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[  344.305179] hardirqs last disabled at (0): [<ffffffff81131499>] copy_process+0xaa9/0x2060
[  344.306708] softirqs last  enabled at (0): [<ffffffff81131499>] copy_process+0xaa9/0x2060
[  344.308234] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  344.309358] ---[ end trace 0000000000000000 ]---


This happens while running a nested guest, and the print is from the guest.
The host is running all my nested AVIC patches, but that should not matter,
since I disabled AVIC in the guest.

I will test it on a vanilla host as well.


Best regards,
	Maxim Levitsky

> 
> 
> [  304.367317] WARNING: CPU: 1 PID: 5460 at arch/x86/kvm/mmu/tdp_mmu.c:57 kvm_mmu_uninit_tdp_mmu+0x55/0x60 [kvm]
> [  304.368751] Modules linked in: kvm_amd(O) ccp kvm(O) irqbypass xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT uinput snd_seq_dummy snd_hrtimer ip6table_mangle ip6table_nat ip6table_filter
> ip6_tables iptable_mangle iptable_nat nf_nat bridge rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace rfkill sunrpc vfat fat snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg
> snd_hda_codec snd_hwdep snd_hda_core rng_core snd_seq snd_seq_device input_leds snd_pcm joydev snd_timer snd lpc_ich mfd_core virtio_input efi_pstore pcspkr rtc_cmos button sch_fq_codel ext4 mbcache
> jbd2 hid_generic usbhid hid virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_blk virtio_net drm net_failover virtio_console failover
> i2c_core crc32_pclmul crc32c_intel xhci_pci xhci_hcd virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring dm_mirror dm_region_hash dm_log fuse ipv6 autofs4 [last unloaded: ccp]
> [  304.383703] CPU: 1 PID: 5460 Comm: CPU 6/KVM Tainted: G        W  O      5.18.0-rc4.unstable #5
> [  304.384804] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> [  304.385659] RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x55/0x60 [kvm]
> [  304.386302] Code: 8d 83 28 a4 00 00 48 39 c2 75 1e 48 8b 83 18 a4 00 00 48 81 c3 18 a4 00 00 48 39 d8 75 11 e8 52 d5 8d e0 48 8b 5d f8 c9 c3 90 <0f> 0b 90 eb dc 90 0f 0b 90 eb e9 0f 1f 44 00 00 55 8b 05 a0 c3 d6
> [  304.388246] RSP: 0018:ffffc90003503bf0 EFLAGS: 00010293
> [  304.388807] RAX: ffffc900034cb428 RBX: ffffc900034c1000 RCX: 0000000000000000
> [  304.389561] RDX: ffff888120446c80 RSI: ffffffff8115e949 RDI: ffffffff81a6ee08
> [  304.390315] RBP: ffffc90003503bf8 R08: 0000000000000000 R09: 0000000000000000
> [  304.391078] R10: 0000000000000394 R11: 0000000000000386 R12: ffffc900034c1000
> [  304.391818] R13: ffffc900034c1000 R14: ffff888113d8b728 R15: dead000000000100
> [  304.392573] FS:  0000000000000000(0000) GS:ffff88846ce40000(0000) knlGS:0000000000000000
> [  304.393430] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  304.394050] CR2: 0000000000000000 CR3: 0000000002c21000 CR4: 0000000000350ee0
> [  304.394805] Call Trace:
> [  304.395086]  <TASK>
> [  304.395315]  kvm_mmu_uninit_vm+0x22/0x30 [kvm]
> [  304.395821]  kvm_arch_destroy_vm+0x135/0x1c0 [kvm]
> [  304.396365]  kvm_destroy_vm+0x19d/0x310 [kvm]
> [  304.396858]  kvm_put_kvm+0x26/0x40 [kvm]
> [  304.397390]  kvm_vm_release+0x22/0x30 [kvm]
> [  304.397865]  __fput+0xa5/0x270
> [  304.398207]  ____fput+0xe/0x10
> [  304.398532]  task_work_run+0x61/0xb0
> [  304.398913]  do_exit+0x3a9/0xb60
> [  304.399286]  do_group_exit+0x3b/0xc0
> [  304.399676]  get_signal+0xd9c/0xde0
> [  304.400062]  arch_do_signal_or_restart+0x37/0x790
> [  304.400576]  ? do_futex+0x8a/0x150
> [  304.400950]  exit_to_user_mode_prepare+0x152/0x240
> [  304.401603]  syscall_exit_to_user_mode+0x1f/0x60
> [  304.402230]  do_syscall_64+0x44/0x80
> [  304.402690]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  304.408493] RIP: 0033:0x7f593c746a8a
> [  304.408887] Code: Unable to access opcode bytes at RIP 0x7f593c746a60.
> [  304.409589] RSP: 002b:00007f59348ed5a0 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> [  304.410407] RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f593c746a8a
> [  304.411365] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 000055f11c7f4fe8
> [  304.411366] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffff
> [  304.411367] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> [  304.411368] R13: 000055f11c7f4fe8 R14: 0000000000000000 R15: 0000000000000000
> [  304.411371]  </TASK>
> [  304.414745] irq event stamp: 0
> [  304.415086] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [  304.415762] hardirqs last disabled at (0): [<ffffffff811339cb>] copy_process+0x94b/0x1ec0
> [  304.416628] softirqs last  enabled at (0): [<ffffffff811339cb>] copy_process+0x94b/0x1ec0
> [  304.417510] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [  304.418186] ---[ end trace 0000000000000000 ]---
> 
> 
> Best regards,
> 	Maxim Levitsky




* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-26 16:20 ` Maxim Levitsky
  2022-04-28  7:25   ` Maxim Levitsky
@ 2022-04-28 15:32   ` Sean Christopherson
  2022-04-28 17:16     ` Maxim Levitsky
  2022-04-28 17:22     ` Paolo Bonzini
  1 sibling, 2 replies; 9+ messages in thread
From: Sean Christopherson @ 2022-04-28 15:32 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, pbonzini, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

On Tue, Apr 26, 2022, Maxim Levitsky wrote:
> I can reproduce this in a VM, by running and CTRL+C'ing my ipi_stress test,

Can you post your ipi_stress test?  I'm curious to see if I can repro, and also
very curious as to what might be unique about your test.  I haven't been able to
repro the syzbot test, nor have I been able to repro by killing VMs/tests.


* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-28 15:32   ` Sean Christopherson
@ 2022-04-28 17:16     ` Maxim Levitsky
  2022-04-28 17:21       ` Maxim Levitsky
  2022-04-28 17:22     ` Paolo Bonzini
  1 sibling, 1 reply; 9+ messages in thread
From: Maxim Levitsky @ 2022-04-28 17:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, pbonzini, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

On Thu, 2022-04-28 at 15:32 +0000, Sean Christopherson wrote:
> On Tue, Apr 26, 2022, Maxim Levitsky wrote:
> > I can reproduce this in a VM, by running and CTRL+C'ing my ipi_stress test,
> 
> Can you post your ipi_stress test?  I'm curious to see if I can repro, and also
> very curious as to what might be unique about your test.  I haven't been able to
> repro the syzbot test, nor have I been able to repro by killing VMs/tests.
> 

This is the patch series (mostly an attempt to turn svm into a mini library,
but I don't know if this is worth it).
It was done so that ipi_stress could itself use nesting to wait for an IPI
from within a nested guest. I usually don't use it.

This is more or less how I have been running it lately (I have a wrapper script):


./x86/run x86/ipi_stress.flat \
        -global kvm-pit.lost_tick_policy=discard \
        -machine kernel-irqchip=on -name debug-threads=on \
        -smp 8 \
        -cpu host,x2apic=off,svm=off,-hypervisor \
        -overcommit cpu-pm=on \
        -m 4g -append "0 10000"


It's not fully finished for upstream; I will get to it soon.

'cpu-pm=on' won't work for you, as it fails due to a non-atomic memslot
update bug for which I have a small hack in qemu; fixing it properly is
on my backlog.

Most likely cpu-pm=off will also reproduce it.


The test was run in a guest; natively this doesn't seem to reproduce.
The TDP MMU was used for both L0 and L1.
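
For anyone who wants to script the "run it, then CTRL+C it" cycle described
above, a minimal sketch follows. The helper name interrupt_after, the
30-second delay and the loop count are assumptions for illustration only and
are not taken from the wrapper script mentioned above; the sketch relies on
timeout(1) from GNU coreutils to deliver SIGINT, and it is not known how long
the test has to run before an interrupt triggers the warning.

```sh
#!/bin/sh
# Hypothetical helper: run a command and send it SIGINT (the signal
# CTRL+C generates) once DELAY seconds have elapsed.
# Usage: interrupt_after DELAY CMD [ARGS...]
interrupt_after() {
    delay=$1
    shift
    # GNU timeout exits with status 124 when the time limit is hit;
    # otherwise it returns the exit status of the command itself.
    timeout -s INT "$delay" "$@"
}

# Example, commented out because it needs a kvm-unit-tests tree and
# the QEMU options shown earlier in this message:
#
# for i in 1 2 3 4 5; do
#     interrupt_after 30 ./x86/run x86/ipi_stress.flat \
#             -smp 8 -m 4g -append "0 10000"
#     dmesg | grep -q kvm_mmu_uninit_tdp_mmu && break
# done
```

The dmesg check has to run on the system whose KVM emits the warning, which
in the setup above is the L1 guest rather than the bare-metal host.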

Best regards,
	Maxim Levitsky

[-- Attachment #2: 0001-svm-move-svm-spec-definitions-to-lib-x86-svm.h.patch --]
[-- Type: text/x-patch, Size: 21762 bytes --]

From 325a2eff01184e82f1f80ac5783eb5bc5058e1a8 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Mon, 28 Mar 2022 14:23:53 +0300
Subject: [PATCH 1/7] svm: move svm spec definitions to lib/x86/svm.h

---
 lib/x86/svm.h | 364 ++++++++++++++++++++++++++++++++++++++++++++++++++
 x86/svm.h     | 359 +------------------------------------------------
 2 files changed, 365 insertions(+), 358 deletions(-)
 create mode 100644 lib/x86/svm.h

diff --git a/lib/x86/svm.h b/lib/x86/svm.h
new file mode 100644
index 00000000..38bb9224
--- /dev/null
+++ b/lib/x86/svm.h
@@ -0,0 +1,364 @@
+
+#ifndef SRC_LIB_X86_SVM_H_
+#define SRC_LIB_X86_SVM_H_
+
+enum {
+    INTERCEPT_INTR,
+    INTERCEPT_NMI,
+    INTERCEPT_SMI,
+    INTERCEPT_INIT,
+    INTERCEPT_VINTR,
+    INTERCEPT_SELECTIVE_CR0,
+    INTERCEPT_STORE_IDTR,
+    INTERCEPT_STORE_GDTR,
+    INTERCEPT_STORE_LDTR,
+    INTERCEPT_STORE_TR,
+    INTERCEPT_LOAD_IDTR,
+    INTERCEPT_LOAD_GDTR,
+    INTERCEPT_LOAD_LDTR,
+    INTERCEPT_LOAD_TR,
+    INTERCEPT_RDTSC,
+    INTERCEPT_RDPMC,
+    INTERCEPT_PUSHF,
+    INTERCEPT_POPF,
+    INTERCEPT_CPUID,
+    INTERCEPT_RSM,
+    INTERCEPT_IRET,
+    INTERCEPT_INTn,
+    INTERCEPT_INVD,
+    INTERCEPT_PAUSE,
+    INTERCEPT_HLT,
+    INTERCEPT_INVLPG,
+    INTERCEPT_INVLPGA,
+    INTERCEPT_IOIO_PROT,
+    INTERCEPT_MSR_PROT,
+    INTERCEPT_TASK_SWITCH,
+    INTERCEPT_FERR_FREEZE,
+    INTERCEPT_SHUTDOWN,
+    INTERCEPT_VMRUN,
+    INTERCEPT_VMMCALL,
+    INTERCEPT_VMLOAD,
+    INTERCEPT_VMSAVE,
+    INTERCEPT_STGI,
+    INTERCEPT_CLGI,
+    INTERCEPT_SKINIT,
+    INTERCEPT_RDTSCP,
+    INTERCEPT_ICEBP,
+    INTERCEPT_WBINVD,
+    INTERCEPT_MONITOR,
+    INTERCEPT_MWAIT,
+    INTERCEPT_MWAIT_COND,
+};
+
+enum {
+        VMCB_CLEAN_INTERCEPTS = 1, /* Intercept vectors, TSC offset, pause filter count */
+        VMCB_CLEAN_PERM_MAP = 2,   /* IOPM Base and MSRPM Base */
+        VMCB_CLEAN_ASID = 4,       /* ASID */
+        VMCB_CLEAN_INTR = 8,       /* int_ctl, int_vector */
+        VMCB_CLEAN_NPT = 16,       /* npt_en, nCR3, gPAT */
+        VMCB_CLEAN_CR = 32,        /* CR0, CR3, CR4, EFER */
+        VMCB_CLEAN_DR = 64,        /* DR6, DR7 */
+        VMCB_CLEAN_DT = 128,       /* GDT, IDT */
+        VMCB_CLEAN_SEG = 256,      /* CS, DS, SS, ES, CPL */
+        VMCB_CLEAN_CR2 = 512,      /* CR2 only */
+        VMCB_CLEAN_LBR = 1024,     /* DBGCTL, BR_FROM, BR_TO, LAST_EX_FROM, LAST_EX_TO */
+        VMCB_CLEAN_AVIC = 2048,    /* APIC_BAR, APIC_BACKING_PAGE,
+                      PHYSICAL_TABLE pointer, LOGICAL_TABLE pointer */
+        VMCB_CLEAN_ALL = 4095,
+};
+
+struct __attribute__ ((__packed__)) vmcb_control_area {
+    u16 intercept_cr_read;
+    u16 intercept_cr_write;
+    u16 intercept_dr_read;
+    u16 intercept_dr_write;
+    u32 intercept_exceptions;
+    u64 intercept;
+    u8 reserved_1[40];
+    u16 pause_filter_thresh;
+    u16 pause_filter_count;
+    u64 iopm_base_pa;
+    u64 msrpm_base_pa;
+    u64 tsc_offset;
+    u32 asid;
+    u8 tlb_ctl;
+    u8 reserved_2[3];
+    u32 int_ctl;
+    u32 int_vector;
+    u32 int_state;
+    u8 reserved_3[4];
+    u32 exit_code;
+    u32 exit_code_hi;
+    u64 exit_info_1;
+    u64 exit_info_2;
+    u32 exit_int_info;
+    u32 exit_int_info_err;
+    u64 nested_ctl;
+    u8 reserved_4[16];
+    u32 event_inj;
+    u32 event_inj_err;
+    u64 nested_cr3;
+    u64 virt_ext;
+    u32 clean;
+    u32 reserved_5;
+    u64 next_rip;
+    u8 insn_len;
+    u8 insn_bytes[15];
+    u8 reserved_6[800];
+};
+
+#define TLB_CONTROL_DO_NOTHING 0
+#define TLB_CONTROL_FLUSH_ALL_ASID 1
+
+#define V_TPR_MASK 0x0f
+
+#define V_IRQ_SHIFT 8
+#define V_IRQ_MASK (1 << V_IRQ_SHIFT)
+
+#define V_GIF_ENABLED_SHIFT 25
+#define V_GIF_ENABLED_MASK (1 << V_GIF_ENABLED_SHIFT)
+
+#define V_GIF_SHIFT 9
+#define V_GIF_MASK (1 << V_GIF_SHIFT)
+
+#define V_INTR_PRIO_SHIFT 16
+#define V_INTR_PRIO_MASK (0x0f << V_INTR_PRIO_SHIFT)
+
+#define V_IGN_TPR_SHIFT 20
+#define V_IGN_TPR_MASK (1 << V_IGN_TPR_SHIFT)
+
+#define V_INTR_MASKING_SHIFT 24
+#define V_INTR_MASKING_MASK (1 << V_INTR_MASKING_SHIFT)
+
+#define SVM_INTERRUPT_SHADOW_MASK 1
+
+#define SVM_IOIO_STR_SHIFT 2
+#define SVM_IOIO_REP_SHIFT 3
+#define SVM_IOIO_SIZE_SHIFT 4
+#define SVM_IOIO_ASIZE_SHIFT 7
+
+#define SVM_IOIO_TYPE_MASK 1
+#define SVM_IOIO_STR_MASK (1 << SVM_IOIO_STR_SHIFT)
+#define SVM_IOIO_REP_MASK (1 << SVM_IOIO_REP_SHIFT)
+#define SVM_IOIO_SIZE_MASK (7 << SVM_IOIO_SIZE_SHIFT)
+#define SVM_IOIO_ASIZE_MASK (7 << SVM_IOIO_ASIZE_SHIFT)
+
+#define SVM_VM_CR_VALID_MASK    0x001fULL
+#define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL
+#define SVM_VM_CR_SVM_DIS_MASK  0x0010ULL
+
+#define TSC_RATIO_DEFAULT   0x0100000000ULL
+
+struct __attribute__ ((__packed__)) vmcb_seg {
+    u16 selector;
+    u16 attrib;
+    u32 limit;
+    u64 base;
+};
+
+struct __attribute__ ((__packed__)) vmcb_save_area {
+    struct vmcb_seg es;
+    struct vmcb_seg cs;
+    struct vmcb_seg ss;
+    struct vmcb_seg ds;
+    struct vmcb_seg fs;
+    struct vmcb_seg gs;
+    struct vmcb_seg gdtr;
+    struct vmcb_seg ldtr;
+    struct vmcb_seg idtr;
+    struct vmcb_seg tr;
+    u8 reserved_1[43];
+    u8 cpl;
+    u8 reserved_2[4];
+    u64 efer;
+    u8 reserved_3[112];
+    u64 cr4;
+    u64 cr3;
+    u64 cr0;
+    u64 dr7;
+    u64 dr6;
+    u64 rflags;
+    u64 rip;
+    u8 reserved_4[88];
+    u64 rsp;
+    u8 reserved_5[24];
+    u64 rax;
+    u64 star;
+    u64 lstar;
+    u64 cstar;
+    u64 sfmask;
+    u64 kernel_gs_base;
+    u64 sysenter_cs;
+    u64 sysenter_esp;
+    u64 sysenter_eip;
+    u64 cr2;
+    u8 reserved_6[32];
+    u64 g_pat;
+    u64 dbgctl;
+    u64 br_from;
+    u64 br_to;
+    u64 last_excp_from;
+    u64 last_excp_to;
+};
+
+struct __attribute__ ((__packed__)) vmcb {
+    struct vmcb_control_area control;
+    struct vmcb_save_area save;
+};
+
+#define SVM_CPUID_FEATURE_SHIFT 2
+#define SVM_CPUID_FUNC 0x8000000a
+
+#define SVM_VM_CR_SVM_DISABLE 4
+
+#define SVM_SELECTOR_S_SHIFT 4
+#define SVM_SELECTOR_DPL_SHIFT 5
+#define SVM_SELECTOR_P_SHIFT 7
+#define SVM_SELECTOR_AVL_SHIFT 8
+#define SVM_SELECTOR_L_SHIFT 9
+#define SVM_SELECTOR_DB_SHIFT 10
+#define SVM_SELECTOR_G_SHIFT 11
+
+#define SVM_SELECTOR_TYPE_MASK (0xf)
+#define SVM_SELECTOR_S_MASK (1 << SVM_SELECTOR_S_SHIFT)
+#define SVM_SELECTOR_DPL_MASK (3 << SVM_SELECTOR_DPL_SHIFT)
+#define SVM_SELECTOR_P_MASK (1 << SVM_SELECTOR_P_SHIFT)
+#define SVM_SELECTOR_AVL_MASK (1 << SVM_SELECTOR_AVL_SHIFT)
+#define SVM_SELECTOR_L_MASK (1 << SVM_SELECTOR_L_SHIFT)
+#define SVM_SELECTOR_DB_MASK (1 << SVM_SELECTOR_DB_SHIFT)
+#define SVM_SELECTOR_G_MASK (1 << SVM_SELECTOR_G_SHIFT)
+
+#define SVM_SELECTOR_WRITE_MASK (1 << 1)
+#define SVM_SELECTOR_READ_MASK SVM_SELECTOR_WRITE_MASK
+#define SVM_SELECTOR_CODE_MASK (1 << 3)
+
+#define INTERCEPT_CR0_MASK 1
+#define INTERCEPT_CR3_MASK (1 << 3)
+#define INTERCEPT_CR4_MASK (1 << 4)
+#define INTERCEPT_CR8_MASK (1 << 8)
+
+#define INTERCEPT_DR0_MASK 1
+#define INTERCEPT_DR1_MASK (1 << 1)
+#define INTERCEPT_DR2_MASK (1 << 2)
+#define INTERCEPT_DR3_MASK (1 << 3)
+#define INTERCEPT_DR4_MASK (1 << 4)
+#define INTERCEPT_DR5_MASK (1 << 5)
+#define INTERCEPT_DR6_MASK (1 << 6)
+#define INTERCEPT_DR7_MASK (1 << 7)
+
+#define SVM_EVTINJ_VEC_MASK 0xff
+
+#define SVM_EVTINJ_TYPE_SHIFT 8
+#define SVM_EVTINJ_TYPE_MASK (7 << SVM_EVTINJ_TYPE_SHIFT)
+
+#define SVM_EVTINJ_TYPE_INTR (0 << SVM_EVTINJ_TYPE_SHIFT)
+#define SVM_EVTINJ_TYPE_NMI (2 << SVM_EVTINJ_TYPE_SHIFT)
+#define SVM_EVTINJ_TYPE_EXEPT (3 << SVM_EVTINJ_TYPE_SHIFT)
+#define SVM_EVTINJ_TYPE_SOFT (4 << SVM_EVTINJ_TYPE_SHIFT)
+
+#define SVM_EVTINJ_VALID (1 << 31)
+#define SVM_EVTINJ_VALID_ERR (1 << 11)
+
+#define SVM_EXITINTINFO_VEC_MASK SVM_EVTINJ_VEC_MASK
+#define SVM_EXITINTINFO_TYPE_MASK SVM_EVTINJ_TYPE_MASK
+
+#define SVM_EXITINTINFO_TYPE_INTR SVM_EVTINJ_TYPE_INTR
+#define SVM_EXITINTINFO_TYPE_NMI SVM_EVTINJ_TYPE_NMI
+#define SVM_EXITINTINFO_TYPE_EXEPT SVM_EVTINJ_TYPE_EXEPT
+#define SVM_EXITINTINFO_TYPE_SOFT SVM_EVTINJ_TYPE_SOFT
+
+#define SVM_EXITINTINFO_VALID SVM_EVTINJ_VALID
+#define SVM_EXITINTINFO_VALID_ERR SVM_EVTINJ_VALID_ERR
+
+#define SVM_EXITINFOSHIFT_TS_REASON_IRET 36
+#define SVM_EXITINFOSHIFT_TS_REASON_JMP 38
+#define SVM_EXITINFOSHIFT_TS_HAS_ERROR_CODE 44
+
+#define SVM_EXIT_READ_CR0   0x000
+#define SVM_EXIT_READ_CR3   0x003
+#define SVM_EXIT_READ_CR4   0x004
+#define SVM_EXIT_READ_CR8   0x008
+#define SVM_EXIT_WRITE_CR0  0x010
+#define SVM_EXIT_WRITE_CR3  0x013
+#define SVM_EXIT_WRITE_CR4  0x014
+#define SVM_EXIT_WRITE_CR8  0x018
+#define SVM_EXIT_READ_DR0   0x020
+#define SVM_EXIT_READ_DR1   0x021
+#define SVM_EXIT_READ_DR2   0x022
+#define SVM_EXIT_READ_DR3   0x023
+#define SVM_EXIT_READ_DR4   0x024
+#define SVM_EXIT_READ_DR5   0x025
+#define SVM_EXIT_READ_DR6   0x026
+#define SVM_EXIT_READ_DR7   0x027
+#define SVM_EXIT_WRITE_DR0  0x030
+#define SVM_EXIT_WRITE_DR1  0x031
+#define SVM_EXIT_WRITE_DR2  0x032
+#define SVM_EXIT_WRITE_DR3  0x033
+#define SVM_EXIT_WRITE_DR4  0x034
+#define SVM_EXIT_WRITE_DR5  0x035
+#define SVM_EXIT_WRITE_DR6  0x036
+#define SVM_EXIT_WRITE_DR7  0x037
+#define SVM_EXIT_EXCP_BASE      0x040
+#define SVM_EXIT_INTR       0x060
+#define SVM_EXIT_NMI        0x061
+#define SVM_EXIT_SMI        0x062
+#define SVM_EXIT_INIT       0x063
+#define SVM_EXIT_VINTR      0x064
+#define SVM_EXIT_CR0_SEL_WRITE  0x065
+#define SVM_EXIT_IDTR_READ  0x066
+#define SVM_EXIT_GDTR_READ  0x067
+#define SVM_EXIT_LDTR_READ  0x068
+#define SVM_EXIT_TR_READ    0x069
+#define SVM_EXIT_IDTR_WRITE 0x06a
+#define SVM_EXIT_GDTR_WRITE 0x06b
+#define SVM_EXIT_LDTR_WRITE 0x06c
+#define SVM_EXIT_TR_WRITE   0x06d
+#define SVM_EXIT_RDTSC      0x06e
+#define SVM_EXIT_RDPMC      0x06f
+#define SVM_EXIT_PUSHF      0x070
+#define SVM_EXIT_POPF       0x071
+#define SVM_EXIT_CPUID      0x072
+#define SVM_EXIT_RSM        0x073
+#define SVM_EXIT_IRET       0x074
+#define SVM_EXIT_SWINT      0x075
+#define SVM_EXIT_INVD       0x076
+#define SVM_EXIT_PAUSE      0x077
+#define SVM_EXIT_HLT        0x078
+#define SVM_EXIT_INVLPG     0x079
+#define SVM_EXIT_INVLPGA    0x07a
+#define SVM_EXIT_IOIO       0x07b
+#define SVM_EXIT_MSR        0x07c
+#define SVM_EXIT_TASK_SWITCH    0x07d
+#define SVM_EXIT_FERR_FREEZE    0x07e
+#define SVM_EXIT_SHUTDOWN   0x07f
+#define SVM_EXIT_VMRUN      0x080
+#define SVM_EXIT_VMMCALL    0x081
+#define SVM_EXIT_VMLOAD     0x082
+#define SVM_EXIT_VMSAVE     0x083
+#define SVM_EXIT_STGI       0x084
+#define SVM_EXIT_CLGI       0x085
+#define SVM_EXIT_SKINIT     0x086
+#define SVM_EXIT_RDTSCP     0x087
+#define SVM_EXIT_ICEBP      0x088
+#define SVM_EXIT_WBINVD     0x089
+#define SVM_EXIT_MONITOR    0x08a
+#define SVM_EXIT_MWAIT      0x08b
+#define SVM_EXIT_MWAIT_COND 0x08c
+#define SVM_EXIT_NPF        0x400
+
+#define SVM_EXIT_ERR        -1
+
+#define SVM_CR0_SELECTIVE_MASK (X86_CR0_TS | X86_CR0_MP)
+
+#define SVM_CR0_RESERVED_MASK           0xffffffff00000000U
+#define SVM_CR3_LONG_MBZ_MASK           0xfff0000000000000U
+#define SVM_CR3_LONG_RESERVED_MASK      0x0000000000000fe7U
+#define SVM_CR3_PAE_LEGACY_RESERVED_MASK    0x0000000000000007U
+#define SVM_CR4_LEGACY_RESERVED_MASK        0xff08e000U
+#define SVM_CR4_RESERVED_MASK           0xffffffffff08e000U
+#define SVM_DR6_RESERVED_MASK           0xffffffffffff1ff0U
+#define SVM_DR7_RESERVED_MASK           0xffffffff0000cc00U
+#define SVM_EFER_RESERVED_MASK          0xffffffffffff0200U
+
+
+#endif /* SRC_LIB_X86_SVM_H_ */
diff --git a/x86/svm.h b/x86/svm.h
index e93822b6..ff5fa91e 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -2,367 +2,10 @@
 #define X86_SVM_H
 
 #include "libcflat.h"
+#include <x86/svm.h>
 
-enum {
-	INTERCEPT_INTR,
-	INTERCEPT_NMI,
-	INTERCEPT_SMI,
-	INTERCEPT_INIT,
-	INTERCEPT_VINTR,
-	INTERCEPT_SELECTIVE_CR0,
-	INTERCEPT_STORE_IDTR,
-	INTERCEPT_STORE_GDTR,
-	INTERCEPT_STORE_LDTR,
-	INTERCEPT_STORE_TR,
-	INTERCEPT_LOAD_IDTR,
-	INTERCEPT_LOAD_GDTR,
-	INTERCEPT_LOAD_LDTR,
-	INTERCEPT_LOAD_TR,
-	INTERCEPT_RDTSC,
-	INTERCEPT_RDPMC,
-	INTERCEPT_PUSHF,
-	INTERCEPT_POPF,
-	INTERCEPT_CPUID,
-	INTERCEPT_RSM,
-	INTERCEPT_IRET,
-	INTERCEPT_INTn,
-	INTERCEPT_INVD,
-	INTERCEPT_PAUSE,
-	INTERCEPT_HLT,
-	INTERCEPT_INVLPG,
-	INTERCEPT_INVLPGA,
-	INTERCEPT_IOIO_PROT,
-	INTERCEPT_MSR_PROT,
-	INTERCEPT_TASK_SWITCH,
-	INTERCEPT_FERR_FREEZE,
-	INTERCEPT_SHUTDOWN,
-	INTERCEPT_VMRUN,
-	INTERCEPT_VMMCALL,
-	INTERCEPT_VMLOAD,
-	INTERCEPT_VMSAVE,
-	INTERCEPT_STGI,
-	INTERCEPT_CLGI,
-	INTERCEPT_SKINIT,
-	INTERCEPT_RDTSCP,
-	INTERCEPT_ICEBP,
-	INTERCEPT_WBINVD,
-	INTERCEPT_MONITOR,
-	INTERCEPT_MWAIT,
-	INTERCEPT_MWAIT_COND,
-};
-
-enum {
-        VMCB_CLEAN_INTERCEPTS = 1, /* Intercept vectors, TSC offset, pause filter count */
-        VMCB_CLEAN_PERM_MAP = 2,   /* IOPM Base and MSRPM Base */
-        VMCB_CLEAN_ASID = 4,       /* ASID */
-        VMCB_CLEAN_INTR = 8,       /* int_ctl, int_vector */
-        VMCB_CLEAN_NPT = 16,       /* npt_en, nCR3, gPAT */
-        VMCB_CLEAN_CR = 32,        /* CR0, CR3, CR4, EFER */
-        VMCB_CLEAN_DR = 64,        /* DR6, DR7 */
-        VMCB_CLEAN_DT = 128,       /* GDT, IDT */
-        VMCB_CLEAN_SEG = 256,      /* CS, DS, SS, ES, CPL */
-        VMCB_CLEAN_CR2 = 512,      /* CR2 only */
-        VMCB_CLEAN_LBR = 1024,     /* DBGCTL, BR_FROM, BR_TO, LAST_EX_FROM, LAST_EX_TO */
-        VMCB_CLEAN_AVIC = 2048,    /* APIC_BAR, APIC_BACKING_PAGE,
-				      PHYSICAL_TABLE pointer, LOGICAL_TABLE pointer */
-        VMCB_CLEAN_ALL = 4095,
-};
-
-struct __attribute__ ((__packed__)) vmcb_control_area {
-	u16 intercept_cr_read;
-	u16 intercept_cr_write;
-	u16 intercept_dr_read;
-	u16 intercept_dr_write;
-	u32 intercept_exceptions;
-	u64 intercept;
-	u8 reserved_1[40];
-	u16 pause_filter_thresh;
-	u16 pause_filter_count;
-	u64 iopm_base_pa;
-	u64 msrpm_base_pa;
-	u64 tsc_offset;
-	u32 asid;
-	u8 tlb_ctl;
-	u8 reserved_2[3];
-	u32 int_ctl;
-	u32 int_vector;
-	u32 int_state;
-	u8 reserved_3[4];
-	u32 exit_code;
-	u32 exit_code_hi;
-	u64 exit_info_1;
-	u64 exit_info_2;
-	u32 exit_int_info;
-	u32 exit_int_info_err;
-	u64 nested_ctl;
-	u8 reserved_4[16];
-	u32 event_inj;
-	u32 event_inj_err;
-	u64 nested_cr3;
-	u64 virt_ext;
-	u32 clean;
-	u32 reserved_5;
-	u64 next_rip;
-	u8 insn_len;
-	u8 insn_bytes[15];
-	u8 reserved_6[800];
-};
-
-#define TLB_CONTROL_DO_NOTHING 0
-#define TLB_CONTROL_FLUSH_ALL_ASID 1
-
-#define V_TPR_MASK 0x0f
-
-#define V_IRQ_SHIFT 8
-#define V_IRQ_MASK (1 << V_IRQ_SHIFT)
-
-#define V_GIF_ENABLED_SHIFT 25
-#define V_GIF_ENABLED_MASK (1 << V_GIF_ENABLED_SHIFT)
-
-#define V_GIF_SHIFT 9
-#define V_GIF_MASK (1 << V_GIF_SHIFT)
-
-#define V_INTR_PRIO_SHIFT 16
-#define V_INTR_PRIO_MASK (0x0f << V_INTR_PRIO_SHIFT)
-
-#define V_IGN_TPR_SHIFT 20
-#define V_IGN_TPR_MASK (1 << V_IGN_TPR_SHIFT)
-
-#define V_INTR_MASKING_SHIFT 24
-#define V_INTR_MASKING_MASK (1 << V_INTR_MASKING_SHIFT)
-
-#define SVM_INTERRUPT_SHADOW_MASK 1
-
-#define SVM_IOIO_STR_SHIFT 2
-#define SVM_IOIO_REP_SHIFT 3
-#define SVM_IOIO_SIZE_SHIFT 4
-#define SVM_IOIO_ASIZE_SHIFT 7
-
-#define SVM_IOIO_TYPE_MASK 1
-#define SVM_IOIO_STR_MASK (1 << SVM_IOIO_STR_SHIFT)
-#define SVM_IOIO_REP_MASK (1 << SVM_IOIO_REP_SHIFT)
-#define SVM_IOIO_SIZE_MASK (7 << SVM_IOIO_SIZE_SHIFT)
-#define SVM_IOIO_ASIZE_MASK (7 << SVM_IOIO_ASIZE_SHIFT)
-
-#define SVM_VM_CR_VALID_MASK	0x001fULL
-#define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL
-#define SVM_VM_CR_SVM_DIS_MASK  0x0010ULL
-
-#define TSC_RATIO_DEFAULT   0x0100000000ULL
-
-struct __attribute__ ((__packed__)) vmcb_seg {
-	u16 selector;
-	u16 attrib;
-	u32 limit;
-	u64 base;
-};
-
-struct __attribute__ ((__packed__)) vmcb_save_area {
-	struct vmcb_seg es;
-	struct vmcb_seg cs;
-	struct vmcb_seg ss;
-	struct vmcb_seg ds;
-	struct vmcb_seg fs;
-	struct vmcb_seg gs;
-	struct vmcb_seg gdtr;
-	struct vmcb_seg ldtr;
-	struct vmcb_seg idtr;
-	struct vmcb_seg tr;
-	u8 reserved_1[43];
-	u8 cpl;
-	u8 reserved_2[4];
-	u64 efer;
-	u8 reserved_3[112];
-	u64 cr4;
-	u64 cr3;
-	u64 cr0;
-	u64 dr7;
-	u64 dr6;
-	u64 rflags;
-	u64 rip;
-	u8 reserved_4[88];
-	u64 rsp;
-	u8 reserved_5[24];
-	u64 rax;
-	u64 star;
-	u64 lstar;
-	u64 cstar;
-	u64 sfmask;
-	u64 kernel_gs_base;
-	u64 sysenter_cs;
-	u64 sysenter_esp;
-	u64 sysenter_eip;
-	u64 cr2;
-	u8 reserved_6[32];
-	u64 g_pat;
-	u64 dbgctl;
-	u64 br_from;
-	u64 br_to;
-	u64 last_excp_from;
-	u64 last_excp_to;
-};
-
-struct __attribute__ ((__packed__)) vmcb {
-	struct vmcb_control_area control;
-	struct vmcb_save_area save;
-};
-
-#define SVM_CPUID_FEATURE_SHIFT 2
-#define SVM_CPUID_FUNC 0x8000000a
-
-#define SVM_VM_CR_SVM_DISABLE 4
-
-#define SVM_SELECTOR_S_SHIFT 4
-#define SVM_SELECTOR_DPL_SHIFT 5
-#define SVM_SELECTOR_P_SHIFT 7
-#define SVM_SELECTOR_AVL_SHIFT 8
-#define SVM_SELECTOR_L_SHIFT 9
-#define SVM_SELECTOR_DB_SHIFT 10
-#define SVM_SELECTOR_G_SHIFT 11
-
-#define SVM_SELECTOR_TYPE_MASK (0xf)
-#define SVM_SELECTOR_S_MASK (1 << SVM_SELECTOR_S_SHIFT)
-#define SVM_SELECTOR_DPL_MASK (3 << SVM_SELECTOR_DPL_SHIFT)
-#define SVM_SELECTOR_P_MASK (1 << SVM_SELECTOR_P_SHIFT)
-#define SVM_SELECTOR_AVL_MASK (1 << SVM_SELECTOR_AVL_SHIFT)
-#define SVM_SELECTOR_L_MASK (1 << SVM_SELECTOR_L_SHIFT)
-#define SVM_SELECTOR_DB_MASK (1 << SVM_SELECTOR_DB_SHIFT)
-#define SVM_SELECTOR_G_MASK (1 << SVM_SELECTOR_G_SHIFT)
-
-#define SVM_SELECTOR_WRITE_MASK (1 << 1)
-#define SVM_SELECTOR_READ_MASK SVM_SELECTOR_WRITE_MASK
-#define SVM_SELECTOR_CODE_MASK (1 << 3)
-
-#define INTERCEPT_CR0_MASK 1
-#define INTERCEPT_CR3_MASK (1 << 3)
-#define INTERCEPT_CR4_MASK (1 << 4)
-#define INTERCEPT_CR8_MASK (1 << 8)
-
-#define INTERCEPT_DR0_MASK 1
-#define INTERCEPT_DR1_MASK (1 << 1)
-#define INTERCEPT_DR2_MASK (1 << 2)
-#define INTERCEPT_DR3_MASK (1 << 3)
-#define INTERCEPT_DR4_MASK (1 << 4)
-#define INTERCEPT_DR5_MASK (1 << 5)
-#define INTERCEPT_DR6_MASK (1 << 6)
-#define INTERCEPT_DR7_MASK (1 << 7)
-
-#define SVM_EVTINJ_VEC_MASK 0xff
-
-#define SVM_EVTINJ_TYPE_SHIFT 8
-#define SVM_EVTINJ_TYPE_MASK (7 << SVM_EVTINJ_TYPE_SHIFT)
-
-#define SVM_EVTINJ_TYPE_INTR (0 << SVM_EVTINJ_TYPE_SHIFT)
-#define SVM_EVTINJ_TYPE_NMI (2 << SVM_EVTINJ_TYPE_SHIFT)
-#define SVM_EVTINJ_TYPE_EXEPT (3 << SVM_EVTINJ_TYPE_SHIFT)
-#define SVM_EVTINJ_TYPE_SOFT (4 << SVM_EVTINJ_TYPE_SHIFT)
-
-#define SVM_EVTINJ_VALID (1 << 31)
-#define SVM_EVTINJ_VALID_ERR (1 << 11)
-
-#define SVM_EXITINTINFO_VEC_MASK SVM_EVTINJ_VEC_MASK
-#define SVM_EXITINTINFO_TYPE_MASK SVM_EVTINJ_TYPE_MASK
-
-#define	SVM_EXITINTINFO_TYPE_INTR SVM_EVTINJ_TYPE_INTR
-#define	SVM_EXITINTINFO_TYPE_NMI SVM_EVTINJ_TYPE_NMI
-#define	SVM_EXITINTINFO_TYPE_EXEPT SVM_EVTINJ_TYPE_EXEPT
-#define	SVM_EXITINTINFO_TYPE_SOFT SVM_EVTINJ_TYPE_SOFT
-
-#define SVM_EXITINTINFO_VALID SVM_EVTINJ_VALID
-#define SVM_EXITINTINFO_VALID_ERR SVM_EVTINJ_VALID_ERR
-
-#define SVM_EXITINFOSHIFT_TS_REASON_IRET 36
-#define SVM_EXITINFOSHIFT_TS_REASON_JMP 38
-#define SVM_EXITINFOSHIFT_TS_HAS_ERROR_CODE 44
-
-#define	SVM_EXIT_READ_CR0 	0x000
-#define	SVM_EXIT_READ_CR3 	0x003
-#define	SVM_EXIT_READ_CR4 	0x004
-#define	SVM_EXIT_READ_CR8 	0x008
-#define	SVM_EXIT_WRITE_CR0 	0x010
-#define	SVM_EXIT_WRITE_CR3 	0x013
-#define	SVM_EXIT_WRITE_CR4 	0x014
-#define	SVM_EXIT_WRITE_CR8 	0x018
-#define	SVM_EXIT_READ_DR0 	0x020
-#define	SVM_EXIT_READ_DR1 	0x021
-#define	SVM_EXIT_READ_DR2 	0x022
-#define	SVM_EXIT_READ_DR3 	0x023
-#define	SVM_EXIT_READ_DR4 	0x024
-#define	SVM_EXIT_READ_DR5 	0x025
-#define	SVM_EXIT_READ_DR6 	0x026
-#define	SVM_EXIT_READ_DR7 	0x027
-#define	SVM_EXIT_WRITE_DR0 	0x030
-#define	SVM_EXIT_WRITE_DR1 	0x031
-#define	SVM_EXIT_WRITE_DR2 	0x032
-#define	SVM_EXIT_WRITE_DR3 	0x033
-#define	SVM_EXIT_WRITE_DR4 	0x034
-#define	SVM_EXIT_WRITE_DR5 	0x035
-#define	SVM_EXIT_WRITE_DR6 	0x036
-#define	SVM_EXIT_WRITE_DR7 	0x037
-#define SVM_EXIT_EXCP_BASE      0x040
-#define SVM_EXIT_INTR		0x060
-#define SVM_EXIT_NMI		0x061
-#define SVM_EXIT_SMI		0x062
-#define SVM_EXIT_INIT		0x063
-#define SVM_EXIT_VINTR		0x064
-#define SVM_EXIT_CR0_SEL_WRITE	0x065
-#define SVM_EXIT_IDTR_READ	0x066
-#define SVM_EXIT_GDTR_READ	0x067
-#define SVM_EXIT_LDTR_READ	0x068
-#define SVM_EXIT_TR_READ	0x069
-#define SVM_EXIT_IDTR_WRITE	0x06a
-#define SVM_EXIT_GDTR_WRITE	0x06b
-#define SVM_EXIT_LDTR_WRITE	0x06c
-#define SVM_EXIT_TR_WRITE	0x06d
-#define SVM_EXIT_RDTSC		0x06e
-#define SVM_EXIT_RDPMC		0x06f
-#define SVM_EXIT_PUSHF		0x070
-#define SVM_EXIT_POPF		0x071
-#define SVM_EXIT_CPUID		0x072
-#define SVM_EXIT_RSM		0x073
-#define SVM_EXIT_IRET		0x074
-#define SVM_EXIT_SWINT		0x075
-#define SVM_EXIT_INVD		0x076
-#define SVM_EXIT_PAUSE		0x077
-#define SVM_EXIT_HLT		0x078
-#define SVM_EXIT_INVLPG		0x079
-#define SVM_EXIT_INVLPGA	0x07a
-#define SVM_EXIT_IOIO		0x07b
-#define SVM_EXIT_MSR		0x07c
-#define SVM_EXIT_TASK_SWITCH	0x07d
-#define SVM_EXIT_FERR_FREEZE	0x07e
-#define SVM_EXIT_SHUTDOWN	0x07f
-#define SVM_EXIT_VMRUN		0x080
-#define SVM_EXIT_VMMCALL	0x081
-#define SVM_EXIT_VMLOAD		0x082
-#define SVM_EXIT_VMSAVE		0x083
-#define SVM_EXIT_STGI		0x084
-#define SVM_EXIT_CLGI		0x085
-#define SVM_EXIT_SKINIT		0x086
-#define SVM_EXIT_RDTSCP		0x087
-#define SVM_EXIT_ICEBP		0x088
-#define SVM_EXIT_WBINVD		0x089
-#define SVM_EXIT_MONITOR	0x08a
-#define SVM_EXIT_MWAIT		0x08b
-#define SVM_EXIT_MWAIT_COND	0x08c
-#define SVM_EXIT_NPF  		0x400
-
-#define SVM_EXIT_ERR		-1
-
-#define SVM_CR0_SELECTIVE_MASK (X86_CR0_TS | X86_CR0_MP)
-
-#define	SVM_CR0_RESERVED_MASK			0xffffffff00000000U
-#define	SVM_CR3_LONG_MBZ_MASK			0xfff0000000000000U
-#define	SVM_CR3_LONG_RESERVED_MASK		0x0000000000000fe7U
-#define SVM_CR3_PAE_LEGACY_RESERVED_MASK	0x0000000000000007U
-#define	SVM_CR4_LEGACY_RESERVED_MASK		0xff08e000U
-#define	SVM_CR4_RESERVED_MASK			0xffffffffff08e000U
-#define	SVM_DR6_RESERVED_MASK			0xffffffffffff1ff0U
-#define	SVM_DR7_RESERVED_MASK			0xffffffff0000cc00U
-#define	SVM_EFER_RESERVED_MASK			0xffffffffffff0200U
 
 #define MSR_BITMAP_SIZE 8192
-
 #define LBR_CTL_ENABLE_MASK BIT_ULL(0)
 
 struct svm_test {
-- 
2.26.3


[-- Attachment #3: 0002-move-some-svm-support-functions-into-lib-x86-svm_lib.patch --]
[-- Type: text/x-patch, Size: 4460 bytes --]

From 410f0020fe7330af4fc46dbc728eec0bd94c1c82 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Mon, 28 Mar 2022 15:32:21 +0300
Subject: [PATCH 2/7] move some svm support functions into lib/x86/svm_lib.h

---
 lib/x86/svm_lib.h | 53 +++++++++++++++++++++++++++++++++++++++++++++++
 x86/svm.c         | 35 +------------------------------
 x86/svm.h         | 18 ----------------
 x86/svm_tests.c   |  1 +
 4 files changed, 55 insertions(+), 52 deletions(-)
 create mode 100644 lib/x86/svm_lib.h

diff --git a/lib/x86/svm_lib.h b/lib/x86/svm_lib.h
new file mode 100644
index 00000000..cdc93408
--- /dev/null
+++ b/lib/x86/svm_lib.h
@@ -0,0 +1,53 @@
+#ifndef SRC_LIB_X86_SVM_LIB_H_
+#define SRC_LIB_X86_SVM_LIB_H_
+
+#include <x86/svm.h>
+#include "processor.h"
+
+static inline bool npt_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_NPT);
+}
+
+static inline bool vgif_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_VGIF);
+}
+
+static inline bool lbrv_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_LBRV);
+}
+
+static inline bool tsc_scale_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_TSCRATEMSR);
+}
+
+static inline bool pause_filter_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_PAUSEFILTER);
+}
+
+static inline bool pause_threshold_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_PFTHRESHOLD);
+}
+
+static inline void vmmcall(void)
+{
+    asm volatile ("vmmcall" : : : "memory");
+}
+
+static inline void stgi(void)
+{
+    asm volatile ("stgi");
+}
+
+static inline void clgi(void)
+{
+    asm volatile ("clgi");
+}
+
+
+#endif /* SRC_LIB_X86_SVM_LIB_H_ */
diff --git a/x86/svm.c b/x86/svm.c
index f6896f02..009d2d8c 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -14,6 +14,7 @@
 #include "isr.h"
 #include "apic.h"
 #include "vmalloc.h"
+#include "svm_lib.h"
 
 /* for the nested page table*/
 u64 *pte[2048];
@@ -65,31 +66,6 @@ bool default_supported(void)
     return true;
 }
 
-bool vgif_supported(void)
-{
-	return this_cpu_has(X86_FEATURE_VGIF);
-}
-
-bool lbrv_supported(void)
-{
-    return this_cpu_has(X86_FEATURE_LBRV);
-}
-
-bool tsc_scale_supported(void)
-{
-    return this_cpu_has(X86_FEATURE_TSCRATEMSR);
-}
-
-bool pause_filter_supported(void)
-{
-    return this_cpu_has(X86_FEATURE_PAUSEFILTER);
-}
-
-bool pause_threshold_supported(void)
-{
-    return this_cpu_has(X86_FEATURE_PFTHRESHOLD);
-}
-
 
 void default_prepare(struct svm_test *test)
 {
@@ -105,10 +81,6 @@ bool default_finished(struct svm_test *test)
 	return true; /* one vmexit */
 }
 
-bool npt_supported(void)
-{
-	return this_cpu_has(X86_FEATURE_NPT);
-}
 
 int get_test_stage(struct svm_test *test)
 {
@@ -139,11 +111,6 @@ static void vmcb_set_seg(struct vmcb_seg *seg, u16 selector,
 	seg->base = base;
 }
 
-inline void vmmcall(void)
-{
-	asm volatile ("vmmcall" : : : "memory");
-}
-
 static test_guest_func guest_main;
 
 void test_set_guest(test_guest_func func)
diff --git a/x86/svm.h b/x86/svm.h
index ff5fa91e..1eb98de3 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -52,21 +52,14 @@ u64 *npt_get_pdpe(void);
 u64 *npt_get_pml4e(void);
 bool smp_supported(void);
 bool default_supported(void);
-bool vgif_supported(void);
-bool lbrv_supported(void);
-bool tsc_scale_supported(void);
-bool pause_filter_supported(void);
-bool pause_threshold_supported(void);
 void default_prepare(struct svm_test *test);
 void default_prepare_gif_clear(struct svm_test *test);
 bool default_finished(struct svm_test *test);
-bool npt_supported(void);
 int get_test_stage(struct svm_test *test);
 void set_test_stage(struct svm_test *test, int s);
 void inc_test_stage(struct svm_test *test);
 void vmcb_ident(struct vmcb *vmcb);
 struct regs get_regs(void);
-void vmmcall(void);
 int __svm_vmrun(u64 rip);
 void __svm_bare_vmrun(void);
 int svm_vmrun(void);
@@ -75,17 +68,6 @@ void test_set_guest(test_guest_func func);
 extern struct vmcb *vmcb;
 extern struct svm_test svm_tests[];
 
-static inline void stgi(void)
-{
-    asm volatile ("stgi");
-}
-
-static inline void clgi(void)
-{
-    asm volatile ("clgi");
-}
-
-
 
 #define SAVE_GPR_C                              \
         "xchg %%rbx, regs+0x8\n\t"              \
diff --git a/x86/svm_tests.c b/x86/svm_tests.c
index 6a9b03bd..b6a0d5e6 100644
--- a/x86/svm_tests.c
+++ b/x86/svm_tests.c
@@ -10,6 +10,7 @@
 #include "isr.h"
 #include "apic.h"
 #include "delay.h"
+#include "svm_lib.h"
 
 #define SVM_EXIT_MAX_DR_INTERCEPT 0x3f
 
-- 
2.26.3


[-- Attachment #4: 0003-svm-add-svm_suported.patch --]
[-- Type: text/x-patch, Size: 997 bytes --]

From 29c65cc4bd1f4beaca8d92acb0e1a3c39120e556 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Thu, 31 Mar 2022 09:58:54 +0300
Subject: [PATCH 3/7] svm: add svm_supported

---
 lib/x86/svm_lib.h | 5 +++++
 x86/svm.c         | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/x86/svm_lib.h b/lib/x86/svm_lib.h
index cdc93408..1c35d4a9 100644
--- a/lib/x86/svm_lib.h
+++ b/lib/x86/svm_lib.h
@@ -4,6 +4,11 @@
 #include <x86/svm.h>
 #include "processor.h"
 
+static inline bool svm_supported(void)
+{
+    return this_cpu_has(X86_FEATURE_SVM);
+}
+
 static inline bool npt_supported(void)
 {
     return this_cpu_has(X86_FEATURE_NPT);
diff --git a/x86/svm.c b/x86/svm.c
index 009d2d8c..7a654425 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -375,7 +375,7 @@ int main(int ac, char **av)
 
 	__setup_vm(&opt_mask);
 
-	if (!this_cpu_has(X86_FEATURE_SVM)) {
+	if (!svm_supported()) {
 		printf("SVM not availble\n");
 		return report_summary();
 	}
-- 
2.26.3


[-- Attachment #5: 0004-svm-move-setup_svm-to-svm_lib.c.patch --]
[-- Type: text/x-patch, Size: 10826 bytes --]

From 56dfe907b80b9c4ecaa0042acfa5feca13da98ce Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Mon, 28 Mar 2022 16:13:32 +0300
Subject: [PATCH 4/7] svm: move setup_svm to svm_lib.c

---
 lib/x86/svm.h       |   2 +
 lib/x86/svm_lib.c   | 131 ++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/svm_lib.h   |  12 ++++
 x86/Makefile.x86_64 |   2 +
 x86/svm.c           | 115 +-------------------------------------
 x86/svm.h           |   5 --
 x86/svm_tests.c     |  17 ++++--
 7 files changed, 161 insertions(+), 123 deletions(-)
 create mode 100644 lib/x86/svm_lib.c

diff --git a/lib/x86/svm.h b/lib/x86/svm.h
index 38bb9224..21eff090 100644
--- a/lib/x86/svm.h
+++ b/lib/x86/svm.h
@@ -2,6 +2,8 @@
 #ifndef SRC_LIB_X86_SVM_H_
 #define SRC_LIB_X86_SVM_H_
 
+#include "libcflat.h"
+
 enum {
     INTERCEPT_INTR,
     INTERCEPT_NMI,
diff --git a/lib/x86/svm_lib.c b/lib/x86/svm_lib.c
new file mode 100644
index 00000000..8e59d81c
--- /dev/null
+++ b/lib/x86/svm_lib.c
@@ -0,0 +1,131 @@
+
+#include "svm_lib.h"
+#include "libcflat.h"
+#include "processor.h"
+#include "desc.h"
+#include "msr.h"
+#include "vm.h"
+#include "smp.h"
+#include "alloc_page.h"
+
+/* for the nested page table*/
+static u64 *pte[2048];
+static u64 *pde[4];
+static u64 *pdpe;
+static u64 *pml4e;
+
+static u8 *io_bitmap;
+static u8 io_bitmap_area[16384];
+
+static u8 *msr_bitmap;
+static u8 msr_bitmap_area[MSR_BITMAP_SIZE + PAGE_SIZE];
+
+
+u64 *npt_get_pte(u64 address)
+{
+    int i1, i2;
+
+    address >>= 12;
+    i1 = (address >> 9) & 0x7ff;
+    i2 = address & 0x1ff;
+
+    return &pte[i1][i2];
+}
+
+u64 *npt_get_pde(u64 address)
+{
+    int i1, i2;
+
+    address >>= 21;
+    i1 = (address >> 9) & 0x3;
+    i2 = address & 0x1ff;
+
+    return &pde[i1][i2];
+}
+
+u64 *npt_get_pdpe(void)
+{
+    return pdpe;
+}
+
+u64 *npt_get_pml4e(void)
+{
+    return pml4e;
+}
+
+u8* svm_get_msr_bitmap(void)
+{
+    return msr_bitmap;
+}
+
+u8* svm_get_io_bitmap(void)
+{
+    return io_bitmap;
+}
+
+static void set_additional_vcpu_msr(void *msr_efer)
+{
+    void *hsave = alloc_page();
+
+    wrmsr(MSR_VM_HSAVE_PA, virt_to_phys(hsave));
+    wrmsr(MSR_EFER, (ulong)msr_efer | EFER_SVME);
+}
+
+void setup_svm(void)
+{
+    void *hsave = alloc_page();
+    u64 *page, address;
+    int i,j;
+
+    wrmsr(MSR_VM_HSAVE_PA, virt_to_phys(hsave));
+    wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_SVME);
+
+    io_bitmap = (void *) ALIGN((ulong)io_bitmap_area, PAGE_SIZE);
+
+    msr_bitmap = (void *) ALIGN((ulong)msr_bitmap_area, PAGE_SIZE);
+
+    if (!npt_supported())
+        return;
+
+    for (i = 1; i < cpu_count(); i++)
+        on_cpu(i, (void *)set_additional_vcpu_msr, (void *)rdmsr(MSR_EFER));
+
+    printf("NPT detected - running all tests with NPT enabled\n");
+
+    /*
+    * Nested paging supported - Build a nested page table
+    * Build the page-table bottom-up and map everything with 4k
+    * pages to get enough granularity for the NPT unit-tests.
+    */
+
+    address = 0;
+
+    /* PTE level */
+    for (i = 0; i < 2048; ++i) {
+        page = alloc_page();
+
+        for (j = 0; j < 512; ++j, address += 4096)
+                page[j] = address | 0x067ULL;
+
+        pte[i] = page;
+    }
+
+    /* PDE level */
+    for (i = 0; i < 4; ++i) {
+        page = alloc_page();
+
+        for (j = 0; j < 512; ++j)
+            page[j] = (u64)pte[(i * 512) + j] | 0x027ULL;
+
+        pde[i] = page;
+    }
+
+    /* PDPe level */
+    pdpe   = alloc_page();
+    for (i = 0; i < 4; ++i)
+        pdpe[i] = ((u64)(pde[i])) | 0x27;
+
+    /* PML4e level */
+    pml4e    = alloc_page();
+    pml4e[0] = ((u64)pdpe) | 0x27;
+}
diff --git a/lib/x86/svm_lib.h b/lib/x86/svm_lib.h
index 1c35d4a9..f5e83b85 100644
--- a/lib/x86/svm_lib.h
+++ b/lib/x86/svm_lib.h
@@ -54,5 +54,17 @@ static inline void clgi(void)
     asm volatile ("clgi");
 }
 
+void setup_svm(void);
+
+u64 *npt_get_pte(u64 address);
+u64 *npt_get_pde(u64 address);
+u64 *npt_get_pdpe(void);
+u64 *npt_get_pml4e(void);
+
+u8* svm_get_msr_bitmap(void);
+u8* svm_get_io_bitmap(void);
+
+#define MSR_BITMAP_SIZE 8192
+
 
 #endif /* SRC_LIB_X86_SVM_LIB_H_ */
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index f18c1e20..302acf58 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -17,6 +17,8 @@ COMMON_CFLAGS += -mno-red-zone -mno-sse -mno-sse2 $(fcf_protection_full)
 cflatobjs += lib/x86/setjmp64.o
 cflatobjs += lib/x86/intel-iommu.o
 cflatobjs += lib/x86/usermode.o
+cflatobjs += lib/x86/svm_lib.o
+
 
 tests = $(TEST_DIR)/apic.$(exe) \
 	  $(TEST_DIR)/emulator.$(exe) $(TEST_DIR)/idt_test.$(exe) \
diff --git a/x86/svm.c b/x86/svm.c
index 7a654425..23e65261 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -16,46 +16,8 @@
 #include "vmalloc.h"
 #include "svm_lib.h"
 
-/* for the nested page table*/
-u64 *pte[2048];
-u64 *pde[4];
-u64 *pdpe;
-u64 *pml4e;
-
 struct vmcb *vmcb;
 
-u64 *npt_get_pte(u64 address)
-{
-	int i1, i2;
-
-	address >>= 12;
-	i1 = (address >> 9) & 0x7ff;
-	i2 = address & 0x1ff;
-
-	return &pte[i1][i2];
-}
-
-u64 *npt_get_pde(u64 address)
-{
-	int i1, i2;
-
-	address >>= 21;
-	i1 = (address >> 9) & 0x3;
-	i2 = address & 0x1ff;
-
-	return &pde[i1][i2];
-}
-
-u64 *npt_get_pdpe(void)
-{
-	return pdpe;
-}
-
-u64 *npt_get_pml4e(void)
-{
-	return pml4e;
-}
-
 bool smp_supported(void)
 {
 	return cpu_count() > 1;
@@ -124,12 +86,6 @@ static void test_thunk(struct svm_test *test)
 	vmmcall();
 }
 
-u8 *io_bitmap;
-u8 io_bitmap_area[16384];
-
-u8 *msr_bitmap;
-u8 msr_bitmap_area[MSR_BITMAP_SIZE + PAGE_SIZE];
-
 void vmcb_ident(struct vmcb *vmcb)
 {
 	u64 vmcb_phys = virt_to_phys(vmcb);
@@ -165,12 +121,12 @@ void vmcb_ident(struct vmcb *vmcb)
 	ctrl->intercept = (1ULL << INTERCEPT_VMRUN) |
 			  (1ULL << INTERCEPT_VMMCALL) |
 			  (1ULL << INTERCEPT_SHUTDOWN);
-	ctrl->iopm_base_pa = virt_to_phys(io_bitmap);
-	ctrl->msrpm_base_pa = virt_to_phys(msr_bitmap);
+	ctrl->iopm_base_pa = virt_to_phys(svm_get_io_bitmap());
+	ctrl->msrpm_base_pa = virt_to_phys(svm_get_msr_bitmap());
 
 	if (npt_supported()) {
 		ctrl->nested_ctl = 1;
-		ctrl->nested_cr3 = (u64)pml4e;
+		ctrl->nested_cr3 = (u64)npt_get_pml4e();
 		ctrl->tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;
 	}
 }
@@ -259,72 +215,7 @@ static noinline void test_run(struct svm_test *test)
 	    test->on_vcpu_done = true;
 }
 
-static void set_additional_vcpu_msr(void *msr_efer)
-{
-	void *hsave = alloc_page();
-
-	wrmsr(MSR_VM_HSAVE_PA, virt_to_phys(hsave));
-	wrmsr(MSR_EFER, (ulong)msr_efer | EFER_SVME);
-}
-
-static void setup_svm(void)
-{
-	void *hsave = alloc_page();
-	u64 *page, address;
-	int i,j;
-
-	wrmsr(MSR_VM_HSAVE_PA, virt_to_phys(hsave));
-	wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_SVME);
-
-	io_bitmap = (void *) ALIGN((ulong)io_bitmap_area, PAGE_SIZE);
-
-	msr_bitmap = (void *) ALIGN((ulong)msr_bitmap_area, PAGE_SIZE);
-
-	if (!npt_supported())
-		return;
-
-	for (i = 1; i < cpu_count(); i++)
-		on_cpu(i, (void *)set_additional_vcpu_msr, (void *)rdmsr(MSR_EFER));
-
-	printf("NPT detected - running all tests with NPT enabled\n");
-
-	/*
-	* Nested paging supported - Build a nested page table
-	* Build the page-table bottom-up and map everything with 4k
-	* pages to get enough granularity for the NPT unit-tests.
-	*/
-
-	address = 0;
 
-	/* PTE level */
-	for (i = 0; i < 2048; ++i) {
-		page = alloc_page();
-
-		for (j = 0; j < 512; ++j, address += 4096)
-	    		page[j] = address | 0x067ULL;
-
-		pte[i] = page;
-	}
-
-	/* PDE level */
-	for (i = 0; i < 4; ++i) {
-		page = alloc_page();
-
-	for (j = 0; j < 512; ++j)
-	    page[j] = (u64)pte[(i * 512) + j] | 0x027ULL;
-
-		pde[i] = page;
-	}
-
-	/* PDPe level */
-	pdpe   = alloc_page();
-	for (i = 0; i < 4; ++i)
-		pdpe[i] = ((u64)(pde[i])) | 0x27;
-
-	/* PML4e level */
-	pml4e    = alloc_page();
-	pml4e[0] = ((u64)pdpe) | 0x27;
-}
 
 int matched;
 
diff --git a/x86/svm.h b/x86/svm.h
index 1eb98de3..7fecb429 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -5,7 +5,6 @@
 #include <x86/svm.h>
 
 
-#define MSR_BITMAP_SIZE 8192
 #define LBR_CTL_ENABLE_MASK BIT_ULL(0)
 
 struct svm_test {
@@ -46,10 +45,6 @@ struct regs {
 
 typedef void (*test_guest_func)(struct svm_test *);
 
-u64 *npt_get_pte(u64 address);
-u64 *npt_get_pde(u64 address);
-u64 *npt_get_pdpe(void);
-u64 *npt_get_pml4e(void);
 bool smp_supported(void);
 bool default_supported(void);
 void default_prepare(struct svm_test *test);
diff --git a/x86/svm_tests.c b/x86/svm_tests.c
index b6a0d5e6..07ac01ff 100644
--- a/x86/svm_tests.c
+++ b/x86/svm_tests.c
@@ -309,14 +309,13 @@ static bool check_next_rip(struct svm_test *test)
     return address == vmcb->control.next_rip;
 }
 
-extern u8 *msr_bitmap;
 
 static void prepare_msr_intercept(struct svm_test *test)
 {
     default_prepare(test);
     vmcb->control.intercept |= (1ULL << INTERCEPT_MSR_PROT);
     vmcb->control.intercept_exceptions |= (1ULL << GP_VECTOR);
-    memset(msr_bitmap, 0xff, MSR_BITMAP_SIZE);
+    memset(svm_get_msr_bitmap(), 0xff, MSR_BITMAP_SIZE);
 }
 
 static void test_msr_intercept(struct svm_test *test)
@@ -427,7 +426,7 @@ static bool msr_intercept_finished(struct svm_test *test)
 
 static bool check_msr_intercept(struct svm_test *test)
 {
-    memset(msr_bitmap, 0, MSR_BITMAP_SIZE);
+    memset(svm_get_msr_bitmap(), 0, MSR_BITMAP_SIZE);
     return (test->scratch == -2);
 }
 
@@ -539,10 +538,10 @@ static bool check_mode_switch(struct svm_test *test)
 	return test->scratch == 2;
 }
 
-extern u8 *io_bitmap;
-
 static void prepare_ioio(struct svm_test *test)
 {
+    u8 *io_bitmap = svm_get_io_bitmap();
+
     vmcb->control.intercept |= (1ULL << INTERCEPT_IOIO_PROT);
     test->scratch = 0;
     memset(io_bitmap, 0, 8192);
@@ -551,6 +550,8 @@ static void prepare_ioio(struct svm_test *test)
 
 static void test_ioio(struct svm_test *test)
 {
+    u8 *io_bitmap = svm_get_io_bitmap();
+
     // stage 0, test IO pass
     inb(0x5000);
     outb(0x0, 0x5000);
@@ -623,6 +624,7 @@ fail:
 static bool ioio_finished(struct svm_test *test)
 {
     unsigned port, size;
+    u8 *io_bitmap = svm_get_io_bitmap();
 
     /* Only expect IOIO intercepts */
     if (vmcb->control.exit_code == SVM_EXIT_VMMCALL)
@@ -647,6 +649,8 @@ static bool ioio_finished(struct svm_test *test)
 
 static bool check_ioio(struct svm_test *test)
 {
+    u8 *io_bitmap = svm_get_io_bitmap();
+
     memset(io_bitmap, 0, 8193);
     return test->scratch != -1;
 }
@@ -2514,7 +2518,8 @@ static void test_msrpm_iopm_bitmap_addrs(void)
 {
 	u64 saved_intercept = vmcb->control.intercept;
 	u64 addr_beyond_limit = 1ull << cpuid_maxphyaddr();
-	u64 addr = virt_to_phys(msr_bitmap) & (~((1ull << 12) - 1));
+	u64 addr = virt_to_phys(svm_get_msr_bitmap()) & (~((1ull << 12) - 1));
+	u8 *io_bitmap = svm_get_io_bitmap();
 
 	TEST_BITMAP_ADDR(saved_intercept, INTERCEPT_MSR_PROT,
 			addr_beyond_limit - 2 * PAGE_SIZE, SVM_EXIT_ERR,
-- 
2.26.3
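
Note on the NPT helpers moved in this patch: npt_get_pte() splits a guest-physical address into a PTE-page index and an entry index. The sketch below restates that arithmetic as a standalone snippet. The names npt_pte_index, npt_pte_index_of and npt_mapped_bytes are hypothetical and exist only for illustration; only the shifts and masks come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two indices npt_get_pte() computes. */
struct npt_pte_index {
    unsigned table;  /* which of the 2048 PTE pages (i1 in the patch) */
    unsigned entry;  /* which of the 512 entries in that page (i2) */
};

/* Same shifts and masks as npt_get_pte(), without touching real tables. */
static struct npt_pte_index npt_pte_index_of(uint64_t address)
{
    struct npt_pte_index idx;

    address >>= 12;                      /* drop the 4 KiB page offset */
    idx.table = (address >> 9) & 0x7ff;  /* 11 bits: 0..2047 */
    idx.entry = address & 0x1ff;         /* 9 bits: 0..511 */
    return idx;
}

/* Range covered by setup_svm(): 2048 pages * 512 entries * 4096 bytes. */
static uint64_t npt_mapped_bytes(void)
{
    return 2048ULL * 512ULL * 4096ULL;
}
```

With these numbers the tables cover 4 GiB of guest-physical space. Because of the 0x7ff mask, an address at or above 4 GiB would wrap back to a low table index rather than fault in the lookup.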


[-- Attachment #6: 0005-svm-move-vmcb_ident-to-svm_lib.c.patch --]
[-- Type: text/x-patch, Size: 6163 bytes --]

From 7315483ca9c06017a4642ef8d5dfd4b19d47d712 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Mon, 28 Mar 2022 16:16:24 +0300
Subject: [PATCH 5/7] svm: move vmcb_ident to svm_lib.c

---
 lib/x86/svm_lib.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/x86/svm_lib.h |  4 ++++
 x86/svm.c         | 54 -----------------------------------------------
 x86/svm.h         |  1 -
 4 files changed, 58 insertions(+), 55 deletions(-)

diff --git a/lib/x86/svm_lib.c b/lib/x86/svm_lib.c
index 8e59d81c..48246810 100644
--- a/lib/x86/svm_lib.c
+++ b/lib/x86/svm_lib.c
@@ -71,6 +71,15 @@ static void set_additional_vcpu_msr(void *msr_efer)
     wrmsr(MSR_EFER, (ulong)msr_efer | EFER_SVME);
 }
 
+void vmcb_set_seg(struct vmcb_seg *seg, u16 selector,
+                         u64 base, u32 limit, u32 attr)
+{
+    seg->selector = selector;
+    seg->attrib = attr;
+    seg->limit = limit;
+    seg->base = base;
+}
+
 void setup_svm(void)
 {
     void *hsave = alloc_page();
@@ -129,3 +138,48 @@ void setup_svm(void)
     pml4e    = alloc_page();
     pml4e[0] = ((u64)pdpe) | 0x27;
 }
+
+void vmcb_ident(struct vmcb *vmcb)
+{
+    u64 vmcb_phys = virt_to_phys(vmcb);
+    struct vmcb_save_area *save = &vmcb->save;
+    struct vmcb_control_area *ctrl = &vmcb->control;
+    u32 data_seg_attr = 3 | SVM_SELECTOR_S_MASK | SVM_SELECTOR_P_MASK
+        | SVM_SELECTOR_DB_MASK | SVM_SELECTOR_G_MASK;
+    u32 code_seg_attr = 9 | SVM_SELECTOR_S_MASK | SVM_SELECTOR_P_MASK
+        | SVM_SELECTOR_L_MASK | SVM_SELECTOR_G_MASK;
+    struct descriptor_table_ptr desc_table_ptr;
+
+    memset(vmcb, 0, sizeof(*vmcb));
+    asm volatile ("vmsave %0" : : "a"(vmcb_phys) : "memory");
+    vmcb_set_seg(&save->es, read_es(), 0, -1U, data_seg_attr);
+    vmcb_set_seg(&save->cs, read_cs(), 0, -1U, code_seg_attr);
+    vmcb_set_seg(&save->ss, read_ss(), 0, -1U, data_seg_attr);
+    vmcb_set_seg(&save->ds, read_ds(), 0, -1U, data_seg_attr);
+    sgdt(&desc_table_ptr);
+    vmcb_set_seg(&save->gdtr, 0, desc_table_ptr.base, desc_table_ptr.limit, 0);
+    sidt(&desc_table_ptr);
+    vmcb_set_seg(&save->idtr, 0, desc_table_ptr.base, desc_table_ptr.limit, 0);
+    ctrl->asid = 1;
+    save->cpl = 0;
+    save->efer = rdmsr(MSR_EFER);
+    save->cr4 = read_cr4();
+    save->cr3 = read_cr3();
+    save->cr0 = read_cr0();
+    save->dr7 = read_dr7();
+    save->dr6 = read_dr6();
+    save->cr2 = read_cr2();
+    save->g_pat = rdmsr(MSR_IA32_CR_PAT);
+    save->dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
+    ctrl->intercept = (1ULL << INTERCEPT_VMRUN) |
+              (1ULL << INTERCEPT_VMMCALL) |
+              (1ULL << INTERCEPT_SHUTDOWN);
+    ctrl->iopm_base_pa = virt_to_phys(svm_get_io_bitmap());
+    ctrl->msrpm_base_pa = virt_to_phys(svm_get_msr_bitmap());
+
+    if (npt_supported()) {
+        ctrl->nested_ctl = 1;
+        ctrl->nested_cr3 = (u64)npt_get_pml4e();
+        ctrl->tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;
+    }
+}
diff --git a/lib/x86/svm_lib.h b/lib/x86/svm_lib.h
index f5e83b85..6d9a86aa 100644
--- a/lib/x86/svm_lib.h
+++ b/lib/x86/svm_lib.h
@@ -54,7 +54,11 @@ static inline void clgi(void)
     asm volatile ("clgi");
 }
 
+void vmcb_set_seg(struct vmcb_seg *seg, u16 selector,
+                         u64 base, u32 limit, u32 attr);
+
 void setup_svm(void);
+void vmcb_ident(struct vmcb *vmcb);
 
 u64 *npt_get_pte(u64 address);
 u64 *npt_get_pde(u64 address);
diff --git a/x86/svm.c b/x86/svm.c
index 23e65261..74c3931b 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -64,15 +64,6 @@ void inc_test_stage(struct svm_test *test)
 	barrier();
 }
 
-static void vmcb_set_seg(struct vmcb_seg *seg, u16 selector,
-                         u64 base, u32 limit, u32 attr)
-{
-	seg->selector = selector;
-	seg->attrib = attr;
-	seg->limit = limit;
-	seg->base = base;
-}
-
 static test_guest_func guest_main;
 
 void test_set_guest(test_guest_func func)
@@ -86,51 +77,6 @@ static void test_thunk(struct svm_test *test)
 	vmmcall();
 }
 
-void vmcb_ident(struct vmcb *vmcb)
-{
-	u64 vmcb_phys = virt_to_phys(vmcb);
-	struct vmcb_save_area *save = &vmcb->save;
-	struct vmcb_control_area *ctrl = &vmcb->control;
-	u32 data_seg_attr = 3 | SVM_SELECTOR_S_MASK | SVM_SELECTOR_P_MASK
-	    | SVM_SELECTOR_DB_MASK | SVM_SELECTOR_G_MASK;
-	u32 code_seg_attr = 9 | SVM_SELECTOR_S_MASK | SVM_SELECTOR_P_MASK
-	    | SVM_SELECTOR_L_MASK | SVM_SELECTOR_G_MASK;
-	struct descriptor_table_ptr desc_table_ptr;
-
-	memset(vmcb, 0, sizeof(*vmcb));
-	asm volatile ("vmsave %0" : : "a"(vmcb_phys) : "memory");
-	vmcb_set_seg(&save->es, read_es(), 0, -1U, data_seg_attr);
-	vmcb_set_seg(&save->cs, read_cs(), 0, -1U, code_seg_attr);
-	vmcb_set_seg(&save->ss, read_ss(), 0, -1U, data_seg_attr);
-	vmcb_set_seg(&save->ds, read_ds(), 0, -1U, data_seg_attr);
-	sgdt(&desc_table_ptr);
-	vmcb_set_seg(&save->gdtr, 0, desc_table_ptr.base, desc_table_ptr.limit, 0);
-	sidt(&desc_table_ptr);
-	vmcb_set_seg(&save->idtr, 0, desc_table_ptr.base, desc_table_ptr.limit, 0);
-	ctrl->asid = 1;
-	save->cpl = 0;
-	save->efer = rdmsr(MSR_EFER);
-	save->cr4 = read_cr4();
-	save->cr3 = read_cr3();
-	save->cr0 = read_cr0();
-	save->dr7 = read_dr7();
-	save->dr6 = read_dr6();
-	save->cr2 = read_cr2();
-	save->g_pat = rdmsr(MSR_IA32_CR_PAT);
-	save->dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
-	ctrl->intercept = (1ULL << INTERCEPT_VMRUN) |
-			  (1ULL << INTERCEPT_VMMCALL) |
-			  (1ULL << INTERCEPT_SHUTDOWN);
-	ctrl->iopm_base_pa = virt_to_phys(svm_get_io_bitmap());
-	ctrl->msrpm_base_pa = virt_to_phys(svm_get_msr_bitmap());
-
-	if (npt_supported()) {
-		ctrl->nested_ctl = 1;
-		ctrl->nested_cr3 = (u64)npt_get_pml4e();
-		ctrl->tlb_ctl = TLB_CONTROL_FLUSH_ALL_ASID;
-	}
-}
-
 struct regs regs;
 
 struct regs get_regs(void)
diff --git a/x86/svm.h b/x86/svm.h
index 7fecb429..4c609795 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -53,7 +53,6 @@ bool default_finished(struct svm_test *test);
 int get_test_stage(struct svm_test *test);
 void set_test_stage(struct svm_test *test, int s);
 void inc_test_stage(struct svm_test *test);
-void vmcb_ident(struct vmcb *vmcb);
 struct regs get_regs(void);
 int __svm_vmrun(u64 rip);
 void __svm_bare_vmrun(void);
-- 
2.26.3
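
Note on vmcb_ident(): the segment attribute words are built by OR-ing a descriptor type with the SVM_SELECTOR_* flag masks. The sketch below works out the resulting values. The mask values are an assumption: they are not shown in this patch and are taken here to match the bit positions used in Linux's svm.h (S=4, P=7, L=9, DB=10, G=11). All SKETCH_* and sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; the patch only references the mask names. */
#define SKETCH_SELECTOR_S_MASK  (1u << 4)   /* code/data (non-system) segment */
#define SKETCH_SELECTOR_P_MASK  (1u << 7)   /* present */
#define SKETCH_SELECTOR_L_MASK  (1u << 9)   /* 64-bit code segment */
#define SKETCH_SELECTOR_DB_MASK (1u << 10)  /* default operand size */
#define SKETCH_SELECTOR_G_MASK  (1u << 11)  /* 4 KiB limit granularity */

/* Type 3: writable, accessed data segment. */
static uint32_t sketch_data_seg_attr(void)
{
    return 3 | SKETCH_SELECTOR_S_MASK | SKETCH_SELECTOR_P_MASK
        | SKETCH_SELECTOR_DB_MASK | SKETCH_SELECTOR_G_MASK;
}

/* Type 9: execute-only, accessed code segment. */
static uint32_t sketch_code_seg_attr(void)
{
    return 9 | SKETCH_SELECTOR_S_MASK | SKETCH_SELECTOR_P_MASK
        | SKETCH_SELECTOR_L_MASK | SKETCH_SELECTOR_G_MASK;
}
```

Under those assumed masks the data attribute comes out as 0xc93 and the code attribute as 0xa99. The code segment sets L instead of DB, which is what a 64-bit code segment expects.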


[-- Attachment #7: 0006-svm-move-svm-entry-macros-to-svm_lib.h.patch --]
[-- Type: text/x-patch, Size: 8860 bytes --]

From f06ebf20cd0115be33c38ce887ef6d28ad562183 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Sun, 3 Apr 2022 10:46:43 +0300
Subject: [PATCH 6/7] svm: move svm entry macros to svm_lib.h

---
 lib/x86/svm_lib.h | 68 +++++++++++++++++++++++++++++++++++++++++++++
 x86/svm.c         | 22 ++++++---------
 x86/svm.h         | 71 ++---------------------------------------------
 x86/svm_tests.c   |  9 +++---
 4 files changed, 85 insertions(+), 85 deletions(-)

diff --git a/lib/x86/svm_lib.h b/lib/x86/svm_lib.h
index 6d9a86aa..f682c679 100644
--- a/lib/x86/svm_lib.h
+++ b/lib/x86/svm_lib.h
@@ -71,4 +71,72 @@ u8* svm_get_io_bitmap(void);
 #define MSR_BITMAP_SIZE 8192
 
 
+struct x86_gpr_regs
+{
+    u64 rax;
+    u64 rbx;
+    u64 rcx;
+    u64 rdx;
+    u64 cr2;
+    u64 rbp;
+    u64 rsi;
+    u64 rdi;
+
+    u64 r8;
+    u64 r9;
+    u64 r10;
+    u64 r11;
+    u64 r12;
+    u64 r13;
+    u64 r14;
+    u64 r15;
+    u64 rflags;
+};
+
+#define SAVE_GPR_C(regs) \
+		"xchg %%rbx, %p[" #regs "]+0x8\n\t"              \
+		"xchg %%rcx, %p[" #regs "]+0x10\n\t"             \
+		"xchg %%rdx, %p[" #regs "]+0x18\n\t"             \
+		"xchg %%rbp, %p[" #regs "]+0x28\n\t"             \
+		"xchg %%rsi, %p[" #regs "]+0x30\n\t"             \
+		"xchg %%rdi, %p[" #regs "]+0x38\n\t"             \
+		"xchg %%r8,  %p[" #regs "]+0x40\n\t"             \
+		"xchg %%r9,  %p[" #regs "]+0x48\n\t"             \
+		"xchg %%r10, %p[" #regs "]+0x50\n\t"             \
+		"xchg %%r11, %p[" #regs "]+0x58\n\t"             \
+		"xchg %%r12, %p[" #regs "]+0x60\n\t"             \
+		"xchg %%r13, %p[" #regs "]+0x68\n\t"             \
+		"xchg %%r14, %p[" #regs "]+0x70\n\t"             \
+		"xchg %%r15, %p[" #regs "]+0x78\n\t"             \
+
+#define LOAD_GPR_C(regs)      SAVE_GPR_C(regs)
+
+#define ASM_PRE_VMRUN_CMD(regs)             \
+        "vmload %%rax\n\t"                  \
+        "mov %p[" #regs "]+0x80, %%r15\n\t" \
+        "mov %%r15, 0x170(%%rax)\n\t"       \
+        "mov %p[" #regs "], %%r15\n\t"      \
+        "mov %%r15, 0x1f8(%%rax)\n\t"       \
+        LOAD_GPR_C(regs)                    \
+
+#define ASM_POST_VMRUN_CMD(regs)            \
+        SAVE_GPR_C(regs)                    \
+        "mov 0x170(%%rax), %%r15\n\t"       \
+        "mov %%r15, %p[" #regs "]+0x80\n\t" \
+        "mov 0x1f8(%%rax), %%r15\n\t"       \
+        "mov %%r15, %p[" #regs "]\n\t"      \
+        "vmsave %%rax\n\t"                  \
+
+
+#define SVM_BARE_VMRUN(vmcb, regs) \
+	asm volatile (                    \
+		ASM_PRE_VMRUN_CMD(regs)       \
+                "vmrun %%rax\n\t"     \
+		ASM_POST_VMRUN_CMD(regs)      \
+		:                             \
+		: "a" (virt_to_phys(vmcb)),    \
+		  [regs] "i" (&regs) \
+		: "memory", "r15")
+
+
 #endif /* SRC_LIB_X86_SVM_LIB_H_ */
diff --git a/x86/svm.c b/x86/svm.c
index 74c3931b..b2dbef75 100644
--- a/x86/svm.c
+++ b/x86/svm.c
@@ -77,9 +77,9 @@ static void test_thunk(struct svm_test *test)
 	vmmcall();
 }
 
-struct regs regs;
+struct x86_gpr_regs regs;
 
-struct regs get_regs(void)
+struct x86_gpr_regs get_regs(void)
 {
 	return regs;
 }
@@ -98,13 +98,7 @@ int __svm_vmrun(u64 rip)
 	vmcb->save.rsp = (ulong)(guest_stack + ARRAY_SIZE(guest_stack));
 	regs.rdi = (ulong)v2_test;
 
-	asm volatile (
-		ASM_PRE_VMRUN_CMD
-                "vmrun %%rax\n\t"               \
-		ASM_POST_VMRUN_CMD
-		:
-		: "a" (virt_to_phys(vmcb))
-		: "memory", "r15");
+	SVM_BARE_VMRUN(vmcb, regs);
 
 	return (vmcb->control.exit_code);
 }
@@ -118,6 +112,7 @@ extern u8 vmrun_rip;
 
 static noinline void test_run(struct svm_test *test)
 {
+
 	u64 vmcb_phys = virt_to_phys(vmcb);
 
 	irq_disable();
@@ -136,18 +131,19 @@ static noinline void test_run(struct svm_test *test)
 			"sti \n\t"
 			"call *%c[PREPARE_GIF_CLEAR](%[test]) \n \t"
 			"mov %[vmcb_phys], %%rax \n\t"
-			ASM_PRE_VMRUN_CMD
+			ASM_PRE_VMRUN_CMD(regs)
 			".global vmrun_rip\n\t"		\
 			"vmrun_rip: vmrun %%rax\n\t"    \
-			ASM_POST_VMRUN_CMD
+			ASM_POST_VMRUN_CMD(regs)
 			"cli \n\t"
 			"stgi"
 			: // inputs clobbered by the guest:
 			"=D" (the_test),            // first argument register
 			"=b" (the_vmcb)             // callee save register!
 			: [test] "0" (the_test),
-			[vmcb_phys] "1"(the_vmcb),
-			[PREPARE_GIF_CLEAR] "i" (offsetof(struct svm_test, prepare_gif_clear))
+			  [vmcb_phys] "1"(the_vmcb),
+			  [PREPARE_GIF_CLEAR] "i" (offsetof(struct svm_test, prepare_gif_clear)),
+			  [regs] "i"(&regs)
 			: "rax", "rcx", "rdx", "rsi",
 			"r8", "r9", "r10", "r11" , "r12", "r13", "r14", "r15",
 			"memory");
diff --git a/x86/svm.h b/x86/svm.h
index 4c609795..7cc3b690 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -23,28 +23,10 @@ struct svm_test {
 	bool on_vcpu_done;
 };
 
-struct regs {
-	u64 rax;
-	u64 rbx;
-	u64 rcx;
-	u64 rdx;
-	u64 cr2;
-	u64 rbp;
-	u64 rsi;
-	u64 rdi;
-	u64 r8;
-	u64 r9;
-	u64 r10;
-	u64 r11;
-	u64 r12;
-	u64 r13;
-	u64 r14;
-	u64 r15;
-	u64 rflags;
-};
-
 typedef void (*test_guest_func)(struct svm_test *);
 
+extern struct x86_gpr_regs regs;
+
 bool smp_supported(void);
 bool default_supported(void);
 void default_prepare(struct svm_test *test);
@@ -53,7 +35,7 @@ bool default_finished(struct svm_test *test);
 int get_test_stage(struct svm_test *test);
 void set_test_stage(struct svm_test *test, int s);
 void inc_test_stage(struct svm_test *test);
-struct regs get_regs(void);
+struct x86_gpr_regs get_regs(void);
 int __svm_vmrun(u64 rip);
 void __svm_bare_vmrun(void);
 int svm_vmrun(void);
@@ -61,51 +43,4 @@ void test_set_guest(test_guest_func func);
 
 extern struct vmcb *vmcb;
 extern struct svm_test svm_tests[];
-
-
-#define SAVE_GPR_C                              \
-        "xchg %%rbx, regs+0x8\n\t"              \
-        "xchg %%rcx, regs+0x10\n\t"             \
-        "xchg %%rdx, regs+0x18\n\t"             \
-        "xchg %%rbp, regs+0x28\n\t"             \
-        "xchg %%rsi, regs+0x30\n\t"             \
-        "xchg %%rdi, regs+0x38\n\t"             \
-        "xchg %%r8, regs+0x40\n\t"              \
-        "xchg %%r9, regs+0x48\n\t"              \
-        "xchg %%r10, regs+0x50\n\t"             \
-        "xchg %%r11, regs+0x58\n\t"             \
-        "xchg %%r12, regs+0x60\n\t"             \
-        "xchg %%r13, regs+0x68\n\t"             \
-        "xchg %%r14, regs+0x70\n\t"             \
-        "xchg %%r15, regs+0x78\n\t"
-
-#define LOAD_GPR_C      SAVE_GPR_C
-
-#define ASM_PRE_VMRUN_CMD                       \
-                "vmload %%rax\n\t"              \
-                "mov regs+0x80, %%r15\n\t"      \
-                "mov %%r15, 0x170(%%rax)\n\t"   \
-                "mov regs, %%r15\n\t"           \
-                "mov %%r15, 0x1f8(%%rax)\n\t"   \
-                LOAD_GPR_C                      \
-
-#define ASM_POST_VMRUN_CMD                      \
-                SAVE_GPR_C                      \
-                "mov 0x170(%%rax), %%r15\n\t"   \
-                "mov %%r15, regs+0x80\n\t"      \
-                "mov 0x1f8(%%rax), %%r15\n\t"   \
-                "mov %%r15, regs\n\t"           \
-                "vmsave %%rax\n\t"              \
-
-
-
-#define SVM_BARE_VMRUN \
-	asm volatile ( \
-		ASM_PRE_VMRUN_CMD \
-                "vmrun %%rax\n\t"               \
-		ASM_POST_VMRUN_CMD \
-		: \
-		: "a" (virt_to_phys(vmcb)) \
-		: "memory", "r15") \
-
 #endif
diff --git a/x86/svm_tests.c b/x86/svm_tests.c
index 07ac01ff..cb47fb02 100644
--- a/x86/svm_tests.c
+++ b/x86/svm_tests.c
@@ -3147,6 +3147,7 @@ into:
 static void svm_into_test(void)
 {
     handle_exception(OF_VECTOR, guest_test_of_handler);
+
     test_set_guest(svm_of_test_guest);
     report(svm_vmrun() == SVM_EXIT_VMMCALL && of_test_counter == 1,
         "#OF is generated in L2 exception handler0");
@@ -3351,7 +3352,7 @@ static void svm_lbrv_test1(void)
 
 	wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR);
 	DO_BRANCH(host_branch1);
-	SVM_BARE_VMRUN;
+	SVM_BARE_VMRUN(vmcb,regs);
 	dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
 
 	if (vmcb->control.exit_code != SVM_EXIT_VMMCALL) {
@@ -3374,7 +3375,7 @@ static void svm_lbrv_test2(void)
 	wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR);
 	DO_BRANCH(host_branch2);
 	wrmsr(MSR_IA32_DEBUGCTLMSR, 0);
-	SVM_BARE_VMRUN;
+	SVM_BARE_VMRUN(vmcb,regs);
 	dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
 	wrmsr(MSR_IA32_DEBUGCTLMSR, 0);
 
@@ -3402,7 +3403,7 @@ static void svm_lbrv_nested_test1(void)
 
 	wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR);
 	DO_BRANCH(host_branch3);
-	SVM_BARE_VMRUN;
+	SVM_BARE_VMRUN(vmcb,regs);
 	dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
 	wrmsr(MSR_IA32_DEBUGCTLMSR, 0);
 
@@ -3437,7 +3438,7 @@ static void svm_lbrv_nested_test2(void)
 
 	wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR);
 	DO_BRANCH(host_branch4);
-	SVM_BARE_VMRUN;
+	SVM_BARE_VMRUN(vmcb,regs);
 	dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
 	wrmsr(MSR_IA32_DEBUGCTLMSR, 0);
 
-- 
2.26.3
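
Note on SAVE_GPR_C / ASM_PRE_VMRUN_CMD: the hard-coded displacements (0x8, 0x10, ... 0x78, 0x80) only work if they match the field layout of struct x86_gpr_regs. The sketch below copies the struct under a hypothetical name (x86_gpr_regs_sketch) and checks the offsets at compile time. It assumes u64 is a 64-bit unsigned integer and that the compiler inserts no padding, which holds for a struct of only 8-byte members on x86_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of struct x86_gpr_regs from the patch. */
struct x86_gpr_regs_sketch {
    uint64_t rax, rbx, rcx, rdx;
    uint64_t cr2;                 /* occupies 0x20, skipped by SAVE_GPR_C */
    uint64_t rbp, rsi, rdi;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t rflags;
};

/* Compile-time check that a field sits at the displacement the asm uses. */
#define CHECK_OFF(field, off) \
    _Static_assert(offsetof(struct x86_gpr_regs_sketch, field) == (off), \
                   #field " offset does not match the asm macros")

CHECK_OFF(rax, 0x00);     /* "mov %p[regs], %%r15" */
CHECK_OFF(rbx, 0x08);
CHECK_OFF(rcx, 0x10);
CHECK_OFF(rdx, 0x18);
CHECK_OFF(rbp, 0x28);
CHECK_OFF(rsi, 0x30);
CHECK_OFF(rdi, 0x38);
CHECK_OFF(r8,  0x40);
CHECK_OFF(r15, 0x78);
CHECK_OFF(rflags, 0x80);  /* "mov %p[regs]+0x80, %%r15" */
```

A check along these lines next to the real struct would turn a reordered or inserted field into a build failure instead of a silently corrupted guest register.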


[-- Attachment #8: 0007-add-unit-test-for-avic-ipi.patch --]
[-- Type: text/x-patch, Size: 7221 bytes --]

From d5acfbc39399d4727eaafcbe0d9eabedb54d76a9 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk@redhat.com>
Date: Tue, 30 Nov 2021 13:56:57 +0200
Subject: [PATCH 7/7] add unit test for avic ipi

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 x86/Makefile.common |   4 +-
 x86/ipi_stress.c    | 252 ++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg   |   5 +
 3 files changed, 260 insertions(+), 1 deletion(-)
 create mode 100644 x86/ipi_stress.c

diff --git a/x86/Makefile.common b/x86/Makefile.common
index b9039882..21c6af15 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -84,7 +84,9 @@ tests-common = $(TEST_DIR)/vmexit.$(exe) $(TEST_DIR)/tsc.$(exe) \
                $(TEST_DIR)/tsx-ctrl.$(exe) \
                $(TEST_DIR)/eventinj.$(exe) \
                $(TEST_DIR)/smap.$(exe) \
-               $(TEST_DIR)/umip.$(exe)
+               $(TEST_DIR)/umip.$(exe) \
+               $(TEST_DIR)/ipi_stress.$(exe)
+               
 
 # The following test cases are disabled when building EFI tests because they
 # use absolute addresses in their inline assembly code, which cannot compile
diff --git a/x86/ipi_stress.c b/x86/ipi_stress.c
new file mode 100644
index 00000000..950c2439
--- /dev/null
+++ b/x86/ipi_stress.c
@@ -0,0 +1,252 @@
+#include "libcflat.h"
+#include "smp.h"
+#include "alloc.h"
+#include "apic.h"
+#include "processor.h"
+#include "isr.h"
+#include "asm/barrier.h"
+#include "delay.h"
+#include "svm.h"
+#include "desc.h"
+#include "msr.h"
+#include "vm.h"
+#include "types.h"
+#include "alloc_page.h"
+#include "vmalloc.h"
+#include "svm_lib.h"
+
+u64 num_iterations = -1;
+struct x86_gpr_regs regs;
+u64 guest_stack[10000];
+struct vmcb *vmcb;
+
+volatile u64 *isr_counts;
+bool use_svm;
+int hlt_allowed = -1;
+
+
+static int get_random(int min, int max)
+{
+	/* TODO: use rdrand to seed a PRNG instead */
+	u64 random_value = rdtsc() >> 4;
+
+	return min + random_value % (max - min + 1);
+}
+
+static void ipi_interrupt_handler(isr_regs_t *r)
+{
+	isr_counts[smp_id()]++;
+	eoi();
+}
+
+static void wait_for_ipi(volatile u64 *count)
+{
+	u64 old_count = *count;
+	bool use_halt;
+
+	switch (hlt_allowed) {
+	case -1:
+		use_halt = get_random(0,10000) == 0;
+		break;
+	case 0:
+		use_halt = false;
+		break;
+	case 1:
+		use_halt = true;
+		break;
+	default:
+		use_halt = false;
+		break;
+	}
+
+	do {
+		if (use_halt)
+			asm volatile ("sti;hlt;cli\n");
+		else
+			asm volatile ("sti;nop;cli");
+
+	} while (old_count == *count);
+}
+
+/******************************************************************************************************/
+
+#ifdef __x86_64__
+static void l2_guest_wait_for_ipi(volatile u64 *count)
+{
+	wait_for_ipi(count);
+	asm volatile("vmmcall");
+}
+
+static void l2_guest_dummy(void)
+{
+	asm volatile("vmmcall");
+}
+
+static void wait_for_ipi_in_l2(volatile u64 *count, struct vmcb *vmcb)
+{
+	u64 old_count = *count;
+	bool irq_on_vmentry = get_random(0,1) == 0;
+
+	vmcb->save.rip = (ulong)l2_guest_wait_for_ipi;
+	vmcb->save.rsp = (ulong)(guest_stack + ARRAY_SIZE(guest_stack));
+	regs.rdi = (u64)count;
+
+	vmcb->save.rip = irq_on_vmentry ? (ulong)l2_guest_dummy : (ulong)l2_guest_wait_for_ipi;
+
+	do {
+		if (irq_on_vmentry)
+			vmcb->save.rflags |= X86_EFLAGS_IF;
+		else
+			vmcb->save.rflags &= ~X86_EFLAGS_IF;
+
+		asm volatile("clgi;nop;sti");
+		// GIF is set by VMRUN
+		SVM_BARE_VMRUN(vmcb, regs);
+		// GIF is cleared by VMEXIT
+		asm volatile("cli;nop;stgi");
+
+		assert(vmcb->control.exit_code == SVM_EXIT_VMMCALL);
+
+	} while (old_count == *count);
+}
+#endif
+
+/******************************************************************************************************/
+
+#define FIRST_TEST_VCPU 1
+
+static void vcpu_init(void *data)
+{
+	/* To make it easier to see iteration number in the trace */
+	handle_irq(0x40, ipi_interrupt_handler);
+	handle_irq(0x50, ipi_interrupt_handler);
+}
+
+static void vcpu_code(void *data)
+{
+	int ncpus = cpu_count();
+	int cpu = (long)data;
+
+	u64 i;
+
+#ifdef __x86_64__
+	if (cpu == 2 && use_svm)
+	{
+		vmcb = alloc_page();
+		vmcb_ident(vmcb);
+
+		// when set, intercept physical interrupts
+		//vmcb->control.intercept |= (1 << INTERCEPT_INTR);
+
+		// when set, host IF controls the masking of interrupts while the guest runs
+		// guest IF only might allow a virtual interrupt to be injected (if set in int_ctl)
+		//vmcb->control.int_ctl |= V_INTR_MASKING_MASK;
+	}
+#endif
+
+	assert(cpu != 0);
+
+	if (cpu != FIRST_TEST_VCPU)
+		wait_for_ipi(&isr_counts[cpu]);
+
+	for (i = 0; i < num_iterations; i++)
+	{
+		u8 physical_dst = cpu == ncpus -1 ? 1 : cpu + 1;
+
+		// send an IPI to the next vCPU in a circular fashion
+		apic_icr_write(APIC_INT_ASSERT |
+				APIC_DEST_PHYSICAL |
+				APIC_DM_FIXED |
+				(i % 2 ? 0x40 : 0x50),
+				physical_dst);
+
+		if (i == (num_iterations - 1) && cpu != FIRST_TEST_VCPU)
+			break;
+
+#ifdef __x86_64__
+		// wait for the IPI interrupt chain to come back to us
+		if (cpu == 2 && use_svm) {
+				wait_for_ipi_in_l2(&isr_counts[cpu], vmcb);
+				continue;
+		}
+#endif
+
+		wait_for_ipi(&isr_counts[cpu]);
+	}
+}
+
+int main(int argc, void** argv)
+{
+	int cpu, ncpus = cpu_count();
+
+	assert(ncpus > 2);
+
+	if (argc > 1)
+		hlt_allowed = atol(argv[1]);
+
+	if (argc > 2)
+		num_iterations = atol(argv[2]);
+
+	setup_vm();
+
+#ifdef __x86_64__
+	if (svm_supported()) {
+		use_svm = true;
+		setup_svm();
+	}
+#endif
+
+	isr_counts = (volatile u64 *)calloc(ncpus, sizeof(u64));
+
+	printf("found %d cpus\n", ncpus);
+	printf("running for %lld iterations - test\n",
+		(long long unsigned int)num_iterations);
+
+	/*
+	 * Ensure that we don't have interrupt window pending
+	 * from PIT timer which inhibits the AVIC.
+	 */
+
+	asm volatile("sti;nop;cli\n");
+
+	for (cpu = 0; cpu < ncpus; ++cpu)
+		on_cpu_async(cpu, vcpu_init, (void *)(long)cpu);
+
+	/* now wait for all the vCPUs to finish the IPI function */
+	while (cpus_active() > 1)
+		  pause();
+
+	printf("starting test on all cpus but 0...\n");
+
+	for (cpu = ncpus-1; cpu >= FIRST_TEST_VCPU; cpu--)
+		on_cpu_async(cpu, vcpu_code, (void *)(long)cpu);
+
+	printf("test started, waiting to end...\n");
+
+	while (cpus_active() > 1) {
+
+		unsigned long isr_count1, isr_count2;
+
+		isr_count1 = isr_counts[1];
+		delay(5ULL*1000*1000*1000);
+		isr_count2 = isr_counts[1];
+
+		if (isr_count1 == isr_count2) {
+			printf("\n");
+			printf("hang detected!!\n");
+			//break;
+		} else {
+			printf("made %ld IPIs \n", (isr_count2 - isr_count1)*(ncpus-1));
+		}
+	}
+
+	printf("\n");
+
+	for (cpu = 1; cpu < ncpus; ++cpu)
+		report(isr_counts[cpu] == num_iterations,
+				"Number of IPIs match (%lld)",
+				(long long unsigned int)isr_counts[cpu]);
+
+	free((void*)isr_counts);
+	return report_summary();
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 37017971..c001f42b 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -61,6 +61,11 @@ smp = 2
 file = smptest.flat
 smp = 3
 
+[ipi_stress]
+file = ipi_stress.flat
+extra_params = -cpu host,-x2apic,-svm,-hypervisor -global kvm-pit.lost_tick_policy=discard -machine kernel-irqchip=on -append '50000'
+smp = 4
+
 [vmexit_cpuid]
 file = vmexit.flat
 extra_params = -append 'cpuid'
-- 
2.26.3
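
The ring ordering that vcpu_code() relies on can be modelled outside of
kvm-unit-tests. The following is only a sequential sketch under stated
assumptions: next_dst(), ipi_vector() and simulate_ring() are hypothetical
helper names that do not exist in the patch, and the model ignores real
interrupt delivery, HLT and nesting. It only illustrates why every test vCPU
is expected to end up with exactly num_iterations IPIs.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_TEST_VCPU 1

/* Hypothetical helper: next destination in the ring 1 -> 2 -> ... -> ncpus-1 -> 1. */
static int next_dst(int cpu, int ncpus)
{
	return cpu == ncpus - 1 ? FIRST_TEST_VCPU : cpu + 1;
}

/* Hypothetical helper: the vector alternates per iteration, as in the patch. */
static int ipi_vector(uint64_t i)
{
	return i % 2 ? 0x40 : 0x50;
}

/*
 * Sequential model of the IPI chain: one "token" starts on FIRST_TEST_VCPU
 * and every hop delivers one IPI to the next vCPU in the ring. vCPU 0 never
 * takes part. counts[] must have ncpus zero-initialized entries.
 */
static void simulate_ring(int ncpus, uint64_t iterations, uint64_t *counts)
{
	uint64_t hops = iterations * (uint64_t)(ncpus - 1);
	int cpu = FIRST_TEST_VCPU;
	uint64_t h;

	assert(ncpus > 2);	/* same precondition as main() in the patch */

	for (h = 0; h < hops; h++) {
		cpu = next_dst(cpu, ncpus);
		counts[cpu]++;	/* what ipi_interrupt_handler() would do */
	}
}
```

Calling simulate_ring() with 4 vCPUs and 3 iterations leaves counts[1..3]
at 3 each and counts[0] at 0, which is the condition the report() loop at the
end of main() checks.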


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-28 17:16     ` Maxim Levitsky
@ 2022-04-28 17:21       ` Maxim Levitsky
  0 siblings, 0 replies; 9+ messages in thread
From: Maxim Levitsky @ 2022-04-28 17:21 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, pbonzini, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

On Thu, 2022-04-28 at 20:16 +0300, Maxim Levitsky wrote:
> On Thu, 2022-04-28 at 15:32 +0000, Sean Christopherson wrote:
> > On Tue, Apr 26, 2022, Maxim Levitsky wrote:
> > > I can reproduce this in a VM, by running and CTRL+C'ing my ipi_stress test,
> > 
> > Can you post your ipi_stress test?  I'm curious to see if I can repro, and also
> > very curious as to what might be unique about your test.  I haven't been able to
> > repro the syzbot test, nor have I been able to repro by killing VMs/tests.
> > 
> 
> This is the patch series (mostly an attempt to turn svm into a mini library,
> but I don't know if this is worth it).
> It was done so that ipi_stress could use nesting itself to wait for an IPI
> from within a nested guest. I usually don't use it.
> 
> This is more or less how I was running it lately (I have a wrapper script):
> 
> 
> ./x86/run x86/ipi_stress.flat \
>         -global kvm-pit.lost_tick_policy=discard \
> 	        -machine kernel-irqchip=on -name debug-threads=on  \
> 	        \
> 	        -smp 8 \
> 	        -cpu host,x2apic=off,svm=off,-hypervisor \
> 	        -overcommit cpu-pm=on \
> 	        -m 4g -append "0 10000"

I forgot to mention: this should be run in a loop.

Best regards,
	Maxim Levitsky

> 
> 
> It's not fully finished for upstream; I will get to it soon.
> 
> 'cpu-pm=on' won't work for you, as this fails due to a non-atomic memslot
> update bug for which I have a small hack in qemu, and it is on my
> backlog to fix it correctly.
> 
> Most likely cpu-pm=off will also reproduce it.
> 
> 
> The test was run in a guest; natively this doesn't seem to reproduce.
> The TDP MMU was used for both L0 and L1.
> 
> Best regards,
> 	Maxim Levitsky



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-28 15:32   ` Sean Christopherson
  2022-04-28 17:16     ` Maxim Levitsky
@ 2022-04-28 17:22     ` Paolo Bonzini
  2022-04-28 17:26       ` Maxim Levitsky
  1 sibling, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2022-04-28 17:22 UTC (permalink / raw)
  To: Sean Christopherson, Maxim Levitsky
  Cc: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

On 4/28/22 17:32, Sean Christopherson wrote:
> On Tue, Apr 26, 2022, Maxim Levitsky wrote:
>> I can reproduce this in a VM, by running and CTRL+C'ing my ipi_stress test,
> 
> Can you post your ipi_stress test?  I'm curious to see if I can repro, and also
> very curious as to what might be unique about your test.  I haven't been able to
> repro the syzbot test, nor have I been able to repro by killing VMs/tests.

Did you test with CONFIG_PREEMPT=y?

(BTW, the fact that it reproduces under 5.17 is a mixed blessing, 
because it means that we can analyze/stare at a simpler codebase).

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-28 17:22     ` Paolo Bonzini
@ 2022-04-28 17:26       ` Maxim Levitsky
  2022-04-28 17:43         ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Maxim Levitsky @ 2022-04-28 17:26 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm, linux-kernel,
	mingo, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

On Thu, 2022-04-28 at 19:22 +0200, Paolo Bonzini wrote:
> On 4/28/22 17:32, Sean Christopherson wrote:
> > On Tue, Apr 26, 2022, Maxim Levitsky wrote:
> > > I can reproduce this in a VM, by running and CTRL+C'ing my ipi_stress test,
> > 
> > Can you post your ipi_stress test?  I'm curious to see if I can repro, and also
> > very curious as to what might be unique about your test.  I haven't been able to
> > repro the syzbot test, nor have I been able to repro by killing VMs/tests.
> 
> Did you test with CONFIG_PREEMPT=y?

Yes, I test with CONFIG_PREEMPT, but I only enabled it a day ago.
I think I had seen this warning before, but that could be it; I'll check
whether it fails without CONFIG_PREEMPT as well.


What I recently changed is that I enabled lockdep and related settings
on all my machines and VMs, enabled CONFIG_PREEMPT, and also
switched to tdp_mmu on all systems.

Bugs are biting but it is better this way, especially to weed out
the last bugs of my nested avic code :)

Best regards,
	Maxim Levitsky


> 
> (BTW, the fact that it reproduces under 5.17 is a mixed blessing, 
> because it means that we can analyze/stare at a simpler codebase).
> 
> Paolo
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [syzbot] WARNING in kvm_mmu_uninit_tdp_mmu (2)
  2022-04-28 17:26       ` Maxim Levitsky
@ 2022-04-28 17:43         ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2022-04-28 17:43 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, syzbot, bp, dave.hansen, hpa, jmattson, joro, kvm,
	linux-kernel, mingo, syzkaller-bugs, tglx, vkuznets, wanpengli,
	x86

On Thu, Apr 28, 2022, Maxim Levitsky wrote:
> On Thu, 2022-04-28 at 19:22 +0200, Paolo Bonzini wrote:
> > On 4/28/22 17:32, Sean Christopherson wrote:
> > > On Tue, Apr 26, 2022, Maxim Levitsky wrote:
> > > > I can reproduce this in a VM, by running and CTRL+C'ing my ipi_stress test,
> > > 
> > > Can you post your ipi_stress test?  I'm curious to see if I can repro, and also
> > > very curious as to what might be unique about your test.  I haven't been able to
> > > repro the syzbot test, nor have I been able to repro by killing VMs/tests.
> > 
> > Did you test with CONFIG_PREEMPT=y?
> 
> Yes, I test with CONFIG_PREEMPT, but I only enabled it a day ago.
> I think I had seen this warning before, but that could be it; I'll check
> whether it fails without CONFIG_PREEMPT as well.

I have not tested with CONFIG_PREEMPT.  For some unknown reason the syzbot configs
don't play nice with my VM setup and so I never use them verbatim.  I didn't think
to pull over CONFIG_PREEMPT.  I'll give that a shot.

^ permalink raw reply	[flat|nested] 9+ messages in thread
