* WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
@ 2023-05-26  7:43 Fabio Coatti
  2023-05-26  8:34 ` Bagas Sanjaya
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Fabio Coatti @ 2023-05-26  7:43 UTC (permalink / raw)
  To: stable, regressions, kvm

Hi all,
I'm using vanilla kernels on a Gentoo-based laptop, and since 6.3.2 I'm
getting the kernel log below when running a KVM VM on my box.
I know the kernel is tainted, but not loading the nvidia driver would
complicate things on my side; if needed for debugging, I can try to
avoid it.

I'm not sure which other information is relevant in this context; if you
need more details just let me know, I'm happy to provide them.
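For reference, a minimal sketch of the host-side details that can be
gathered up front for a report like this; the module parameter paths
below are an assumption on my side (in-tree kvm/kvm_intel modules) and
may differ on other setups:

# kernel and QEMU versions
uname -r
qemu-system-x86_64 --version
# NX huge page mitigation knobs driving the recovery worker in the trace
cat /sys/module/kvm/parameters/nx_huge_pages
cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio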

[Fri May 26 09:16:35 2023] ------------[ cut here ]------------
[Fri May 26 09:16:35 2023] WARNING: CPU: 5 PID: 4684 at
kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
[Fri May 26 09:16:35 2023] Modules linked in: vhost_net vhost
vhost_iotlb tap tun tls rfcomm snd_hrtimer snd_seq xt_CHECKSUM
algif_skcipher xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4
ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle
iptable_nat nf_nat iptable_filter ip_tables bpfilter bridge stp llc
rmi_smbus rmi_core bnep squashfs sch_fq_codel nvidia_drm(POE)
intel_rapl_msr vboxnetadp(OE) vboxnetflt(OE) nvidia_modeset(POE)
mei_pxp mei_hdcp rtsx_pci_sdmmc vboxdrv(OE) mmc_core intel_rapl_common
intel_pmc_core_pltdrv intel_pmc_core snd_ctl_led intel_tcc_cooling
snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal
intel_powerclamp btusb btrtl snd_usb_audio btbcm btmtk kvm_intel
btintel snd_hda_intel snd_intel_dspcfg snd_usbmidi_lib snd_hda_codec
snd_rawmidi snd_hwdep bluetooth snd_hda_core snd_seq_device kvm
snd_pcm thinkpad_acpi iwlmvm mousedev ledtrig_audio uvcvideo snd_timer
ecdh_generic irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni
snd think_lmi joydev mei_me ecc uvc
[Fri May 26 09:16:35 2023]  polyval_generic rtsx_pci iwlwifi
firmware_attributes_class psmouse wmi_bmof soundcore intel_pch_thermal
mei platform_profile input_leds evdev nvidia(POE) coretemp hwmon
akvcam(OE) videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev
videobuf2_common mc loop nfsd auth_rpcgss nfs_acl efivarfs dmi_sysfs
dm_zero dm_thin_pool dm_persistent_data dm_bio_prison dm_service_time
dm_round_robin dm_queue_length dm_multipath dm_delay virtio_pci
virtio_pci_legacy_dev virtio_pci_modern_dev virtio_blk virtio_console
virtio_balloon vxlan ip6_udp_tunnel udp_tunnel macvlan virtio_net
net_failover failover virtio_ring virtio fuse overlay nfs lockd grace
sunrpc linear raid10 raid1 raid0 dm_raid raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx md_mod dm_snapshot dm_bufio
dm_crypt trusted asn1_encoder tpm rng_core dm_mirror dm_region_hash
dm_log firewire_core crc_itu_t hid_apple usb_storage ehci_pci ehci_hcd
sr_mod cdrom ahci libahci libata
[Fri May 26 09:16:35 2023] CPU: 5 PID: 4684 Comm: kvm-nx-lpage-re
Tainted: P     U     OE      6.3.4-cova #1
[Fri May 26 09:16:35 2023] Hardware name: LENOVO
20EQS58500/20EQS58500, BIOS N1EET98W (1.71 ) 12/06/2022
[Fri May 26 09:16:35 2023] RIP:
0010:kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
[Fri May 26 09:16:35 2023] Code: 48 8b 44 24 30 4c 39 e0 0f 85 1b fe
ff ff 48 89 df e8 2e ab fb ff e9 23 fe ff ff 49 bc ff ff ff ff ff ff
ff 7f e9 fb fc ff ff <0f> 0b e9 1b ff ff ff 48 8b 44 24 40 65 48 2b 04
25 28 00 00 00 75
[Fri May 26 09:16:35 2023] RSP: 0018:ffff8e1a4403fe68 EFLAGS: 00010246
[Fri May 26 09:16:35 2023] RAX: 0000000000000000 RBX: ffff8e1a42bbd000
RCX: 0000000000000000
[Fri May 26 09:16:35 2023] RDX: 0000000000000000 RSI: 0000000000000000
RDI: 0000000000000000
[Fri May 26 09:16:35 2023] RBP: ffff8b4e9a56d930 R08: 0000000000000000
R09: ffff8b4e9a56d8a0
[Fri May 26 09:16:35 2023] R10: 0000000000000000 R11: 0000000000000001
R12: ffff8e1a4403fe98
[Fri May 26 09:16:35 2023] R13: 0000000000000001 R14: ffff8b4d9c432e80
R15: 0000000000000010
[Fri May 26 09:16:35 2023] FS:  0000000000000000(0000)
GS:ffff8b5cdf740000(0000) knlGS:0000000000000000
[Fri May 26 09:16:35 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri May 26 09:16:35 2023] CR2: 00007efeac53d000 CR3: 0000000978c2c003
CR4: 00000000003726e0
[Fri May 26 09:16:35 2023] Call Trace:
[Fri May 26 09:16:35 2023]  <TASK>
[Fri May 26 09:16:35 2023]  ?
__pfx_kvm_nx_huge_page_recovery_worker+0x10/0x10 [kvm]
[Fri May 26 09:16:35 2023]  kvm_vm_worker_thread+0x106/0x1c0 [kvm]
[Fri May 26 09:16:35 2023]  ? __pfx_kvm_vm_worker_thread+0x10/0x10 [kvm]
[Fri May 26 09:16:35 2023]  kthread+0xd9/0x100
[Fri May 26 09:16:35 2023]  ? __pfx_kthread+0x10/0x10
[Fri May 26 09:16:35 2023]  ret_from_fork+0x2c/0x50
[Fri May 26 09:16:35 2023]  </TASK>
[Fri May 26 09:16:35 2023] ---[ end trace 0000000000000000 ]---
-- 
Fabio


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-26  7:43 WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4 Fabio Coatti
@ 2023-05-26  8:34 ` Bagas Sanjaya
  2023-05-26 17:01 ` Sean Christopherson
  2023-05-28 12:44 ` Bagas Sanjaya
  2 siblings, 0 replies; 11+ messages in thread
From: Bagas Sanjaya @ 2023-05-26  8:34 UTC (permalink / raw)
  To: Fabio Coatti, stable, regressions, kvm

On Fri, May 26, 2023 at 09:43:17AM +0200, Fabio Coatti wrote:
> Hi all,
> I'm using vanilla kernels on a Gentoo-based laptop, and since 6.3.2 I'm
> getting the kernel log below when running a KVM VM on my box.
> I know the kernel is tainted, but not loading the nvidia driver would
> complicate things on my side; if needed for debugging, I can try to
> avoid it.

Can you try uninstalling the nvidia driver (which should leave the kernel
untainted) and reproducing this regression?

-- 
An old man doll... just what I always wanted! - Clara


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-26  7:43 WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4 Fabio Coatti
  2023-05-26  8:34 ` Bagas Sanjaya
@ 2023-05-26 17:01 ` Sean Christopherson
  2023-05-28  9:22   ` Fabio Coatti
  2023-05-28 10:54   ` Fabio Coatti
  2023-05-28 12:44 ` Bagas Sanjaya
  2 siblings, 2 replies; 11+ messages in thread
From: Sean Christopherson @ 2023-05-26 17:01 UTC (permalink / raw)
  To: Fabio Coatti; +Cc: stable, regressions, kvm

On Fri, May 26, 2023, Fabio Coatti wrote:
> Hi all,
> I'm using vanilla kernels on a Gentoo-based laptop, and since 6.3.2

What was the last kernel you used that didn't trigger this WARN?

> I'm getting the kernel log below when running a KVM VM on my box.

Are you doing anything "interesting" when the WARN fires, or are you just running
the VM and it fires at random?  Either way, can you provide your QEMU command line?

> I know the kernel is tainted, but not loading the nvidia driver would
> complicate things on my side; if needed for debugging, I can try to avoid it.

Nah, don't worry about that at this point.

> I'm not sure which other information is relevant in this context; if you
> need more details just let me know, I'm happy to provide them.
> 
> [Fri May 26 09:16:35 2023] ------------[ cut here ]------------
> [Fri May 26 09:16:35 2023] WARNING: CPU: 5 PID: 4684 at
> kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]

Do you have the actual line number for the WARN?  There are a handful of sanity
checks in kvm_recover_nx_huge_pages(), it would be helpful to pinpoint which one
is firing.  My builds generate quite different code, and the code stream doesn't
appear to be useful for reverse engineering the location.
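For reference, a minimal sketch of how the offset can be resolved locally,
assuming access to the exact build tree (with debug info) that produced the
module; the paths are illustrative:

# Resolve function+offset to file:line with the kernel's own helper,
# run from the build tree that produced kvm.ko:
./scripts/faddr2line arch/x86/kvm/kvm.ko kvm_nx_huge_page_recovery_worker+0x38c/0x3d0
# Or decode a whole captured trace (may need base/module paths as extra arguments):
dmesg | ./scripts/decode_stacktrace.sh vmlinux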


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-26 17:01 ` Sean Christopherson
@ 2023-05-28  9:22   ` Fabio Coatti
  2023-05-28 10:54   ` Fabio Coatti
  1 sibling, 0 replies; 11+ messages in thread
From: Fabio Coatti @ 2023-05-28  9:22 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: stable, regressions, kvm

On Fri, May 26, 2023 at 19:01, Sean Christopherson
<seanjc@google.com> wrote:

> > I'm using vanilla kernels on a Gentoo-based laptop, and since 6.3.2
>
> What was the last kernel you used that didn't trigger this WARN?

6.3.1

>
> > I'm getting the kernel log below when running a KVM VM on my box.
>
> Are you doing anything "interesting" when the WARN fires, or are you just running
> the VM and it fires at random?  Either way, can you provide your QEMU command line?

I'm not able to spot a specific action that triggers the dump. This time it
happened when I was "simply" opening a new Chrome page in the guest
VM. I guess that causes some work on the mm side, but it's not really an
"interesting" action, I'd say. Basically, I fired up the guest machine
(a very basic Ubuntu 22.04) on a freshly rebooted host, connected a USB
device (a YubiKey) and started Chrome. There was no message right after
starting Chrome, only when I opened a new page.

Anyway, this is the command line (libvirt-managed VM):

/usr/sbin/qemu-system-x86_64 -name guest=ubuntu-u2204-kvm,debug-threads=on -S
-object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-ubuntu-u2204-kvm/master-key.aes"}
-blockdev {"driver":"file","filename":"/usr/share/edk2-ovmf/OVMF_CODE.secboot.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}
-blockdev {"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/ubuntu-u2204-kvm_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}
-machine pc-q35-7.1,usb=off,vmport=off,smm=on,dump-guest-core=off,memory-backend=pc.ram,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,hpet=off,acpi=on
-accel kvm
-cpu host,migratable=on
-global driver=cfi.pflash01,property=secure,value=on
-m 16384
-object {"qom-type":"memory-backend-ram","id":"pc.ram","size":17179869184}
-overcommit mem-lock=off
-smp 4,sockets=4,cores=1,threads=1
-uuid 160141fc-ec2e-4d91-bc1c-3e597643bcfd
-no-user-config
-nodefaults
-chardev socket,id=charmonitor,fd=30,server=on,wait=off
-mon chardev=charmonitor,id=monitor,mode=control
-rtc base=utc,driftfix=slew
-global kvm-pit.lost_tick_policy=delay
-no-shutdown
-global ICH9-LPC.disable_s3=1
-global ICH9-LPC.disable_s4=1
-boot strict=on
-device {"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}
-device {"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"}
-device {"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"}
-device {"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"}
-device {"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"}
-device {"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"}
-device {"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"}
-device {"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}
-device {"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"}
-device {"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"}
-device {"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"}
-device {"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"}
-device {"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"}
-device {"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"}
-device {"driver":"qemu-xhci","p2":15,"p3":15,"id":"usb","bus":"pci.2","addr":"0x0"}
-device {"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.3","addr":"0x0"}
-blockdev {"driver":"file","filename":"/var/lib/libvirt/images/ubuntu22.04.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}
-blockdev {"node-name":"libvirt-2-format","read-only":false,"discard":"unmap","driver":"qcow2","file":"libvirt-2-storage","backing":null}
-device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-2-format","id":"virtio-disk0","bootindex":1}
-device {"driver":"ide-cd","bus":"ide.0","id":"sata0-0-0"} -netdev
{"type":"tap","fd":"32","vhost":true,"vhostfd":"34","id":"hostnet0"}
-device {"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"52:54:00:17:0a:44","bus":"pci.1","addr":"0x0"}
-chardev pty,id=charserial0 -device
{"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0}
-chardev socket,id=charchannel0,fd=28,server=on,wait=off
-device {"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}
-chardev spicevmc,id=charchannel1,name=vdagent -device
{"driver":"virtserialport","bus":"virtio-serial0.0","nr":2,"chardev":"charchannel1","id":"channel1","name":"com.redhat.spice.0"}
-device {"driver":"usb-tablet","id":"input0","bus":"usb.0","port":"1"}
-audiodev {"id":"audio1","driver":"spice"}
-spice port=0,disable-ticketing=on,image-compression=off,seamless-migration=on
-device {"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"}
-device {"driver":"ich9-intel-hda","id":"sound0","bus":"pcie.0","addr":"0x1b"}
-device {"driver":"hda-duplex","id":"sound0-codec0","bus":"sound0.0","cad":0,"audiodev":"audio1"}
-global ICH9-LPC.noreboot=off
-watchdog-action reset
-chardev spicevmc,id=charredir0,name=usbredir
-device {"driver":"usb-redir","chardev":"charredir0","id":"redir0","bus":"usb.0","port":"2"}
-chardev spicevmc,id=charredir1,name=usbredir
-device {"driver":"usb-redir","chardev":"charredir1","id":"redir1","bus":"usb.0","port":"3"}
-device {"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.5","addr":"0x0"}
-object {"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}
-device {"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.6","addr":"0x0"}
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
-msg timestamp=on

ps output (from a different run than the one in the first report, of course):
     57176 ?        I<     0:00 [kvm]
     57178 ?        S      0:00 [kvm-nx-lpage-recovery-57159]
     57189 ?        S      0:00 [kvm-pit/57159]


> > I'm not sure which other information is relevant in this context; if you
> > need more details just let me know, I'm happy to provide them.
> >
> > [Fri May 26 09:16:35 2023] ------------[ cut here ]------------
> > [Fri May 26 09:16:35 2023] WARNING: CPU: 5 PID: 4684 at
> > kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
>
> Do you have the actual line number for the WARN?  There are a handful of sanity
> checks in kvm_recover_nx_huge_pages(), it would be helpful to pinpoint which one
> is firing.  My builds generate quite different code, and the code stream doesn't
> appear to be useful for reverse engineering the location.

That's the full message I get. Maybe I should recompile the host
kernel with some debug options enabled; any specific suggestions?
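One possible direction, purely an assumption on my side rather than a
suggestion from the thread: CONFIG_DEBUG_BUGVERBOSE puts the file:line into
WARN output, and keeping DWARF debug info lets scripts/faddr2line resolve
offsets afterwards. A minimal sketch, run from the kernel source tree with
the current .config in place:

./scripts/config --enable DEBUG_BUGVERBOSE
./scripts/config --enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
make olddefconfig
make -j"$(nproc)" && make modules_install && make install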



-- 
Fabio


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-26 17:01 ` Sean Christopherson
  2023-05-28  9:22   ` Fabio Coatti
@ 2023-05-28 10:54   ` Fabio Coatti
  1 sibling, 0 replies; 11+ messages in thread
From: Fabio Coatti @ 2023-05-28 10:54 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: stable, regressions, kvm

On Fri, May 26, 2023 at 19:01, Sean Christopherson
<seanjc@google.com> wrote:

>
> Do you have the actual line number for the WARN?  There are a handful of sanity
> checks in kvm_recover_nx_huge_pages(), it would be helpful to pinpoint which one
> is firing.  My builds generate quite different code, and the code stream doesn't
> appear to be useful for reverse engineering the location.

I just got the following: arch/x86/kvm/mmu/mmu.c:7015, so it's seemingly this check:

		if (atomic_read(&kvm->nr_memslots_dirty_logging)) {
			slot = gfn_to_memslot(kvm, sp->gfn);
			WARN_ON_ONCE(!slot);
		}


[Sun May 28 12:48:12 2023] ------------[ cut here ]------------
[Sun May 28 12:48:12 2023] WARNING: CPU: 1 PID: 3911 at
arch/x86/kvm/mmu/mmu.c:7015
kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
[Sun May 28 12:48:12 2023] Modules linked in: vhost_net vhost
vhost_iotlb tap tun rfcomm snd_hrtimer snd_seq xt_CHECKSUM
xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle
ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat
nf_nat iptable_filter ip_tables bpfilter bridge stp llc algif_skcipher
bnep rmi_smbus rmi_core squashfs sch_fq_codel vboxnetadp(OE)
nvidia_drm(POE) vboxnetflt(OE) rtsx_pci_sdmmc intel_rapl_msr
nvidia_modeset(POE) mmc_core mei_pxp mei_hdcp vboxdrv(OE) snd_ctl_led
intel_rapl_common snd_hda_codec_realtek intel_pmc_core_pltdrv
snd_hda_codec_generic intel_pmc_core intel_tcc_cooling
x86_pkg_temp_thermal intel_powerclamp btusb snd_hda_intel btrtl btbcm
snd_intel_dspcfg btmtk snd_usb_audio kvm_intel btintel snd_usbmidi_lib
snd_hda_codec snd_hwdep kvm snd_rawmidi iwlmvm snd_hda_core
snd_seq_device bluetooth snd_pcm thinkpad_acpi irqbypass
crct10dif_pclmul crc32_pclmul snd_timer mei_me ledtrig_audio
ecdh_generic psmouse joydev think_lmi uvcvideo polyval_clmulni snd
polyval_generic wmi_bmof
[Sun May 28 12:48:12 2023]  firmware_attributes_class iwlwifi rtsx_pci
uvc ecc mousedev soundcore mei intel_pch_thermal platform_profile
evdev input_leds nvidia(POE) coretemp hwmon akvcam(OE)
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev
videobuf2_common mc loop nfsd auth_rpcgss nfs_acl efivarfs dmi_sysfs
dm_zero dm_thin_pool dm_persistent_data dm_bio_prison dm_service_time
dm_round_robin dm_queue_length dm_multipath dm_delay virtio_pci
virtio_pci_legacy_dev virtio_pci_modern_dev virtio_blk virtio_console
virtio_balloon vxlan ip6_udp_tunnel udp_tunnel macvlan virtio_net
net_failover failover virtio_ring virtio fuse overlay nfs lockd grace
sunrpc linear raid10 raid1 raid0 dm_raid raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx md_mod dm_snapshot dm_bufio
dm_crypt trusted asn1_encoder tpm rng_core dm_mirror dm_region_hash
dm_log firewire_core crc_itu_t hid_apple usb_storage ehci_pci ehci_hcd
sr_mod cdrom ahci libahci libata
[Sun May 28 12:48:12 2023] CPU: 1 PID: 3911 Comm: kvm-nx-lpage-re
Tainted: P     U     OE      6.3.4-cova #2
[Sun May 28 12:48:12 2023] Hardware name: LENOVO
20EQS58500/20EQS58500, BIOS N1EET98W (1.71 ) 12/06/2022
[Sun May 28 12:48:12 2023] RIP:
0010:kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
[Sun May 28 12:48:12 2023] Code: 48 8b 44 24 30 4c 39 e0 0f 85 1b fe
ff ff 48 89 df e8 2e ab fb ff e9 23 fe ff ff 49 bc ff ff ff ff ff ff
ff 7f e9 fb fc ff ff <0f> 0b e9 1b ff ff ff 48 8b 44 24 40 65 48 2b 04
25 28 00 00 00 75
[Sun May 28 12:48:12 2023] RSP: 0018:ffff99b284f0be68 EFLAGS: 00010246
[Sun May 28 12:48:12 2023] RAX: 0000000000000000 RBX: ffff99b284edd000
RCX: 0000000000000000
[Sun May 28 12:48:12 2023] RDX: 0000000000000000 RSI: 0000000000000000
RDI: 0000000000000000
[Sun May 28 12:48:12 2023] RBP: ffff9271397024e0 R08: 0000000000000000
R09: ffff927139702450
[Sun May 28 12:48:12 2023] R10: 0000000000000000 R11: 0000000000000001
R12: ffff99b284f0be98
[Sun May 28 12:48:12 2023] R13: 0000000000000000 R14: ffff9270991fcd80
R15: 0000000000000003
[Sun May 28 12:48:12 2023] FS:  0000000000000000(0000)
GS:ffff927f9f640000(0000) knlGS:0000000000000000
[Sun May 28 12:48:12 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sun May 28 12:48:12 2023] CR2: 00007f0aacad3ae0 CR3: 000000088fc2c005
CR4: 00000000003726e0
[Sun May 28 12:48:12 2023] Call Trace:
[Sun May 28 12:48:12 2023]  <TASK>
[Sun May 28 12:48:12 2023]  ?
__pfx_kvm_nx_huge_page_recovery_worker+0x10/0x10 [kvm]
[Sun May 28 12:48:12 2023]  kvm_vm_worker_thread+0x106/0x1c0 [kvm]
[Sun May 28 12:48:12 2023]  ? __pfx_kvm_vm_worker_thread+0x10/0x10 [kvm]
[Sun May 28 12:48:12 2023]  kthread+0xd9/0x100
[Sun May 28 12:48:12 2023]  ? __pfx_kthread+0x10/0x10
[Sun May 28 12:48:12 2023]  ret_from_fork+0x2c/0x50
[Sun May 28 12:48:12 2023]  </TASK>
[Sun May 28 12:48:12 2023] ---[ end trace 0000000000000000 ]---


-- 
Fabio


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-26  7:43 WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4 Fabio Coatti
  2023-05-26  8:34 ` Bagas Sanjaya
  2023-05-26 17:01 ` Sean Christopherson
@ 2023-05-28 12:44 ` Bagas Sanjaya
  2023-05-30 10:42   ` Fabio Coatti
  2 siblings, 1 reply; 11+ messages in thread
From: Bagas Sanjaya @ 2023-05-28 12:44 UTC (permalink / raw)
  To: Fabio Coatti, stable, regressions, kvm; +Cc: Junaid Shahid, Paolo Bonzini

On Fri, May 26, 2023 at 09:43:17AM +0200, Fabio Coatti wrote:
> Hi all,
> I'm using vanilla kernels on a Gentoo-based laptop, and since 6.3.2 I'm
> getting the kernel log below when running a KVM VM on my box.
> I know the kernel is tainted, but not loading the nvidia driver would
> complicate things on my side; if needed for debugging, I can try to
> avoid it.
> 
> I'm not sure which other information is relevant in this context; if you
> need more details just let me know, I'm happy to provide them.
> 
> [Fri May 26 09:16:35 2023] ------------[ cut here ]------------
> [Fri May 26 09:16:35 2023] WARNING: CPU: 5 PID: 4684 at
> kvm_nx_huge_page_recovery_worker+0x38c/0x3d0 [kvm]
> [...]

Thanks for the regression report. I'm adding it to regzbot:

#regzbot ^introduced: v6.3.1..v6.3.2
#regzbot title: WARNING trace at kvm_nx_huge_page_recovery_worker when opening a new tab in Chrome

Fabio, can you also check the mainline (on guest)?

-- 
An old man doll... just what I always wanted! - Clara


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-28 12:44 ` Bagas Sanjaya
@ 2023-05-30 10:42   ` Fabio Coatti
  2023-05-30 17:37     ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Fabio Coatti @ 2023-05-30 10:42 UTC (permalink / raw)
  To: Bagas Sanjaya; +Cc: stable, regressions, kvm, Junaid Shahid, Paolo Bonzini

On Sun, May 28, 2023 at 14:44, Bagas Sanjaya
<bagasdotme@gmail.com> wrote:

>
> Thanks for the regression report. I'm adding it to regzbot:
>
> #regzbot ^introduced: v6.3.1..v6.3.2
> #regzbot title: WARNING trace at kvm_nx_huge_page_recovery_worker when opening a new tab in Chrome

Out of curiosity, I recompiled 6.3.4 after reverting the following
commit mentioned in 6.3.2 changelog:

commit 2ec1fe292d6edb3bd112f900692d9ef292b1fa8b
Author: Sean Christopherson <seanjc@google.com>
Date:   Wed Apr 26 15:03:23 2023 -0700
KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated
commit edbdb43fc96b11b3bfa531be306a1993d9fe89ec upstream.

And the WARN message no longer appears on my host kernel logs, at
least so far :)
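For completeness, a sketch of the revert-and-rebuild step described above, as
it can be done on a linux-stable git checkout (the branch name is arbitrary;
the commit id is the 6.3.2 backport quoted above, and a suitable .config is
assumed to be in place):

git checkout -b nx-warn-revert v6.3.4
git revert --no-edit 2ec1fe292d6edb3bd112f900692d9ef292b1fa8b
make olddefconfig
make -j"$(nproc)" && make modules_install && make install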

>
> Fabio, can you also check the mainline (on guest)?

I'm not sure I understand: you mean 6.4-rcX? I can do that, sure, but why
on the guest? The WARN appears in the host logs, i.e. on the machine running
the 6.3.4 kernel. The guest is a standard Ubuntu 22.04, currently running the
5.19.0-42-generic (Ubuntu) kernel.


-- 
Fabio


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-30 10:42   ` Fabio Coatti
@ 2023-05-30 17:37     ` Sean Christopherson
  2023-05-31  1:27       ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2023-05-30 17:37 UTC (permalink / raw)
  To: Fabio Coatti
  Cc: Bagas Sanjaya, stable, regressions, kvm, Junaid Shahid, Paolo Bonzini

On Tue, May 30, 2023, Fabio Coatti wrote:
> On Sun, May 28, 2023 at 14:44, Bagas Sanjaya
> <bagasdotme@gmail.com> wrote:
> > #regzbot ^introduced: v6.3.1..v6.3.2
> > #regzbot title: WARNING trace at kvm_nx_huge_page_recovery_worker when opening a new tab in Chrome
> 
> Out of curiosity, I recompiled 6.3.4 after reverting the following
> commit mentioned in 6.3.2 changelog:
> 
> commit 2ec1fe292d6edb3bd112f900692d9ef292b1fa8b
> Author: Sean Christopherson <seanjc@google.com>
> Date:   Wed Apr 26 15:03:23 2023 -0700
> KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated
> commit edbdb43fc96b11b3bfa531be306a1993d9fe89ec upstream.
> 
> And the WARN message no longer appears on my host kernel logs, at
> least so far :)

Hmm, more than likely an NX shadow page is outliving a memslot update.  I'll take
another look at those flows to see if I can spot a race or leak.

> > Fabio, can you also check the mainline (on guest)?
> 
> I'm not sure I understand: you mean 6.4-rcX? I can do that, sure, but why
> on the guest?

Misunderstanding probably?  Please do test with 6.4-rcX on the host.  I expect
the WARN to reproduce there as well, but if it doesn't then we'll have a very
useful datapoint.


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-30 17:37     ` Sean Christopherson
@ 2023-05-31  1:27       ` Sean Christopherson
  2023-05-31  2:04         ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2023-05-31  1:27 UTC (permalink / raw)
  To: Fabio Coatti
  Cc: Bagas Sanjaya, stable, regressions, kvm, Junaid Shahid, Paolo Bonzini

On Tue, May 30, 2023, Sean Christopherson wrote:
> On Tue, May 30, 2023, Fabio Coatti wrote:
> > On Sun, May 28, 2023 at 14:44, Bagas Sanjaya
> > <bagasdotme@gmail.com> wrote:
> > > #regzbot ^introduced: v6.3.1..v6.3.2
> > > #regzbot title: WARNING trace at kvm_nx_huge_page_recovery_worker when opening a new tab in Chrome
> > 
> > Out of curiosity, I recompiled 6.3.4 after reverting the following
> > commit mentioned in 6.3.2 changelog:
> > 
> > commit 2ec1fe292d6edb3bd112f900692d9ef292b1fa8b
> > Author: Sean Christopherson <seanjc@google.com>
> > Date:   Wed Apr 26 15:03:23 2023 -0700
> > KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated
> > commit edbdb43fc96b11b3bfa531be306a1993d9fe89ec upstream.
> > 
> > And the WARN message no longer appears on my host kernel logs, at
> > least so far :)
> 
> Hmm, more than likely an NX shadow page is outliving a memslot update.  I'll take
> another look at those flows to see if I can spot a race or leak.

I didn't spot anything, and I couldn't reproduce the WARN even when dropping the
dirty logging requirement and hacking KVM to periodically delete memslots.

printk debugging it is...  Can you run with this and report back?

---
 arch/x86/kvm/mmu/mmu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d3812de54b02..89c2e5ee7d36 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -855,6 +855,8 @@ void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 	if (!list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
+	sp->mmu_valid_gen = kvm->arch.mmu_valid_gen;
+
 	++kvm->stat.nx_lpage_splits;
 	list_add_tail(&sp->possible_nx_huge_page_link,
 		      &kvm->arch.possible_nx_huge_pages);
@@ -7012,7 +7014,9 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 		slot = NULL;
 		if (atomic_read(&kvm->nr_memslots_dirty_logging)) {
 			slot = gfn_to_memslot(kvm, sp->gfn);
-			WARN_ON_ONCE(!slot);
+			if (WARN_ON_ONCE(!slot))
+				pr_warn_ratelimited("No slot for gfn = %llx, role = %x, TDP MMU = %u, root count = %u, gen = %u vs %u\n",
+						    sp->gfn, sp->role.word, sp->tdp_mmu_page, sp->root_count, sp->mmu_valid_gen, kvm->arch.mmu_valid_gen);
 		}
 
 		if (slot && kvm_slot_dirty_track_enabled(slot))

base-commit: 17f2d782f18c9a49943ea723d7628da1837c9204
-- 
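A sketch of how a debug patch like the one above can be applied and exercised,
assuming an in-tree build of the 6.3.4 sources (the diff file name is arbitrary):

# From the top of the kernel source tree:
patch -p1 < kvm-nx-debug.diff
make -j"$(nproc)" && make modules_install && make install
# After rebooting into the patched kernel and reproducing the issue:
dmesg | grep "No slot for gfn"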


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-31  1:27       ` Sean Christopherson
@ 2023-05-31  2:04         ` Sean Christopherson
  2023-06-01  8:38           ` Fabio Coatti
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2023-05-31  2:04 UTC (permalink / raw)
  To: Fabio Coatti
  Cc: Bagas Sanjaya, stable, regressions, kvm, Junaid Shahid, Paolo Bonzini

On Tue, May 30, 2023, Sean Christopherson wrote:
> On Tue, May 30, 2023, Sean Christopherson wrote:
> > On Tue, May 30, 2023, Fabio Coatti wrote:
> > > On Sun, May 28, 2023 at 14:44, Bagas Sanjaya
> > > <bagasdotme@gmail.com> wrote:
> > > > #regzbot ^introduced: v6.3.1..v6.3.2
> > > > #regzbot title: WARNING trace at kvm_nx_huge_page_recovery_worker when opening a new tab in Chrome
> > > 
> > > Out of curiosity, I recompiled 6.3.4 after reverting the following
> > > commit mentioned in 6.3.2 changelog:
> > > 
> > > commit 2ec1fe292d6edb3bd112f900692d9ef292b1fa8b
> > > Author: Sean Christopherson <seanjc@google.com>
> > > Date:   Wed Apr 26 15:03:23 2023 -0700
> > > KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated
> > > commit edbdb43fc96b11b3bfa531be306a1993d9fe89ec upstream.
> > > 
> > > And the WARN message no longer appears on my host kernel logs, at
> > > least so far :)
> > 
> > Hmm, more than likely an NX shadow page is outliving a memslot update.  I'll take
> > another look at those flows to see if I can spot a race or leak.
> 
> I didn't spot anything, and I couldn't reproduce the WARN even when dropping the
> dirty logging requirement and hacking KVM to periodically delete memslots.

Aha!  Apparently my brain was just waiting until I sat down for dinner to have
its lightbulb moment.

The memslot lookup isn't factoring in whether the shadow page is for non-SMM versus
SMM.  QEMU configures SMM to have memslots that do not exist in the non-SMM world,
so if kvm_recover_nx_huge_pages() encounters an SMM shadow page, the memslot lookup
can fail to find a memslot because it looks only in the set of non-SMM memslots.

Before commit 2ec1fe292d6e ("KVM: x86: Preserve TDP MMU roots until they are
explicitly invalidated"), KVM would zap all SMM TDP MMU roots and thus all SMM TDP
MMU shadow pages once all vCPUs exited SMM.  That made the window where this bug
could be encountered quite tiny, as the NX recovery thread would have to kick in
while at least one vCPU was in SMM.  QEMU VMs typically only use SMM during boot,
and so the "bad" shadow pages were gone by the time the NX recovery thread ran.

Now that KVM preserves TDP MMU roots until they are explicitly invalidated (by a
memslot deletion), the window to encounter the bug is effectively never closed
because QEMU doesn't delete memslots after boot (except for a handful of special
scenarios).

Assuming I'm correct, this should fix the issue:

---
 arch/x86/kvm/mmu/mmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d3812de54b02..d5c03f14cdc7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -7011,7 +7011,10 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 		 */
 		slot = NULL;
 		if (atomic_read(&kvm->nr_memslots_dirty_logging)) {
-			slot = gfn_to_memslot(kvm, sp->gfn);
+			struct kvm_memslots *slots;
+
+			slots = kvm_memslots_for_spte_role(kvm, sp->role);
+			slot = __gfn_to_memslot(slots, sp->gfn);
 			WARN_ON_ONCE(!slot);
 		}
 

base-commit: 17f2d782f18c9a49943ea723d7628da1837c9204
-- 


* Re: WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4
  2023-05-31  2:04         ` Sean Christopherson
@ 2023-06-01  8:38           ` Fabio Coatti
  0 siblings, 0 replies; 11+ messages in thread
From: Fabio Coatti @ 2023-06-01  8:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Bagas Sanjaya, stable, regressions, kvm, Junaid Shahid, Paolo Bonzini

On Wed, May 31, 2023 at 04:04, Sean Christopherson
<seanjc@google.com> wrote:
>
> On Tue, May 30, 2023, Sean Christopherson wrote:
> > On Tue, May 30, 2023, Sean Christopherson wrote:
> > > On Tue, May 30, 2023, Fabio Coatti wrote:
> > > > On Sun, May 28, 2023 at 14:44, Bagas Sanjaya
> > > > <bagasdotme@gmail.com> wrote:
> > > > > #regzbot ^introduced: v6.3.1..v6.3.2
> > > > > #regzbot title: WARNING trace at kvm_nx_huge_page_recovery_worker when opening a new tab in Chrome
> > > >
> > > > Out of curiosity, I recompiled 6.3.4 after reverting the following
> > > > commit mentioned in 6.3.2 changelog:
> > > >
> > > > commit 2ec1fe292d6edb3bd112f900692d9ef292b1fa8b
> > > > Author: Sean Christopherson <seanjc@google.com>
> > > > Date:   Wed Apr 26 15:03:23 2023 -0700
> > > > KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated
> > > > commit edbdb43fc96b11b3bfa531be306a1993d9fe89ec upstream.
> > > >
> > > > And the WARN message no longer appears on my host kernel logs, at
> > > > least so far :)
> > >
> > > Hmm, more than likely an NX shadow page is outliving a memslot update.  I'll take
> > > another look at those flows to see if I can spot a race or leak.
> >
> > I didn't spot anything, and I couldn't reproduce the WARN even when dropping the
> > dirty logging requirement and hacking KVM to periodically delete memslots.
>
> Aha!  Apparently my brain was just waiting until I sat down for dinner to have
> its lightbulb moment.
>
> The memslot lookup isn't factoring in whether the shadow page is for non-SMM versus
> SMM.  QEMU configures SMM to have memslots that do not exist in the non-SMM world,
> so if kvm_recover_nx_huge_pages() encounters an SMM shadow page, the memslot lookup
> can fail to find a memslot because it looks only in the set of non-SMM memslots.
>
> Before commit 2ec1fe292d6e ("KVM: x86: Preserve TDP MMU roots until they are
> explicitly invalidated"), KVM would zap all SMM TDP MMU roots and thus all SMM TDP
> MMU shadow pages once all vCPUs exited SMM.  That made the window where this bug
> could be encountered quite tiny, as the NX recovery thread would have to kick in
> while at least one vCPU was in SMM.  QEMU VMs typically only use SMM during boot,
> and so the "bad" shadow pages were gone by the time the NX recovery thread ran.
>
> Now that KVM preserves TDP MMU roots until they are explicitly invalidated (by a
> memslot deletion), the window to encounter the bug is effectively never closed
> because QEMU doesn't delete memslots after boot (except for a handful of special
> scenarios).
>
> Assuming I'm correct, this should fix the issue:
>
> ---
>  arch/x86/kvm/mmu/mmu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index d3812de54b02..d5c03f14cdc7 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -7011,7 +7011,10 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
>                  */
>                 slot = NULL;
>                 if (atomic_read(&kvm->nr_memslots_dirty_logging)) {
> -                       slot = gfn_to_memslot(kvm, sp->gfn);
> +                       struct kvm_memslots *slots;
> +
> +                       slots = kvm_memslots_for_spte_role(kvm, sp->role);
> +                       slot = __gfn_to_memslot(slots, sp->gfn);
>                         WARN_ON_ONCE(!slot);
>                 }
>
>
> base-commit: 17f2d782f18c9a49943ea723d7628da1837c9204

I applied this patch on the same kernel I was using for testing
(6.3.4) and indeed I no longer see the WARN message, so I assume that
you are indeed correct :) Many thanks, it seems to be fixed, at least
on my machine!



-- 
Fabio


end of thread, other threads:[~2023-06-01  8:38 UTC | newest]

Thread overview: 11+ messages
2023-05-26  7:43 WARNING trace at kvm_nx_huge_page_recovery_worker on 6.3.4 Fabio Coatti
2023-05-26  8:34 ` Bagas Sanjaya
2023-05-26 17:01 ` Sean Christopherson
2023-05-28  9:22   ` Fabio Coatti
2023-05-28 10:54   ` Fabio Coatti
2023-05-28 12:44 ` Bagas Sanjaya
2023-05-30 10:42   ` Fabio Coatti
2023-05-30 17:37     ` Sean Christopherson
2023-05-31  1:27       ` Sean Christopherson
2023-05-31  2:04         ` Sean Christopherson
2023-06-01  8:38           ` Fabio Coatti
