LinuxPPC-Dev Archive on lore.kernel.org
* QEMU/KVM snapshot restore bug
@ 2020-02-11  3:57 dftxbs3e
  2020-02-12 11:05 ` Greg Kurz
  2020-02-16 18:16 ` Cédric Le Goater
  0 siblings, 2 replies; 6+ messages in thread
From: dftxbs3e @ 2020-02-11  3:57 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1.1.1: Type: text/plain, Size: 4205 bytes --]

Hello,

I took a snapshot of a ppc64 (big endian) VM from a ppc64 (little endian) host using `virsh snapshot-create-as --domain <name> --name <name>`

Then I restarted my system and tried restoring the snapshot:

# virsh snapshot-revert --domain <name> --snapshotname <name>
error: internal error: process exited while connecting to monitor: 2020-02-11T03:18:08.110582Z qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: Group 3 attr 0x0000000000001309: Device or resource busy
2020-02-11T03:18:08.110605Z qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr'
2020-02-11T03:18:08.112843Z qemu-system-ppc64: Error -1 while loading VM state

And dmesg shows this each time the restore command is executed:

[  180.176606] WARNING: CPU: 16 PID: 5528 at arch/powerpc/kvm/book3s_xive.c:345 xive_try_pick_queue+0x40/0xb8 [kvm]
[  180.176608] Modules linked in: vhost_net vhost tap kvm_hv kvm xt_CHECKSUM xt_MASQUERADE nf_nat_tftp nf_conntrack_tftp tun bridge 8021q garp mrp stp llc rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_REJECT nf_reject_ipv6 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc raid1 at24 regmap_i2c snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg joydev snd_hda_codec snd_hda_core ofpart snd_hwdep crct10dif_vpmsum snd_seq ipmi_powernv powernv_flash ipmi_devintf snd_seq_device mtd ipmi_msghandler rtc_opal snd_pcm opal_prd i2c_opal snd_timer snd soundcore lz4 lz4_compress zram ip_tables xfs libcrc32c dm_crypt amdgpu ast drm_vram_helper mfd_core i2c_algo_bit gpu_sched drm_kms_helper mpt3sas
[  180.176652]  syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm vmx_crypto tg3 crc32c_vpmsum nvme raid_class scsi_transport_sas nvme_core drm_panel_orientation_quirks i2c_core fuse
[  180.176663] CPU: 16 PID: 5528 Comm: qemu-system-ppc Not tainted 5.4.17-200.fc31.ppc64le #1
[  180.176665] NIP:  c00800000a883c80 LR: c00800000a886db8 CTR: c00800000a88a9e0
[  180.176667] REGS: c000000767a17890 TRAP: 0700   Not tainted  (5.4.17-200.fc31.ppc64le)
[  180.176668] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48224248  XER: 20040000
[  180.176673] CFAR: c00800000a886db4 IRQMASK: 0 
               GPR00: c00800000a886db8 c000000767a17b20 c00800000a8aed00 c0002005468a4480 
               GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 
               GPR08: c0002007142b2400 c0002007142b2400 0000000000000000 c00800000a8910f0 
               GPR12: c00800000a88a488 c0000007fffed000 0000000000000000 0000000000000000 
               GPR16: 0000000149524180 00007ffff39bda78 00007ffff39bda30 000000000000025c 
               GPR20: 0000000000000000 0000000000000003 c0002006f13a0000 0000000000000000 
               GPR24: 0000000000001359 0000000000000000 c0000002f8c96c38 c0000002f8c80000 
               GPR28: 0000000000000000 c0002006f13a0000 c0002006f13a4038 c000000767a17be4 
[  180.176688] NIP [c00800000a883c80] xive_try_pick_queue+0x40/0xb8 [kvm]
[  180.176693] LR [c00800000a886db8] kvmppc_xive_select_target+0x100/0x210 [kvm]
[  180.176694] Call Trace:
[  180.176696] [c000000767a17b20] [c000000767a17b70] 0xc000000767a17b70 (unreliable)
[  180.176701] [c000000767a17b70] [c00800000a88b420] kvmppc_xive_native_set_attr+0xf98/0x1760 [kvm]
[  180.176705] [c000000767a17cc0] [c00800000a86392c] kvm_device_ioctl+0xf4/0x180 [kvm]
[  180.176710] [c000000767a17d10] [c0000000005380b0] do_vfs_ioctl+0xaa0/0xd90
[  180.176712] [c000000767a17dd0] [c000000000538464] sys_ioctl+0xc4/0x110
[  180.176716] [c000000767a17e20] [c00000000000b9d0] system_call+0x5c/0x68
[  180.176717] Instruction dump:
[  180.176719] 794ad182 0b0a0000 2c290000 41820080 89490010 2c0a0000 41820074 78883664 
[  180.176723] 7d094214 e9480070 7d470074 78e7d182 <0b070000> 2c2a0000 41820054 81480078 
[  180.176727] ---[ end trace 056a6dd275e20684 ]---

Let me know if I can provide more information.

Thanks




* Re: QEMU/KVM snapshot restore bug
  2020-02-11  3:57 QEMU/KVM snapshot restore bug dftxbs3e
@ 2020-02-12 11:05 ` Greg Kurz
  2020-02-12 17:59   ` dftxbs3e
  2020-02-16 18:16 ` Cédric Le Goater
  1 sibling, 1 reply; 6+ messages in thread
From: Greg Kurz @ 2020-02-12 11:05 UTC (permalink / raw)
  To: dftxbs3e; +Cc: linuxppc-dev, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5340 bytes --]

On Tue, 11 Feb 2020 04:57:52 +0100
dftxbs3e <dftxbs3e@free.fr> wrote:

> Hello,
> 
> I took a snapshot of a ppc64 (big endian) VM from a ppc64 (little endian) host using `virsh snapshot-create-as --domain <name> --name <name>`
> 

A big endian guest doing XIVE ?!? I'm pretty sure we didn't do much testing, if
any, on such a setup... What distro is used in the VM ?

> Then I restarted my system and tried restoring the snapshot:
> 
> # virsh snapshot-revert --domain <name> --snapshotname <name>
> error: internal error: process exited while connecting to monitor: 2020-02-11T03:18:08.110582Z qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: Group 3 attr 0x0000000000001309: Device or resource busy
> 2020-02-11T03:18:08.110605Z qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr'
> 2020-02-11T03:18:08.112843Z qemu-system-ppc64: Error -1 while loading VM state
> 

This indicates that QEMU failed to configure the source targeting
for the HW interrupt 0x1309, which is an MSI interrupt used by
a PCI device plugged in the default PHB. Specifically, -EBUSY means:

    -EBUSY:  No CPU available to serve interrupt
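
The error line can be decoded mechanically. The sketch below, in Python,
splits it into the device-attribute group, the attribute value (here the
interrupt number) and the errno text; "Device or resource busy" is the usual
Linux message for EBUSY. The group-name table is an assumption based on the
XIVE native KVM device definitions in arch/powerpc/include/uapi/asm/kvm.h and
should be checked against the headers of the running kernel. `decode` and
`XIVE_GROUPS` are hypothetical names used only for illustration.

```python
import re

# Assumption: group numbers of the XIVE native KVM device, as defined in
# arch/powerpc/include/uapi/asm/kvm.h. Verify against your kernel headers.
XIVE_GROUPS = {
    1: "KVM_DEV_XIVE_GRP_CTRL",
    2: "KVM_DEV_XIVE_GRP_SOURCE",
    3: "KVM_DEV_XIVE_GRP_SOURCE_CONFIG",
    4: "KVM_DEV_XIVE_GRP_EQ_CONFIG",
    5: "KVM_DEV_XIVE_GRP_SOURCE_SYNC",
}

# Matches the message format printed by QEMU in the report above.
ERROR_RE = re.compile(
    r"KVM_SET_DEVICE_ATTR failed: Group (\d+) attr (0x[0-9a-fA-F]+): (.+)$"
)


def decode(line):
    """Split a KVM_SET_DEVICE_ATTR failure line into its fields."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    group = int(match.group(1))
    return {
        "group": group,
        "group_name": XIVE_GROUPS.get(group, "unknown"),
        "irq": int(match.group(2), 16),  # the attr value is the interrupt number
        "message": match.group(3),
    }


LINE = (
    "2020-02-11T03:18:08.110582Z qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: "
    "Group 3 attr 0x0000000000001309: Device or resource busy"
)
info = decode(LINE)
print(info["group_name"], hex(info["irq"]), info["message"])
```
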

> And dmesg shows each time the restore command is executed:
> 
> [  180.176606] WARNING: CPU: 16 PID: 5528 at arch/powerpc/kvm/book3s_xive.c:345 xive_try_pick_queue+0x40/0xb8 [kvm]

This warning means that we have a vCPU without a configured event queue.

Since kvmppc_xive_select_target() is trying all vCPUs before bailing out
with -EBUSY, you might be seeing several WARNINGs (1 per vCPU) in dmesg,
correct ?

Anyway, this looks wrong since QEMU is supposed to have already configured
the event queues at this point... Not sure what's happening here...
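
A quick way to answer the question above is to count the WARNING headers in
the dmesg output. A minimal sketch in Python, assuming the header format shown
in the report; `count_xive_warnings` is a hypothetical helper.

```python
import re

# One header line is printed per WARN; the NIP and call-trace lines that also
# mention xive_try_pick_queue must not be counted.
WARN_RE = re.compile(r"WARNING: CPU: \d+ PID: \d+ at \S+ xive_try_pick_queue\+")


def count_xive_warnings(dmesg_text):
    """Count xive_try_pick_queue WARNING headers in a dmesg dump."""
    return sum(1 for line in dmesg_text.splitlines() if WARN_RE.search(line))


# Lines taken from the report; only the first one is a WARNING header.
SAMPLE = """\
[  180.176606] WARNING: CPU: 16 PID: 5528 at arch/powerpc/kvm/book3s_xive.c:345 xive_try_pick_queue+0x40/0xb8 [kvm]
[  180.176688] NIP [c00800000a883c80] xive_try_pick_queue+0x40/0xb8 [kvm]
[  180.176727] ---[ end trace 056a6dd275e20684 ]---
"""
print(count_xive_warnings(SAMPLE))
```

On a live host the input could come from
`subprocess.run(["dmesg"], capture_output=True, text=True).stdout`.
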

> 
> Let me know if I can provide more information

Yes: the QEMU command line, QEMU version and guest kernel version would help.
Also, what kind of workload is running inside the guest? Is this easy to
reproduce?

Cheers,

--
Greg

> 
> Thanks
> 



* Re: QEMU/KVM snapshot restore bug
  2020-02-12 11:05 ` Greg Kurz
@ 2020-02-12 17:59   ` dftxbs3e
  0 siblings, 0 replies; 6+ messages in thread
From: dftxbs3e @ 2020-02-12 17:59 UTC (permalink / raw)
  To: Greg Kurz; +Cc: linuxppc-dev, Cédric Le Goater

[-- Attachment #1.1: Type: text/plain, Size: 3912 bytes --]

Hello,
> A big endian guest doing XIVE ?!? I'm pretty sure we didn't do much testing, if
> any, on such a setup... What distro is used in the VM ?
A live Void Linux ISO:
https://repo.voidlinux-ppc.org/live/current/void-live-ppc64-20190901.iso
> This indicates that QEMU failed to configure the source targeting
> for the HW interrupt 0x1309, which is an MSI interrupt used by
> a PCI device plugged in the default PHB. Specifically, -EBUSY means:
>
>     -EBUSY:  No CPU available to serve interrupt
>
Okay.
> This warning means that we have a vCPU without a configured event queue.
>
> Since kvmppc_xive_select_target() is trying all vCPUs before bailing out
> with -EBUSY, you might be seeing several WARNINGs (1 per vCPU) in dmesg,
> correct ?
>
> Anyway, this looks wrong since QEMU is supposed to have already configured
> the event queues at this point... Not sure what's happening here...
>
Indeed. There are (VM core count + 1) such messages in dmesg.
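
A count of (vCPUs + 1) would be consistent with the target selection trying
the requested vCPU first and then walking every vCPU before giving up. The toy
model below, in Python, illustrates that reading. It is not the kernel code:
the try-the-preferred-vCPU-first step, the class and function names, and the
ENXIO return value are all assumptions made for illustration.

```python
import errno


class VCpu:
    """Toy vCPU: only tracks whether its event queue has been configured."""

    def __init__(self, server, queue_configured):
        self.server = server
        self.queue_configured = queue_configured


def try_pick_queue(vcpu, warnings):
    """Model of the per-vCPU check: warn and fail if no queue is configured."""
    if not vcpu.queue_configured:
        warnings.append(f"WARNING: vCPU {vcpu.server} has no event queue")
        return -errno.ENXIO  # error value chosen for the model
    return 0


def select_target(vcpus, preferred):
    """Return (rc, server, warnings) for an interrupt aimed at `preferred`."""
    warnings = []
    # Assumption: the requested vCPU is tried once before the full scan.
    if try_pick_queue(vcpus[preferred], warnings) == 0:
        return 0, preferred, warnings
    # Fall back to every vCPU in turn, including the preferred one again.
    for vcpu in vcpus:
        if try_pick_queue(vcpu, warnings) == 0:
            return 0, vcpu.server, warnings
    # No vCPU can serve the interrupt.
    return -errno.EBUSY, None, warnings


# Eight vCPUs, none with a configured event queue, as in the failing restore.
guest = [VCpu(i, False) for i in range(8)]
rc, server, warnings = select_target(guest, preferred=0)
print(rc == -errno.EBUSY, len(warnings))  # prints "True 9"
```
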
> Yeah, QEMU command line, QEMU version, guest kernel version can help. Also,
> what kind of workload is running inside the guest ? Is this easy to reproduce ?

/usr/bin/qemu-system-ppc64 -name guest=voidlinux-ppc64,debug-threads=on
-S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-voidlinux-ppc64/master-key.aes
-machine pseries-4.1,accel=kvm,usb=off,dump-guest-core=off -m 8192
-overcommit mem-lock=off -smp 8,sockets=8,cores=1,threads=1 -uuid
5dd7af48-f00d-43c1-86ed-df5e0f7b4f1c -no-user-config -nodefaults
-chardev socket,id=charmonitor,fd=41,server,nowait -mon
chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
-boot strict=on -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.0,addr=0x2
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=/var/lib/libvirt/images/voidlinux-ppc64.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive
file=/home/jdoe/Downloads/void-live-ppc64-20190901.iso,format=raw,if=none,id=drive-scsi0-0-0-0,readonly=on
-device
scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,device_id=drive-scsi0-0-0-0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2
-netdev tap,fd=43,id=hostnet0,vhost=on,vhostfd=44 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:ae:d7:62,bus=pci.0,addr=0x1
-chardev pty,id=charserial0 -device
spapr-vty,chardev=charserial0,id=serial0,reg=0x30000000 -chardev
socket,id=charchannel0,fd=45,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
-device usb-tablet,id=input0,bus=usb.0,port=1 -device
usb-kbd,id=input1,bus=usb.0,port=2 -vnc 127.0.0.1:2 -device
VGA,id=video0,vgamem_mb=16,bus=pci.0,addr=0x8 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -object
rng-random,id=objrng0,filename=/dev/urandom -device
virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 -loadvm
guix-gentoo -sandbox
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
-msg timestamp=on

I am using virt-manager, which is why the command line is so long.

And:

$ qemu-system-ppc64 --version
QEMU emulator version 4.1.1 (qemu-4.1.1-1.fc31)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

As for the workload: at snapshot time the VM was idle; I had been compiling
software using a Gentoo ppc64 big endian chroot inside the Void Linux ppc64
big endian headless live system.

And yes, it is easy to reproduce: download that Void Linux ppc64 big
endian ISO, use virt-manager to create a ppc64 VM with a disk, and set the
VM to 8192 MB of RAM and 8 cores (less RAM and fewer cores might work,
untested). That should reproduce the issue. A VM with 1 core and 512 MB of
RAM seems to snapshot and restore without any issue.

Thanks!




* Re: QEMU/KVM snapshot restore bug
  2020-02-11  3:57 QEMU/KVM snapshot restore bug dftxbs3e
  2020-02-12 11:05 ` Greg Kurz
@ 2020-02-16 18:16 ` Cédric Le Goater
  2020-02-17  2:48   ` dftxbs3e
  1 sibling, 1 reply; 6+ messages in thread
From: Cédric Le Goater @ 2020-02-16 18:16 UTC (permalink / raw)
  To: dftxbs3e, linuxppc-dev, Greg Kurz

On 2/11/20 4:57 AM, dftxbs3e wrote:
> Hello,
> 
> I took a snapshot of a ppc64 (big endian) VM from a ppc64 (little endian) host using `virsh snapshot-create-as --domain <name> --name <name>`
>
> Then I restarted my system and tried restoring the snapshot:
> 
> # virsh snapshot-revert --domain <name> --snapshotname <name>
> error: internal error: process exited while connecting to monitor: 2020-02-11T03:18:08.110582Z qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: Group 3 attr 0x0000000000001309: Device or resource busy
> 2020-02-11T03:18:08.110605Z qemu-system-ppc64: error while loading state for instance 0x0 of device 'spapr'
> 2020-02-11T03:18:08.112843Z qemu-system-ppc64: Error -1 while loading VM state
> 
> And dmesg shows each time the restore command is executed:
> 
> [  180.176606] WARNING: CPU: 16 PID: 5528 at arch/powerpc/kvm/book3s_xive.c:345 xive_try_pick_queue+0x40/0xb8 [kvm]
> 
> Let me know if I can provide more information

I think this is fixed by commit f55750e4e4fb ("spapr/xive: Mask the EAS when 
allocating an IRQ") which is not in QEMU 4.1.1. The same problem should also 
occur with LE guests. 

Could you possibly regenerate the QEMU rpm with this patch ? 
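
Whether a given QEMU tree already carries the fix can be checked with git. A
minimal sketch in Python, assuming a local clone of the QEMU repository and
that releases are tagged like v4.1.1 (both assumptions). `contains_commit` and
its `run` parameter are hypothetical; `run` exists only so the function can be
exercised without a checkout.

```python
import subprocess


def contains_commit(repo, commit, ref, run=subprocess.run):
    """Return True if `commit` is an ancestor of `ref` in the clone at `repo`."""
    # `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B,
    # 1 when it is not, and with another status on errors (e.g. unknown ref).
    result = run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref],
        check=False,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError(f"git failed with status {result.returncode}")
    return result.returncode == 0
```

For example, `contains_commit("/path/to/qemu", "f55750e4e4fb", "v4.1.1")`
would be expected to return False if, as stated above, the commit is not in
QEMU 4.1.1.
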

Thanks,

C.


* Re: QEMU/KVM snapshot restore bug
  2020-02-16 18:16 ` Cédric Le Goater
@ 2020-02-17  2:48   ` dftxbs3e
  2020-02-17 15:20     ` Cédric Le Goater
  0 siblings, 1 reply; 6+ messages in thread
From: dftxbs3e @ 2020-02-17  2:48 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev, Greg Kurz

[-- Attachment #1.1: Type: text/plain, Size: 642 bytes --]

On 2/16/20 7:16 PM, Cédric Le Goater wrote:
>
> I think this is fixed by commit f55750e4e4fb ("spapr/xive: Mask the EAS when 
> allocating an IRQ") which is not in QEMU 4.1.1. The same problem should also 
> occur with LE guests. 
>
> Could you possibly regenerate the QEMU rpm with this patch ? 
>
> Thanks,
>
> C.

Hello!

I applied the patch and reinstalled the RPM, then tried to restore the
snapshot I created previously, and it threw the same error.

Do I need to re-create the snapshot and/or restart the machine? I have
important workloads running so that'll be possible only in a few days if
needed.

Thanks




* Re: QEMU/KVM snapshot restore bug
  2020-02-17  2:48   ` dftxbs3e
@ 2020-02-17 15:20     ` Cédric Le Goater
  0 siblings, 0 replies; 6+ messages in thread
From: Cédric Le Goater @ 2020-02-17 15:20 UTC (permalink / raw)
  To: dftxbs3e, linuxppc-dev, Greg Kurz

On 2/17/20 3:48 AM, dftxbs3e wrote:
> On 2/16/20 7:16 PM, Cédric Le Goater wrote:
>>
>> I think this is fixed by commit f55750e4e4fb ("spapr/xive: Mask the EAS when 
>> allocating an IRQ") which is not in QEMU 4.1.1. The same problem should also 
>> occur with LE guests. 
>>
>> Could you possibly regenerate the QEMU rpm with this patch ? 
>>
>> Thanks,
>>
>> C.
> 
> Hello!
> 
> I applied the patch and reinstalled the RPM then tried to restore the
> snapshot I created previously and it threw the same error.
> 
> Do I need to re-create the snapshot and/or restart the machine? 

Yes. The problem is at the source, so the snapshot needs to be re-created
with the patched QEMU.

> I have
> important workloads running so that'll be possible only in a few days if
> needed.
OK. 

Thanks,

C. 
