* [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-14 21:56 UTC, Fernando Casas Schössow
To: qemu-devel

Hi there,

I recently migrated a Hyper-V host to qemu/kvm running on Alpine Linux 3.6.1 (kernel 4.9.30, with grsec patches, and qemu 2.8.1).

Almost on a daily basis at least one of the guests logs the following error and then needs to be terminated and restarted to recover:

qemu-system-x86_64: Virtqueue size exceeded

It's not always the same guest, and the error appears for both Linux (CentOS 7.3) and Windows (2012 R2) guests.
As soon as this error appears the guest stops really working. It may respond to ping, and you can even try to log in, but then everything is very slow or completely unresponsive. Restarting the guest from within the guest OS doesn't work either; the only thing I can do is terminate it (virsh destroy) and start it again until the next failure.

In Windows guests the error seems to be related to disk:

"Reset to device, \Device\RaidPort2, was issued" (source: viostor)

And in Linux guests the error is always (with the process and pid changing):

INFO: task <process>:<pid> blocked for more than 120 seconds

But unfortunately I was not able to find any other indication of a problem in the guest logs nor in the host logs, except for the error regarding the virtqueue size. The problem happens at different times of day and I couldn't find any pattern yet.

All the Windows guests are using virtio drivers version 126, and all Linux guests are CentOS 7.3 using the latest kernel available in the distribution (3.10.0-514.21.1). They all run qemu-guest-agent as well.
All the guest disks are qcow2 images with cache=none and aio=threads (tried aio=native before, but with the same results).
Example qemu command for a Linux guest:

/usr/bin/qemu-system-x86_64 -name guest=DOCKER01,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-24-DOCKER01/master-key.aes -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+osxsave,+arat,+xsaveopt -drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/DOCKER01_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 4705b146-3b14-4c20-923c-42105d47e7fc -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-24-DOCKER01/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/storage/storage-ssd-vms/virtual_machines_ssd/docker01.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=35,id=hostnet0,vhost=on,vhostfd=45 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:af:ce,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-24-DOCKER01/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5905,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -msg timestamp=on

For what it's worth, the same guests were working fine for years on Hyper-V on the same hardware (Intel Xeon E3, 32GB RAM, Supermicro mainboard, 6x3TB Western Digital Red disks and 6x120GB Kingston V300 SSDs, all connected to an LSI LSISAS2008 controller).
Except for this stability issue, which I hope to solve, everything else is working great and outperforming Hyper-V.

Any ideas, thoughts or suggestions to try to narrow down the problem?

Thanks in advance, and sorry for the long email, but I wanted to be as descriptive as possible.

Fer
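As a stopgap until the root cause is found, the destroy/start cycle described above could be automated. A rough sketch (not from the thread; the log location assumes libvirt's usual default under /var/log/libvirt/qemu, and `recover_guest` is a hypothetical helper name):

```shell
# Stopgap watchdog sketch: detect the virtqueue error in the guest's QEMU
# log and bounce the guest with virsh. Paths and names are assumptions.

# Return success if the given log file contains the virtqueue error.
guest_wedged() {
    grep -q "Virtqueue size exceeded" "$1"
}

# Destroy and restart the named guest if its log shows the error.
recover_guest() {
    guest=$1
    log=/var/log/libvirt/qemu/$guest.log
    if guest_wedged "$log"; then
        virsh destroy "$guest" && virsh start "$guest"
    fi
}

# Example (run periodically from cron): recover_guest DOCKER01
```

Note this loses whatever state the guest had, exactly like the manual virsh destroy; it only shortens the downtime window.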
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-16 6:58 UTC, Ladi Prosek
To: Fernando Casas Schössow; +Cc: qemu-devel

Hi,

On Wed, Jun 14, 2017 at 11:56 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote:
> Hi there,
>
> I recently migrated a Hyper-V host to qemu/kvm running on Alpine Linux 3.6.1 (kernel 4.9.30, with grsec patches, and qemu 2.8.1).
>
> Almost on a daily basis at least one of the guests logs the following error and then needs to be terminated and restarted to recover:
>
> qemu-system-x86_64: Virtqueue size exceeded
[...]
> Any ideas, thoughts or suggestions to try to narrow down the problem?

Would you be able to enhance the error message and rebuild QEMU?
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -856,7 +856,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
     max = vq->vring.num;
 
     if (vq->inuse >= vq->vring.num) {
-        virtio_error(vdev, "Virtqueue size exceeded");
+        virtio_error(vdev, "Virtqueue %u device %s size exceeded", vq->queue_index, vdev->name);
         goto done;
     }

This would at least confirm the theory that it's caused by virtio-blk-pci.

If rebuilding is not feasible, I would start by removing other virtio devices -- particularly balloon, which has had quite a few virtio-related bugs fixed recently.

Does your environment involve VM migrations or saving/resuming, or does the crashing QEMU process always run the VM from its boot?

Thanks!

> Thanks in advance, and sorry for the long email, but I wanted to be as descriptive as possible.
>
> Fer
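Applying the diagnostic diff means rebuilding QEMU from source. A rough sketch of one way to do that, assuming the patch above has been saved locally as virtqueue-error.patch (on Alpine, patching the qemu aport and rebuilding with abuild would be the cleaner route):

```shell
# Hypothetical rebuild of qemu 2.8.1 with the diagnostic patch applied.
# The tarball URL is qemu's official download site; the patch file name
# is an assumption.
wget https://download.qemu.org/qemu-2.8.1.tar.xz
tar xf qemu-2.8.1.tar.xz
cd qemu-2.8.1
patch -p1 < ../virtqueue-error.patch      # the one-line change above
./configure --target-list=x86_64-softmmu  # build only the x86_64 emulator
make -j"$(nproc)"
# resulting binary: x86_64-softmmu/qemu-system-x86_64
```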
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-16 10:11 UTC, Fernando Casas Schössow
To: Ladi Prosek; +Cc: qemu-devel

Hi Ladi,

Thanks a lot for looking into this and replying.
I will do my best to rebuild and deploy Alpine's qemu packages with this patch included, but I'm not sure it's feasible yet.
In any case, would it be possible to have this patch included in the next qemu release? The current error message is helpful, but knowing which device was involved will be much more helpful.

Regarding the environment: I'm not doing migrations, and a managed save is only done in case the host needs to be rebooted or shut down. The QEMU process has been running the VM since the host was started, and this failure is occurring randomly without any previous managed save.

As part of troubleshooting, on one of the guests I switched from virtio_blk to virtio_scsi for the guest disks, but I will need more time to see if that helped.
If I have this problem again I will follow your advice and remove virtio_balloon.

Another question: is there any way to monitor the virtqueue size, either from the guest itself or from the host? Any file in sysfs or proc?
This may help to understand in which conditions this is happening and to react faster to mitigate the problem.

Thanks again for your help with this!

Fer

On vie, jun 16, 2017 at 8:58, Ladi Prosek <lprosek@redhat.com> wrote:
> Hi,
>
> Would you be able to enhance the error message and rebuild QEMU?
[...]
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-16 10:25 UTC, Ladi Prosek
To: Fernando Casas Schössow; +Cc: qemu-devel

On Fri, Jun 16, 2017 at 12:11 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote:
> Hi Ladi,
>
> Thanks a lot for looking into this and replying.
> I will do my best to rebuild and deploy Alpine's qemu packages with this
> patch included but not sure it's feasible yet.
> In any case, would it be possible to have this patch included in the next
> qemu release?

Yes, I have already added this to my todo list.

> As part of troubleshooting on one of the guests I switched from virtio_blk
> to virtio_scsi for the guest disks but I will need more time to see if that
> helped.
> If I have this problem again I will follow your advice and remove
> virtio_balloon.

Thanks, please keep us posted.

> Another question: is there any way to monitor the virtqueue size either from
> the guest itself or from the host? Any file in sysfs or proc?
> This may help to understand in which conditions this is happening and to
> react faster to mitigate the problem.

The problem is not in the virtqueue size but in one piece of internal state ("inuse") which is meant to track the number of buffers "checked out" by QEMU. It's being compared to the virtqueue size merely as a sanity check. I'm afraid that there's no way to expose this variable without rebuilding QEMU.

The best you could do is attach gdb to the QEMU process and use some clever data access breakpoints to catch suspicious writes to the variable. Although it's likely that it just creeps up slowly and you won't see anything interesting. It's probably beyond reasonable at this point anyway. I would continue with the elimination process (virtio_scsi instead of virtio_blk, no balloon, etc.) and then, maybe, once we know which device it is, we can add some instrumentation to the code.
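The gdb approach described here could look roughly like the sketch below. The `inuse` field of QEMU's VirtQueue is real (it appears in the patch above), but the address is a placeholder that would first have to be found, e.g. by breaking in virtqueue_pop and printing `vq`; this assumes a QEMU binary with debug symbols:

```shell
# Sketch only: attach to the running QEMU and set a hardware watchpoint
# on writes to vq->inuse. 0xADDR is a placeholder for the VirtQueue
# address obtained beforehand (e.g. "break virtqueue_pop" then "print vq").
gdb -p "$(pidof qemu-system-x86_64)" \
    -ex 'set pagination off' \
    -ex 'watch -l ((VirtQueue *)0xADDR)->inuse' \
    -ex 'continue'
```

Note that the guest is paused while gdb holds the process stopped, and each watchpoint hit stops it again, so this is only practical for short debugging sessions.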
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-19 22:10 UTC, Fernando Casas Schössow
To: Ladi Prosek; +Cc: qemu-devel

Hi Ladi,

Today two guests failed again, at different times of day.
One of them was the one I switched from virtio_blk to virtio_scsi, so this change didn't solve the problem.
In this guest I have now also disabled virtio_balloon, continuing with the elimination process.

Also, this time I found a different error message in the guest console. In the guest already switched to virtio_scsi:

virtio_scsi virtio2: request:id 44 is not a head!

Followed by the usual "task blocked for more than 120 seconds." error.

On the guest still running on virtio_blk the error was similar:

virtio_blk virtio2: req.0:id 42 is not a head!
blk_update_request: I/O error, dev vda, sector 645657736
Buffer I/O error on dev dm-1, logical block 7413821, lost async page write

Followed by the usual "task blocked for more than 120 seconds." error.

Do you think that the blk_update_request and the buffer I/O errors may be a consequence of the previous "is not a head!" error, or should I be worried about a storage-level issue here?

Now I will wait to see if disabling virtio_balloon helps or not and report back.

Thanks.

Fer

On vie, jun 16, 2017 at 12:25, Ladi Prosek <lprosek@redhat.com> wrote:
> The problem is not in the virtqueue size but in one piece of internal state
> ("inuse") which is meant to track the number of buffers "checked out" by
> QEMU.
[...]
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-20 5:59 UTC, Ladi Prosek
To: Fernando Casas Schössow; +Cc: qemu-devel

Hi Fernando,

On Tue, Jun 20, 2017 at 12:10 AM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote:
> Today two guests failed again at different times of day.
> One of them was the one I switched from virtio_blk to virtio_scsi so this
> change didn't solve the problem.
[...]
> virtio_blk virtio2: req.0:id 42 is not a head!
> blk_update_request: I/O error, dev vda, sector 645657736
> Buffer I/O error on dev dm-1, logical block 7413821, lost async page write
>
> Followed by the usual "task blocked for more than 120 seconds." error.

Honestly, this is starting to look more and more like memory corruption. Two different virtio devices and two different guest operating systems -- that would have to be a bug in the common virtio code, and we would have seen it somewhere else already. Would it be possible to run a thorough memtest on the host, just in case?

> Do you think that the blk_update_request and the buffer I/O error may be a
> consequence of the previous "is not a head!" error or should I be worried
> for a storage level issue here?
>
> Now I will wait to see if disabling virtio_balloon helps or not and report
> back.
>
> Thanks.
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error
@ 2017-06-20 6:30 UTC, Fernando Casas Schössow
To: Ladi Prosek; +Cc: qemu-devel

Hi Ladi,

In this case both guests are CentOS 7.3 running the same kernel, 3.10.0-514.21.1.
Also, the guest that fails most frequently is running Docker with 4 or 5 containers.

Another thing I would like to mention is that the host is running on Alpine's default grsec-patched kernel. I also have the option to install a vanilla kernel. Would it make sense to switch to the vanilla kernel on the host and see if that helps?

And last but not least, KSM is enabled on the host. Should I disable it?

Following your advice I will run memtest on the host and report back. Just as a side comment, the host is running on ECC memory.

Thanks for all your help.

Fer.

On mar, jun 20, 2017 at 7:59, Ladi Prosek <lprosek@redhat.com> wrote:
> Honestly this is starting to look more and more like a memory corruption.
> Two different virtio devices and two different guest operating systems, that
> would have to be a bug in the common virtio code and we would have seen it
> somewhere else already. Would it be possible to run a thorough memtest on
> the host just in case?
[...]
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error 2017-06-20 6:30 ` Fernando Casas Schössow @ 2017-06-20 7:52 ` Ladi Prosek 2017-06-21 12:19 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Ladi Prosek @ 2017-06-20 7:52 UTC (permalink / raw) To: Fernando Casas Schössow; +Cc: qemu-devel On Tue, Jun 20, 2017 at 8:30 AM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote: > Hi Ladi, > > In this case both guests are CentOS 7.3 running the same kernel > 3.10.0-514.21.1. > Also the guest that fails most frequently is running Docker with 4 or 5 > containers. > > Another thing I would like to mention is that the host is running on > Alpine's default grsec patched kernel. I also have the option to install a > vanilla kernel. Would it make sense to switch to the vanilla kernel on the > host and see if that helps? The host kernel is less likely to be responsible for this, in my opinion. I'd hold off on that for now. > And last but not least KSM is enabled on the host. Should I disable it? Could be worth a try. > Following your advice I will run memtest on the host and report back. Just > as a side comment, the host is running on ECC memory. I see. Would it be possible for you, once a guest is in the broken state, to make it available for debugging? By attaching gdb to the QEMU process for example and letting me poke around it remotely? Thanks! > Thanks for all your help. > > Fer. > > On mar, jun 20, 2017 at 7:59 , Ladi Prosek <lprosek@redhat.com> wrote: > > Hi Fernando, On Tue, Jun 20, 2017 at 12:10 AM, Fernando Casas Schössow > <casasfernando@hotmail.com> wrote: > > Hi Ladi, Today two guests failed again at different times of day. One of > them was the one I switched from virtio_blk to virtio_scsi so this change > didn't solve the problem. Now in this guest I also disabled virtio_balloon, > continuing with the elimination process. Also this time I found a different > error message in the guest console. 
In the guest already switched to > virtio_scsi: virtio_scsi virtio2: request:id 44 is not a head! Followed by > the usual "task blocked for more than 120 seconds." error. On the guest > still running on virtio_blk the error was similar: virtio_blk virtio2: > req.0:id 42 is not a head! blk_update_request: I/O error, dev vda, sector > 645657736 Buffer I/O error on dev dm-1, logical block 7413821, lost async > page write Followed by the usual "task blocked for more than 120 seconds." > error. > > Honestly this is starting to look more and more like a memory corruption. > Two different virtio devices and two different guest operating systems, that > would have to be a bug in the common virtio code and we would have seen it > somewhere else already. Would it be possible to run a thorough memtest on the > host just in case? > > Do you think that the blk_update_request and the buffer I/O error may be a > consequence of the previous "is not a head!" error or should I be worried > about a storage-level issue here? Now I will wait to see if disabling > virtio_balloon helps or not and report back. Thanks. Fer On vie, jun 16, > 2017 at 12:25 , Ladi Prosek <lprosek@redhat.com> wrote: On Fri, Jun 16, 2017 > at 12:11 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote: Hi > Ladi, Thanks a lot for looking into this and replying. I will do my best to > rebuild and deploy Alpine's qemu packages with this patch included but not > sure it's feasible yet. In any case, would it be possible to have this patch > included in the next qemu release? Yes, I have already added this to my todo > list. The current error message is helpful but knowing which device was > involved will be much more helpful. Regarding the environment, I'm not doing > migrations and only managed save is done in case the host needs to be > rebooted or shut down. The QEMU process is running the VM since the host is > started and this failure is occurring randomly without any previous managed > save done. 
As part of troubleshooting on one of the guests I switched from > virtio_blk to virtio_scsi for the guest disks but I will need more time to > see if that helped. If I have this problem again I will follow your advice > and remove virtio_balloon. Thanks, please keep us posted. Another question: > is there any way to monitor the virtqueue size either from the guest itself > or from the host? Any file in sysfs or proc? This may help to understand in > which conditions this is happening and to react faster to mitigate the > problem. The problem is not in the virtqueue size but in one piece of > internal state ("inuse") which is meant to track the number of buffers > "checked out" by QEMU. It's being compared to the virtqueue size merely as a > sanity check. I'm afraid that there's no way to expose this variable without > rebuilding QEMU. The best you could do is attach gdb to the QEMU process and > use some clever data access breakpoints to catch suspicious writes to the > variable. Although it's likely that it just creeps up slowly and you won't > see anything interesting. It's probably beyond reasonable at this point > anyway. I would continue with the elimination process (virtio_scsi instead > of virtio_blk, no balloon, etc.) and then maybe once we know which device it > is, we can add some instrumentation to the code. Thanks again for your help > with this! Fer On vie, jun 16, 2017 at 8:58 , Ladi Prosek > <lprosek@redhat.com> wrote: Hi, Would you be able to enhance the error > message and rebuild QEMU? --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c > @@ -856,7 +856,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = > vq->vring.num; if (vq->inuse >= vq->vring.num) { - virtio_error(vdev, > "Virtqueue size exceeded"); + virtio_error(vdev, "Virtqueue %u device %s > size exceeded", vq->queue_index, vdev->name); goto done; } This would at > least confirm the theory that it's caused by virtio-blk-pci. 
If rebuilding > is not feasible I would start by removing other virtio devices -- > particularly balloon which has had quite a few virtio-related bugs fixed > recently. Does your environment involve VM migrations or saving/resuming, or > does the crashing QEMU process always run the VM from its boot? Thanks! > > > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error 2017-06-20 7:52 ` Ladi Prosek @ 2017-06-21 12:19 ` Fernando Casas Schössow 2017-06-22 7:43 ` Ladi Prosek 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2017-06-21 12:19 UTC (permalink / raw) To: Ladi Prosek; +Cc: qemu-devel Hi Ladi, Sorry for the delay in my reply. I will leave the host kernel alone for now then. For the last 15 hours or so I'm running memtest86+ on the host. So far so good. Two passes, no errors so far. I will try to leave it running for at least another 24hr and report back the results. Hopefully we can rule out a memory issue at the hardware level. Regarding KSM, that will be the next thing I will disable if after removing the balloon device guests still crash. About leaving a guest in a failed state for you to debug it remotely, that's absolutely an option. We just need to coordinate so I can give you remote access to the host and so on. Let me know if any preparation is needed in advance and which tools you need installed on the host. Once again, I would like to thank you for all your help and your great disposition! Cheers, Fer On mar, jun 20, 2017 at 9:52 , Ladi Prosek <lprosek@redhat.com> wrote: The host kernel is less likely to be responsible for this, in my opinion. I'd hold off on that for now. And last but not least KSM is enabled on the host. Should I disable it? Could be worth a try. Following your advice I will run memtest on the host and report back. Just as a side comment, the host is running on ECC memory. I see. Would it be possible for you, once a guest is in the broken state, to make it available for debugging? By attaching gdb to the QEMU process for example and letting me poke around it remotely? Thanks! ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error 2017-06-21 12:19 ` Fernando Casas Schössow @ 2017-06-22 7:43 ` Ladi Prosek 2017-06-23 6:29 ` Fernando Casas Schössow [not found] ` <1498199343.2815.0@smtp-mail.outlook.com> 0 siblings, 2 replies; 55+ messages in thread From: Ladi Prosek @ 2017-06-22 7:43 UTC (permalink / raw) To: Fernando Casas Schössow; +Cc: qemu-devel Hi Fernando, On Wed, Jun 21, 2017 at 2:19 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote: > Hi Ladi, > > Sorry for the delay in my reply. > I will leave the host kernel alone for now then. > For the last 15 hours or so I'm running memtest86+ on the host. So far so > good. Two passes, no errors so far. I will try to leave it running for at > least another 24hr and report back the results. Hopefully we can rule out a > memory issue at the hardware level. > > Regarding KSM, that will be the next thing I will disable if after removing > the balloon device guests still crash. > > About leaving a guest in a failed state for you to debug it remotely, that's > absolutely an option. We just need to coordinate so I can give you remote > access to the host and so on. Let me know if any preparation is needed in > advance and which tools you need installed on the host. I think that gdbserver attached to the QEMU process should be enough. When the VM gets into the broken state please do something like: gdbserver --attach host:12345 <QEMU pid> and let me know the host name and port (12345 in the above example). > Once again, I would like to thank you for all your help and your great > disposition! You're absolutely welcome, I don't think I've done anything helpful so far :) > Cheers, > > Fer > > On mar, jun 20, 2017 at 9:52 , Ladi Prosek <lprosek@redhat.com> wrote: > > The host kernel is less likely to be responsible for this, in my opinion. > I'd hold off on that for now. > > And last but not least KSM is enabled on the host. Should I disable it? > > Could be worth a try. 
> > Following your advice I will run memtest on the host and report back. Just > as a side comment, the host is running on ECC memory. > > I see. Would it be possible for you, once a guest is in the broken state, to > make it available for debugging? By attaching gdb to the QEMU process for > example and letting me poke around it remotely? Thanks! > > > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error 2017-06-22 7:43 ` Ladi Prosek @ 2017-06-23 6:29 ` Fernando Casas Schössow [not found] ` <1498199343.2815.0@smtp-mail.outlook.com> 1 sibling, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2017-06-23 6:29 UTC (permalink / raw) To: Ladi Prosek; +Cc: qemu-devel Hi Ladi, Small update. Memtest86+ was running on the host for more than 54 hours. 8 passes were completed and no memory errors found. For now I think we can assume that the host memory is ok. I just started all the guests one hour ago. I will monitor them and once one fails I will attach the debugger and let you know. Thanks. Fer On jue, jun 22, 2017 at 9:43 , Ladi Prosek <lprosek@redhat.com> wrote: Hi Fernando, On Wed, Jun 21, 2017 at 2:19 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote: Hi Ladi, Sorry for the delay in my reply. I will leave the host kernel alone for now then. For the last 15 hours or so I'm running memtest86+ on the host. So far so good. Two passes, no errors so far. I will try to leave it running for at least another 24hr and report back the results. Hopefully we can rule out a memory issue at the hardware level. Regarding KSM, that will be the next thing I will disable if after removing the balloon device guests still crash. About leaving a guest in a failed state for you to debug it remotely, that's absolutely an option. We just need to coordinate so I can give you remote access to the host and so on. Let me know if any preparation is needed in advance and which tools you need installed on the host. I think that gdbserver attached to the QEMU process should be enough. When the VM gets into the broken state please do something like: gdbserver --attach host:12345 <QEMU pid> and let me know the host name and port (12345 in the above example). Once again, I would like to thank you for all your help and your great disposition! 
You're absolutely welcome, I don't think I've done anything helpful so far :) Cheers, Fer On mar, jun 20, 2017 at 9:52 , Ladi Prosek <lprosek@redhat.com> wrote: The host kernel is less likely to be responsible for this, in my opinion. I'd hold off on that for now. And last but not least KSM is enabled on the host. Should I disable it? Could be worth the try. Following your advice I will run memtest on the host and report back. Just as a side comment, the host is running on ECC memory. I see. Would it be possible for you, once a guest is in the broken state, to make it available for debugging? By attaching gdb to the QEMU process for example and letting me poke around it remotely? Thanks! ^ permalink raw reply [flat|nested] 55+ messages in thread
[parent not found: <1498199343.2815.0@smtp-mail.outlook.com>]
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error [not found] ` <1498199343.2815.0@smtp-mail.outlook.com> @ 2017-06-24 8:34 ` Fernando Casas Schössow 2019-01-31 11:32 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2017-06-24 8:34 UTC (permalink / raw) To: Ladi Prosek; +Cc: qemu-devel Hi Ladi, After running for about 15hrs two different guests (one Windows, one Linux) crashed around one hour apart with the same error in the qemu log: "Virtqueue size exceeded". The Linux guest was already running on virtio_scsi and without virtio_balloon. :( I compiled and attached gdbserver to the qemu process for this guest but when I did this I got the following warning in gdbserver: warning: Cannot call inferior functions, Linux kernel PaX protection forbids return to non-executable pages! The default Alpine kernel is a grsec kernel. Not sure if this will interfere with debugging or not but I suspect yes. If you need me to replace the grsec kernel with a vanilla one (also available as an option in Alpine) let me know and I will do so. Otherwise send me an email directly so I can share with you the host:port details so you can connect to gdbserver. Thanks, Fer On vie, jun 23, 2017 at 8:29 , Fernando Casas Schössow <casasfernando@hotmail.com> wrote: Hi Ladi, Small update. Memtest86+ was running on the host for more than 54 hours. 8 passes were completed and no memory errors found. For now I think we can assume that the host memory is ok. I just started all the guests one hour ago. I will monitor them and once one fails I will attach the debugger and let you know. Thanks. Fer On jue, jun 22, 2017 at 9:43 , Ladi Prosek <lprosek@redhat.com> wrote: Hi Fernando, On Wed, Jun 21, 2017 at 2:19 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote: Hi Ladi, Sorry for the delay in my reply. I will leave the host kernel alone for now then. 
For the last 15 hours or so I'm running memtest86+ on the host. So far so good. Two passes, no errors so far. I will try to leave it running for at least another 24hr and report back the results. Hopefully we can rule out a memory issue at the hardware level. Regarding KSM, that will be the next thing I will disable if after removing the balloon device guests still crash. About leaving a guest in a failed state for you to debug it remotely, that's absolutely an option. We just need to coordinate so I can give you remote access to the host and so on. Let me know if any preparation is needed in advance and which tools you need installed on the host. I think that gdbserver attached to the QEMU process should be enough. When the VM gets into the broken state please do something like: gdbserver --attach host:12345 <QEMU pid> and let me know the host name and port (12345 in the above example). Once again, I would like to thank you for all your help and your great disposition! You're absolutely welcome, I don't think I've done anything helpful so far :) Cheers, Fer On mar, jun 20, 2017 at 9:52 , Ladi Prosek <lprosek@redhat.com> wrote: The host kernel is less likely to be responsible for this, in my opinion. I'd hold off on that for now. And last but not least KSM is enabled on the host. Should I disable it? Could be worth a try. Following your advice I will run memtest on the host and report back. Just as a side comment, the host is running on ECC memory. I see. Would it be possible for you, once a guest is in the broken state, to make it available for debugging? By attaching gdb to the QEMU process for example and letting me poke around it remotely? Thanks! ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error 2017-06-24 8:34 ` Fernando Casas Schössow @ 2019-01-31 11:32 ` Fernando Casas Schössow 2019-02-01 5:48 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2019-01-31 11:32 UTC (permalink / raw) To: qemu-devel; +Cc: qemu-block Hi, Sorry for resurrecting this thread after so long but I just upgraded the host to Qemu 3.1 and libvirt 4.10 and I'm still facing this problem. At the moment, in order to avoid this issue, I cannot use virtio disks (neither virtio-blk nor virtio-scsi) with my guests, so as a workaround I'm using emulated SATA storage, which is not ideal but is perfectly stable. Do you have any suggestions on how I can progress troubleshooting? Qemu is not crashing so I don't have any dumps that can be analyzed. The guest is just "stuck" and all I can do is destroy it and start it again. It's really frustrating that after all this time I couldn't find the cause for this issue so any ideas are welcome. Thanks. Fernando ________________________________ From: Fernando Casas Schössow <casasfernando@hotmail.com> Sent: Saturday, June 24, 2017 10:34 AM To: Ladi Prosek Cc: qemu-devel@nongnu.org Subject: Re: [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error Hi Ladi, After running for about 15hrs two different guests (one Windows, one Linux) crashed around one hour apart with the same error in the qemu log: "Virtqueue size exceeded". The Linux guest was already running on virtio_scsi and without virtio_balloon. :( I compiled and attached gdbserver to the qemu process for this guest but when I did this I got the following warning in gdbserver: warning: Cannot call inferior functions, Linux kernel PaX protection forbids return to non-executable pages! The default Alpine kernel is a grsec kernel. Not sure if this will interfere with debugging or not but I suspect yes. 
If you need me to replace the grsec kernel with a vanilla one (also available as an option in Alpine) let me know and I will do so. Otherwise send me an email directly so I can share with you the host:port details so you can connect to gdbserver. Thanks, Fer On vie, jun 23, 2017 at 8:29 , Fernando Casas Schössow <casasfernando@hotmail.com> wrote: Hi Ladi, Small update. Memtest86+ was running on the host for more than 54 hours. 8 passes were completed and no memory errors found. For now I think we can assume that the host memory is ok. I just started all the guests one hour ago. I will monitor them and once one fails I will attach the debugger and let you know. Thanks. Fer On jue, jun 22, 2017 at 9:43 , Ladi Prosek <lprosek@redhat.com> wrote: Hi Fernando, On Wed, Jun 21, 2017 at 2:19 PM, Fernando Casas Schössow <casasfernando@hotmail.com> wrote: Hi Ladi, Sorry for the delay in my reply. I will leave the host kernel alone for now then. For the last 15 hours or so I'm running memtest86+ on the host. So far so good. Two passes, no errors so far. I will try to leave it running for at least another 24hr and report back the results. Hopefully we can rule out a memory issue at the hardware level. Regarding KSM, that will be the next thing I will disable if after removing the balloon device guests still crash. About leaving a guest in a failed state for you to debug it remotely, that's absolutely an option. We just need to coordinate so I can give you remote access to the host and so on. Let me know if any preparation is needed in advance and which tools you need installed on the host. I think that gdbserver attached to the QEMU process should be enough. When the VM gets into the broken state please do something like: gdbserver --attach host:12345 <QEMU pid> and let me know the host name and port (12345 in the above example). Once again, I would like to thank you for all your help and your great disposition! 
You're absolutely welcome, I don't think I've done anything helpful so far :) Cheers, Fer On mar, jun 20, 2017 at 9:52 , Ladi Prosek <lprosek@redhat.com> wrote: The host kernel is less likely to be responsible for this, in my opinion. I'd hold off on that for now. And last but not least KSM is enabled on the host. Should I disable it? Could be worth the try. Following your advice I will run memtest on the host and report back. Just as a side comment, the host is running on ECC memory. I see. Would it be possible for you, once a guest is in the broken state, to make it available for debugging? By attaching gdb to the QEMU process for example and letting me poke around it remotely? Thanks! ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-01-31 11:32 ` Fernando Casas Schössow @ 2019-02-01 5:48 ` Stefan Hajnoczi 2019-02-01 8:17 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Stefan Hajnoczi @ 2019-02-01 5:48 UTC (permalink / raw) To: Fernando Casas Schössow; +Cc: qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 915 bytes --] On Thu, Jan 31, 2019 at 11:32:32AM +0000, Fernando Casas Schössow wrote: > Sorry for resurrecting this thread after so long but I just upgraded the host to Qemu 3.1 and libvirt 4.10 and I'm still facing this problem. > At the moment I cannot use virtio disks (virtio-blk nor virtio-scsi) with my guests in order to avoid this issue so as a workaround I'm using SATA emulated storage which is not ideal but is perfectly stable. > > Do you have any suggestions on how can I progress troubleshooting? > Qemu is not crashing so I don't have any dumps that can be analyzed. The guest is just "stuck" and all I can do is destroy it and start it again. > It's really frustrating that after all this time I couldn't find the cause for this issue so any ideas are welcome. Hi Fernando, Please post your QEMU command-line (ps aux | grep qemu) and the details of the guest operating system and version. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-01 5:48 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi @ 2019-02-01 8:17 ` Fernando Casas Schössow 2019-02-04 6:06 ` Stefan Hajnoczi 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-01 8:17 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Fernando Casas Schössow, qemu-devel, qemu-block Hi Stefan, Thanks for looking into this. Please find the requested details below. If you need any further details (like the host storage details) just let me know. Host details: CPU: Intel(R) Xeon(R) CPU E3-1230 V2 Memory: 32GB (ECC) OS: Alpine 3.9 Kernel version: 4.19.18-0-vanilla Qemu version: 3.1 Libvirt version: 4.10 Networking: openvswitch-2.10.1 Windows guest: OS: Windows Server 2012 R2 64bit (up to date) Virtio drivers version: 141 (stable) Command line: /usr/bin/qemu-system-x86_64 -name guest=DCHOMENET02,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-17-DCHOMENET02/master-key.aes -machine pc-i440fx-3.1,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,ss=on,vmx=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,xsaveopt=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/DCHOMENET02_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 84afccf7-f164-4d0d-b9c9-c92ffab00848 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=42,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/storage/storage-ssd-vms/virtual_machines_ssd/dchomenet02.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -drive file=/storage/storage-hdd-vms/virtual_machines_hdd/dchomenet02_storage.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-1,cache=none,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1,write-cache=on -netdev tap,fd=44,id=hostnet0,vhost=on,vhostfd=45 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:91:aa:a1,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev socket,id=charchannel1,fd=47,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5905,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on Linux guest: OS: CentOS 7.6 64bit (up to date) Kernel version: 
3.10.0-957.1.3.el7.x86_64 Command line: /usr/bin/qemu-system-x86_64 -name guest=DOCKER01,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-DOCKER01/master-key.aes -machine pc-i440fx-3.1,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,ss=on,vmx=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,xsaveopt=on -drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/DOCKER01_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 4705b146-3b14-4c20-923c-42105d47e7fc -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=44,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/storage/storage-ssd-vms/virtual_machines_ssd/docker01.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -netdev tap,fd=46,id=hostnet0,vhost=on,vhostfd=48 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:af:ce,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=49,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev 
spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -spice port=5904,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on Thanks. Kind regards, Fernando On vie, feb 1, 2019 at 6:48 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: On Thu, Jan 31, 2019 at 11:32:32AM +0000, Fernando Casas Schössow wrote: Sorry for resurrecting this thread after so long but I just upgraded the host to Qemu 3.1 and libvirt 4.10 and I'm still facing this problem. At the moment I cannot use virtio disks (virtio-blk nor virtio-scsi) with my guests in order to avoid this issue so as a workaround I'm using SATA emulated storage which is not ideal but is perfectly stable. Do you have any suggestions on how can I progress troubleshooting? Qemu is not crashing so I don't have any dumps that can be analyzed. The guest is just "stuck" and all I can do is destroy it and start it again. It's really frustrating that after all this time I couldn't find the cause for this issue so any ideas are welcome. Hi Fernando, Please post your QEMU command-line (ps aux | grep qemu) and the details of the guest operating system and version. Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-01 8:17 ` Fernando Casas Schössow @ 2019-02-04 6:06 ` Stefan Hajnoczi 2019-02-04 7:24 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Stefan Hajnoczi @ 2019-02-04 6:06 UTC (permalink / raw) To: Fernando Casas Schössow Cc: Fernando Casas Schössow, qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1129 bytes --] Are you sure this happens with both virtio-blk and virtio-scsi? The following patch adds more debug output. You can build as follows: $ git clone https://git.qemu.org/git/qemu.git $ cd qemu $ patch -p1 ...paste the patch here... ^D # For info on build dependencies see https://wiki.qemu.org/Hosts/Linux $ ./configure --target-list=x86_64-softmmu $ make -j4 You can configure a libvirt domain to use your custom QEMU binary by changing the <devices><emulator> tag to the qemu/x86_64-softmmu/qemu-system-x86_64 path. --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 22bd1ac34e..aa44bffa1f 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -879,6 +879,9 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); virtio_error(vdev, "Virtqueue size exceeded"); goto done; } [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply related [flat|nested] 55+ messages in thread
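One detail of the debug patch worth noting: `(unsigned int)(vq - vdev->vq)` recovers the queue index by pointer subtraction, which works because the device keeps its virtqueues in one contiguous array. A minimal sketch of the idiom, using hypothetical stand-in types rather than the real QEMU structures:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for VirtQueue/VirtIODevice; the only property the
 * debug patch relies on is that the device owns a contiguous array of
 * queues, so (element pointer - array base) yields the queue index. */
typedef struct {
    unsigned vring_num;
    unsigned inuse;
} VirtQueue;

typedef struct {
    VirtQueue vq[8];  /* contiguous queue array */
} VirtIODevice;

static unsigned queue_index(const VirtIODevice *vdev, const VirtQueue *vq)
{
    return (unsigned)(vq - vdev->vq);  /* same idiom as the patch */
}
```

Printing the index this way tells you which virtqueue of the device overflowed (e.g. request queue vs. event queue on virtio-scsi) without threading any extra state through the call site.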
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-04 6:06 ` Stefan Hajnoczi @ 2019-02-04 7:24 ` Fernando Casas Schössow [not found] ` <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlo ok.com> [not found] ` <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlo> 0 siblings, 2 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-04 7:24 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1858 bytes --] I can test again with qemu 3.1 but with previous versions yes, it was happening the same with both virtio-blk and virtio-scsi. For 3.1 I can confirm it happens for virtio-scsi (already tested it) and I can test with virtio-blk again if that will add value to the investigation. Also I'm attaching a guest console screenshot showing the errors displayed by the guest when it goes unresponsive in case it can help. Thanks for the patch. I will build the custom qemu binary and reproduce the issue. This may take a couple of days since I cannot reproduce it at will. Sometimes it takes 12 hours sometimes 2 days until it happens. Hopefully the code below will add more light on to this problem. Thanks, Fernando On lun, feb 4, 2019 at 7:06 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: Are you sure this happens with both virtio-blk and virtio-scsi? The following patch adds more debug output. You can build as follows: $ git clone https://git.qemu.org/git/qemu.git $ cd qemu $ patch apply -p1 ...paste the patch here... ^D # For info on build dependencies see https://wiki.qemu.org/Hosts/Linux $ ./configure --target-list=x86_64-softmmu $ make -j4 You can configure a libvirt domain to use your custom QEMU binary by changing the <devices><emulator> tag to the qemu/x86_64-softmmu/qemu-system-x86_64 path. 
--- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 22bd1ac34e..aa44bffa1f 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -879,6 +879,9 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); virtio_error(vdev, "Virtqueue size exceeded"); goto done; } [-- Attachment #2: libvirt-php-tmp-wuYYkI.png --] [-- Type: image/png, Size: 4382 bytes --] ^ permalink raw reply [flat|nested] 55+ messages in thread
[parent not found: <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlook.com>]
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error [not found] ` <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlo ok.com> @ 2019-02-06 7:15 ` Fernando Casas Schössow 0 siblings, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-06 7:15 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block I can now confirm that the same happens with virtio-blk and virtio-scsi. Please find below the qemu log enhanced with the new information added by the patch provided by Stefan: vdev 0x55d22b8e10f0 ("virtio-blk") vq 0x55d22b8ebe40 (idx 0) inuse 128 vring.num 128 2019-02-06T00:40:41.742552Z qemu-system-x86_64: Virtqueue size exceeded I just changed the disk back to virtio-scsi so I can repro this again with the patched qemu and report back. Thanks. On lun, feb 4, 2019 at 8:24 AM, Fernando Casas Schössow <casasfernando@outlook.com> wrote: I can test again with qemu 3.1 but with previous versions yes, it was happening the same with both virtio-blk and virtio-scsi. For 3.1 I can confirm it happens for virtio-scsi (already tested it) and I can test with virtio-blk again if that will add value to the investigation. Also I'm attaching a guest console screenshot showing the errors displayed by the guest when it goes unresponsive in case it can help. Thanks for the patch. I will build the custom qemu binary and reproduce the issue. This may take a couple of days since I cannot reproduce it at will. Sometimes it takes 12 hours sometimes 2 days until it happens. Hopefully the code below will add more light on to this problem. Thanks, Fernando On lun, feb 4, 2019 at 7:06 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: Are you sure this happens with both virtio-blk and virtio-scsi? The following patch adds more debug output. You can build as follows: $ git clone https://git.qemu.org/git/qemu.git $ cd qemu $ patch apply -p1 ...paste the patch here... 
^D # For info on build dependencies see https://wiki.qemu.org/Hosts/Linux $ ./configure --target-list=x86_64-softmmu $ make -j4 You can configure a libvirt domain to use your custom QEMU binary by changing the <devices><emulator> tag to the qemu/x86_64-softmmu/qemu-system-x86_64 path. --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 22bd1ac34e..aa44bffa1f 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -879,6 +879,9 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); virtio_error(vdev, "Virtqueue size exceeded"); goto done; } ^ permalink raw reply [flat|nested] 55+ messages in thread
[parent not found: <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlook.com>]
[parent not found: <VI1PR0602MB3245032D51A5DF45AF6E1952A46F0@VI1PR0602MB3245.eurprd06.prod.outlook.com>]
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error [not found] ` <VI1PR0602MB3245032D51A5DF45AF6E1952A46F0@VI1PR0602MB3245.eurprd06.prod.outlo ok.com> @ 2019-02-06 16:47 ` Fernando Casas Schössow 2019-02-11 3:17 ` Stefan Hajnoczi 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-06 16:47 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block I could also repro the same with virtio-scsi on the same guest a couple of hours later: 2019-02-06 07:10:37.672+0000: starting up libvirt version: 4.10.0, qemu version: 3.1.0, kernel: 4.19.18-0-vanilla, hostname: vmsvr01.homenet.local LC_ALL=C PATH=/bin:/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin HOME=/root USER=root QEMU_AUDIO_DRV=spice /home/fernando/qemu-system-x86_64 -name guest=DOCKER01,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-32-DOCKER01/master-key.aes -machine pc-i440fx-3.1,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,ss=on,vmx=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,xsaveopt=on -drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/DOCKER01_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 4705b146-3b14-4c20-923c-42105d47e7fc -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=46,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device 
ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/storage/storage-ssd-vms/virtual_machines_ssd/docker01.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -netdev tap,fd=48,id=hostnet0,vhost=on,vhostfd=50 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:af:ce,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=51,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -spice port=5904,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on 2019-02-06 07:10:37.672+0000: Domain id=32 is tainted: high-privileges char device redirected to /dev/pts/5 (label charserial0) vdev 0x5585456ef6b0 ("virtio-scsi") vq 0x5585456f90a0 (idx 2) inuse 128 vring.num 128 2019-02-06T13:00:46.942424Z qemu-system-x86_64: Virtqueue size exceeded I'm open to any tests or suggestions that can move the 
investigation forward and find the cause of this issue. Thanks. On mié, feb 6, 2019 at 8:15 AM, Fernando Casas Schössow <casasfernando@outlook.com> wrote: I can now confirm that the same happens with virtio-blk and virtio-scsi. Please find below the qemu log enhanced with the new information added by the patch provided by Stefan: vdev 0x55d22b8e10f0 ("virtio-blk") vq 0x55d22b8ebe40 (idx 0) inuse 128 vring.num 128 2019-02-06T00:40:41.742552Z qemu-system-x86_64: Virtqueue size exceeded I just changed the disk back to virtio-scsi so I can repro this again with the patched qemu and report back. Thanks. On lun, feb 4, 2019 at 8:24 AM, Fernando Casas Schössow <casasfernando@outlook.com> wrote: I can test again with qemu 3.1 but with previous versions yes, it was happening the same with both virtio-blk and virtio-scsi. For 3.1 I can confirm it happens for virtio-scsi (already tested it) and I can test with virtio-blk again if that will add value to the investigation. Also I'm attaching a guest console screenshot showing the errors displayed by the guest when it goes unresponsive in case it can help. Thanks for the patch. I will build the custom qemu binary and reproduce the issue. This may take a couple of days since I cannot reproduce it at will. Sometimes it takes 12 hours sometimes 2 days until it happens. Hopefully the code below will add more light on to this problem. Thanks, Fernando On lun, feb 4, 2019 at 7:06 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: Are you sure this happens with both virtio-blk and virtio-scsi? The following patch adds more debug output. You can build as follows: $ git clone https://git.qemu.org/git/qemu.git $ cd qemu $ patch apply -p1 ...paste the patch here... 
^D # For info on build dependencies see https://wiki.qemu.org/Hosts/Linux $ ./configure --target-list=x86_64-softmmu $ make -j4 You can configure a libvirt domain to use your custom QEMU binary by changing the <devices><emulator> tag to the qemu/x86_64-softmmu/qemu-system-x86_64 path. --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 22bd1ac34e..aa44bffa1f 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -879,6 +879,9 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); virtio_error(vdev, "Virtqueue size exceeded"); goto done; } ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-06 16:47 ` Fernando Casas Schössow @ 2019-02-11 3:17 ` Stefan Hajnoczi 2019-02-11 9:48 ` Fernando Casas Schössow 2019-02-18 7:21 ` Fernando Casas Schössow 0 siblings, 2 replies; 55+ messages in thread From: Stefan Hajnoczi @ 2019-02-11 3:17 UTC (permalink / raw) To: Fernando Casas Schössow; +Cc: qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 6120 bytes --] On Wed, Feb 06, 2019 at 04:47:19PM +0000, Fernando Casas Schössow wrote: > I could also repro the same with virtio-scsi on the same guest a couple of hours later: > > 2019-02-06 07:10:37.672+0000: starting up libvirt version: 4.10.0, qemu version: 3.1.0, kernel: 4.19.18-0-vanilla, hostname: vmsvr01.homenet.local > LC_ALL=C PATH=/bin:/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin HOME=/root USER=root QEMU_AUDIO_DRV=spice /home/fernando/qemu-system-x86_64 -name guest=DOCKER01,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-32-DOCKER01/master-key.aes -machine pc-i440fx-3.1,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,ss=on,vmx=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,xsaveopt=on -drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/DOCKER01_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 4705b146-3b14-4c20-923c-42105d47e7fc -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=46,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device 
ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/storage/storage-ssd-vms/virtual_machines_ssd/docker01.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -netdev tap,fd=48,id=hostnet0,vhost=on,vhostfd=50 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:af:ce,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=51,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -spice port=5904,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on > 2019-02-06 07:10:37.672+0000: Domain id=32 is tainted: high-privileges > char device redirected to /dev/pts/5 (label charserial0) > vdev 0x5585456ef6b0 ("virtio-scsi") > vq 0x5585456f90a0 (idx 2) > inuse 128 vring.num 128 > 2019-02-06T13:00:46.942424Z 
qemu-system-x86_64: Virtqueue size exceeded > > > I'm open to any tests or suggestions that can move the investigation forward and find the cause of this issue. Thanks for collecting the data! The fact that both virtio-blk and virtio-scsi failed suggests it's not a virtqueue element leak in the virtio-blk or virtio-scsi device emulation code. The hung task error messages from inside the guest are a consequence of QEMU hitting the "Virtqueue size exceeded" error. QEMU refuses to process further requests after the error, causing tasks inside the guest to get stuck on I/O. I don't have a good theory regarding the root cause. Two ideas: 1. The guest is corrupting the vring or submitting more requests than will fit into the ring. Somewhat unlikely because it happens with both Windows and Linux guests. 2. QEMU's virtqueue code is buggy, maybe the memory region cache which is used for fast guest RAM accesses. Here is an expanded version of the debug patch which might help identify which of these scenarios is likely. Sorry, it requires running the guest again! This time let's make QEMU dump core so both QEMU state and guest RAM are captured for further debugging. That way it will be possible to extract more information using gdb without rerunning. 
Stefan --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! */ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-11 3:17 ` Stefan Hajnoczi @ 2019-02-11 9:48 ` Fernando Casas Schössow 2019-02-18 7:21 ` Fernando Casas Schössow 1 sibling, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-11 9:48 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block Thanks for looking into this Stefan. I rebuilt Qemu with the new patch and got a couple of guests running with the new build. Two of them using virtio-scsi and another one using virtio-blk. Now I'm waiting for any of them to crash. I also set libvirt to include the guest memory in the qemu dumps as I understood you will want to look at both (qemu dump and guest memory dump). I will reply to this thread once I have any news. Kind regards. Fernando On lun, feb 11, 2019 at 4:17 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: On Wed, Feb 06, 2019 at 04:47:19PM +0000, Fernando Casas Schössow wrote: I could also repro the same with virtio-scsi on the same guest a couple of hours later: 2019-02-06 07:10:37.672+0000: starting up libvirt version: 4.10.0, qemu version: 3.1.0, kernel: 4.19.18-0-vanilla, hostname: vmsvr01.homenet.local LC_ALL=C PATH=/bin:/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin HOME=/root USER=root QEMU_AUDIO_DRV=spice /home/fernando/qemu-system-x86_64 -name guest=DOCKER01,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-32-DOCKER01/master-key.aes -machine pc-i440fx-3.1,accel=kvm,usb=off,dump-guest-core=off -cpu IvyBridge,ss=on,vmx=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,xsaveopt=on -drive file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/DOCKER01_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 4705b146-3b14-4c20-923c-42105d47e7fc -no-user-config 
-nodefaults -chardev socket,id=charmonitor,fd=46,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/storage/storage-ssd-vms/virtual_machines_ssd/docker01.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -netdev tap,fd=48,id=hostnet0,vhost=on,vhostfd=50 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:af:ce,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=51,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -spice port=5904,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object rng-random,id=objrng0,filename=/dev/random 
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on 2019-02-06 07:10:37.672+0000: Domain id=32 is tainted: high-privileges char device redirected to /dev/pts/5 (label charserial0) vdev 0x5585456ef6b0 ("virtio-scsi") vq 0x5585456f90a0 (idx 2) inuse 128 vring.num 128 2019-02-06T13:00:46.942424Z qemu-system-x86_64: Virtqueue size exceeded I'm open to any tests or suggestions that can move the investigation forward and find the cause of this issue. Thanks for collecting the data! The fact that both virtio-blk and virtio-scsi failed suggests it's not a virtqueue element leak in the virtio-blk or virtio-scsi device emulation code. The hung task error messages from inside the guest are a consequence of QEMU hitting the "Virtqueue size exceeded" error. QEMU refuses to process further requests after the error, causing tasks inside the guest to get stuck on I/O. I don't have a good theory regarding the root cause. Two ideas: 1. The guest is corrupting the vring or submitting more requests than will fit into the ring. Somewhat unlikely because it happens with both Windows and Linux guests. 2. QEMU's virtqueue code is buggy, maybe the memory region cache which is used for fast guest RAM accesses. Here is an expanded version of the debug patch which might help identify which of these scenarios is likely. Sorry, it requires running the guest again! This time let's make QEMU dump core so both QEMU state and guest RAM are captured for further debugging. That way it will be possible to extract more information using gdb without rerunning. 
Stefan --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! */ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-11 3:17 ` Stefan Hajnoczi 2019-02-11 9:48 ` Fernando Casas Schössow @ 2019-02-18 7:21 ` Fernando Casas Schössow [not found] ` <VI1PR0602MB3245424120D151F29884A7E2A4630@VI1PR0602MB3245.eurprd06.prod.outlo ok.com> 2019-02-20 16:58 ` Stefan Hajnoczi 1 sibling, 2 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-18 7:21 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block It took a few days but last night the problem was reproduced. This is the information from the log: vdev 0x55f261d940f0 ("virtio-blk") vq 0x55f261d9ee40 (idx 0) inuse 128 vring.num 128 old_shadow_avail_idx 58874 last_avail_idx 58625 avail_idx 58874 avail 0x3d87a800 avail_idx (cache bypassed) 58625 used_idx 58497 2019-02-18 03:20:08.605+0000: shutting down, reason=crashed The dump file, including guest memory, was generated successfully (after gzip the file is around 492MB). I switched the guest now to virtio-scsi to get the information and dump with this setup as well. How should we proceed? Thanks. On lun, feb 11, 2019 at 4:17 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: Thanks for collecting the data! The fact that both virtio-blk and virtio-scsi failed suggests it's not a virtqueue element leak in the virtio-blk or virtio-scsi device emulation code. The hung task error messages from inside the guest are a consequence of QEMU hitting the "Virtqueue size exceeded" error. QEMU refuses to process further requests after the error, causing tasks inside the guest to get stuck on I/O. I don't have a good theory regarding the root cause. Two ideas: 1. The guest is corrupting the vring or submitting more requests than will fit into the ring. Somewhat unlikely because it happens with both Windows and Linux guests. 2. QEMU's virtqueue code is buggy, maybe the memory region cache which is used for fast guest RAM accesses. 
Here is an expanded version of the debug patch which might help identify which of these scenarios is likely. Sorry, it requires running the guest again! This time let's make QEMU dump core so both QEMU state and guest RAM are captured for further debugging. That way it will be possible to extract more information using gdb without rerunning. Stefan --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! */ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } ^ permalink raw reply [flat|nested] 55+ messages in thread
[parent not found: <VI1PR0602MB3245424120D151F29884A7E2A4630@VI1PR0602MB3245.eurprd06.prod.outlook.com>]
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error [not found] ` <VI1PR0602MB3245424120D151F29884A7E2A4630@VI1PR0602MB3245.eurprd06.prod.outlo ok.com> @ 2019-02-19 7:26 ` Fernando Casas Schössow 0 siblings, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-19 7:26 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block Problem reproduced with virtio-scsi as well on the same guest, this time it took less than a day. Information from the log file: vdev 0x55823f119f90 ("virtio-scsi") vq 0x55823f122e80 (idx 2) inuse 128 vring.num 128 old_shadow_avail_idx 58367 last_avail_idx 58113 avail_idx 58367 avail 0x3de8a800 avail_idx (cache bypassed) 58113 used_idx 57985 2019-02-19 04:20:43.291+0000: shutting down, reason=crashed Got the dump file as well, including guest memory. Size is around 486MB after compression. Is there any other information I should collect to progress the investigation? Thanks. On lun, feb 18, 2019 at 8:21 AM, Fernando Casas Schössow <casasfernando@outlook.com> wrote: It took a few days but last night the problem was reproduced. This is the information from the log: vdev 0x55f261d940f0 ("virtio-blk") vq 0x55f261d9ee40 (idx 0) inuse 128 vring.num 128 old_shadow_avail_idx 58874 last_avail_idx 58625 avail_idx 58874 avail 0x3d87a800 avail_idx (cache bypassed) 58625 used_idx 58497 2019-02-18 03:20:08.605+0000: shutting down, reason=crashed The dump file, including guest memory, was generated successfully (after gzip the file is around 492MB). I switched the guest now to virtio-scsi to get the information and dump with this setup as well. How should we proceed? Thanks. On lun, feb 11, 2019 at 4:17 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: Thanks for collecting the data! The fact that both virtio-blk and virtio-scsi failed suggests it's not a virtqueue element leak in the virtio-blk or virtio-scsi device emulation code. 
The hung task error messages from inside the guest are a consequence of QEMU hitting the "Virtqueue size exceeded" error. QEMU refuses to process further requests after the error, causing tasks inside the guest to get stuck on I/O. I don't have a good theory regarding the root cause. Two ideas: 1. The guest is corrupting the vring or submitting more requests than will fit into the ring. Somewhat unlikely because it happens with both Windows and Linux guests. 2. QEMU's virtqueue code is buggy, maybe the memory region cache which is used for fast guest RAM accesses. Here is an expanded version of the debug patch which might help identify which of these scenarios is likely. Sorry, it requires running the guest again! This time let's make QEMU dump core so both QEMU state and guest RAM are captured for further debugging. That way it will be possible to extract more information using gdb without rerunning. Stefan --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! 
*/ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-18 7:21 ` Fernando Casas Schössow [not found] ` <VI1PR0602MB3245424120D151F29884A7E2A4630@VI1PR0602MB3245.eurprd06.prod.outlo ok.com> @ 2019-02-20 16:58 ` Stefan Hajnoczi 2019-02-20 17:53 ` Paolo Bonzini 1 sibling, 1 reply; 55+ messages in thread From: Stefan Hajnoczi @ 2019-02-20 16:58 UTC (permalink / raw) To: Paolo Bonzini; +Cc: qemu-devel, qemu-block, Fernando Casas Schössow [-- Attachment #1: Type: text/plain, Size: 4873 bytes --] On Mon, Feb 18, 2019 at 07:21:25AM +0000, Fernando Casas Schössow wrote: > It took a few days but last night the problem was reproduced. > This is the information from the log: > > vdev 0x55f261d940f0 ("virtio-blk") > vq 0x55f261d9ee40 (idx 0) > inuse 128 vring.num 128 > old_shadow_avail_idx 58874 last_avail_idx 58625 avail_idx 58874 > avail 0x3d87a800 avail_idx (cache bypassed) 58625 Hi Paolo, Are you aware of any recent MemoryRegionCache issues? The avail_idx value 58874 was read via the cache while a non-cached read produces 58625! I suspect that 58625 is correct since the vring is already full and the driver wouldn't bump avail_idx any further until requests complete. Fernando also hits this issue with virtio-scsi so it's not a virtio_blk.ko driver bug or a virtio-blk device emulation issue. A QEMU core dump is available for debugging. 
Here is the patch that produced this debug output: diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! */ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } Stefan > used_idx 58497 > 2019-02-18 03:20:08.605+0000: shutting down, reason=crashed > > The dump file, including guest memory, was generated successfully (after gzip the file is around 492MB). > I switched the guest now to virtio-scsi to get the information and dump with this setup as well. > > How should we proceed? > > Thanks. > > On lun, feb 11, 2019 at 4:17 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > Thanks for collecting the data! The fact that both virtio-blk and virtio-scsi failed suggests it's not a virtqueue element leak in the virtio-blk or virtio-scsi device emulation code. The hung task error messages from inside the guest are a consequence of QEMU hitting the "Virtqueue size exceeded" error. QEMU refuses to process further requests after the error, causing tasks inside the guest to get stuck on I/O. 
I don't have a good theory regarding the root cause. Two ideas: 1. The guest is corrupting the vring or submitting more requests than will fit into the ring. Somewhat unlikely because it happens with both Windows and Linux guests. 2. QEMU's virtqueue code is buggy, maybe the memory region cache which is used for fast guest RAM accesses. Here is an expanded version of the debug patch which might help identify which of these scenarios is likely. Sorry, it requires running the guest again! This time let's make QEMU dump core so both QEMU state and guest RAM are captured for further debugging. That way it will be possible to extract more information using gdb without rerunning. Stefan --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! */ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } > > [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-20 16:58 ` Stefan Hajnoczi @ 2019-02-20 17:53 ` Paolo Bonzini 2019-02-20 18:56 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Paolo Bonzini @ 2019-02-20 17:53 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block, Fernando Casas Schössow [-- Attachment #1: Type: text/plain, Size: 5473 bytes --] On 20/02/19 17:58, Stefan Hajnoczi wrote: > On Mon, Feb 18, 2019 at 07:21:25AM +0000, Fernando Casas Schössow wrote: >> It took a few days but last night the problem was reproduced. >> This is the information from the log: >> >> vdev 0x55f261d940f0 ("virtio-blk") >> vq 0x55f261d9ee40 (idx 0) >> inuse 128 vring.num 128 >> old_shadow_avail_idx 58874 last_avail_idx 58625 avail_idx 58874 >> avail 0x3d87a800 avail_idx (cache bypassed) 58625 > > Hi Paolo, > Are you aware of any recent MemoryRegionCache issues? The avail_idx > value 58874 was read via the cache while a non-cached read produces > 58625! > > I suspect that 58625 is correct since the vring is already full and the > driver wouldn't bump avail_idx any further until requests complete. > > Fernando also hits this issue with virtio-scsi so it's not a > virtio_blk.ko driver bug or a virtio-blk device emulation issue. No, I am not aware of any issues. How can I get the core dump (and the corresponding executable to get the symbols)? Alternatively, it should be enough to print the vq->vring.caches->avail.mrs from the debugger. Also, one possibility is to add in vring_avail_idx an assertion like assert(vq->shadow_avail_idx == virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); and try to catch the error earlier. Paolo > A QEMU core dump is available for debugging. 
> > Here is the patch that produced this debug output: > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c > index a1ff647a66..28d89fcbcb 100644 > --- a/hw/virtio/virtio.c > +++ b/hw/virtio/virtio.c > @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) > return NULL; > } > rcu_read_lock(); > + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; > if (virtio_queue_empty_rcu(vq)) { > goto done; > } > @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) > max = vq->vring.num; > > if (vq->inuse >= vq->vring.num) { > + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); > + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); > + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); > + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); > + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); > + fprintf(stderr, "used_idx %u\n", vq->used_idx); > + abort(); /* <--- core dump! */ > virtio_error(vdev, "Virtqueue size exceeded"); > goto done; > } > > Stefan > >> used_idx 58497 >> 2019-02-18 03:20:08.605+0000: shutting down, reason=crashed >> >> The dump file, including guest memory, was generated successfully (after gzip the file is around 492MB). >> I switched the guest now to virtio-scsi to get the information and dump with this setup as well. >> >> How should we proceed? >> >> Thanks. >> >> On lun, feb 11, 2019 at 4:17 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote: >> Thanks for collecting the data! The fact that both virtio-blk and virtio-scsi failed suggests it's not a virtqueue element leak in the virtio-blk or virtio-scsi device emulation code. The hung task error messages from inside the guest are a consequence of QEMU hitting the "Virtqueue size exceeded" error. 
QEMU refuses to process further requests after the error, causing tasks inside the guest to get stuck on I/O. I don't have a good theory regarding the root cause. Two ideas: 1. The guest is corrupting the vring or submitting more requests than will fit into the ring. Somewhat unlikely because it happens with both Windows and Linux guests. 2. QEMU's virtqueue code is buggy, maybe the memory region cache which is used for fast guest RAM accesses. Here is an expanded version of the debug patch which might help identify which of these scenarios is likely. Sorry, it requires running the guest again! This time let's make QEMU dump core so both QEMU state and guest RAM are captured for further debugging. That way it will be possible to extract more information using gdb without rerunning. Stefan --- diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! 
*/ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } >> >> [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 55+ messages in thread
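Paolo's proposed assertion would sit in vring_avail_idx(), so the cached read is compared against a direct guest-RAM read on every access rather than only when virtqueue_pop() fails. A sketch of where it could go, based on the QEMU 3.x shape of that function (helper and field names are assumed from that era's hw/virtio/virtio.c and may differ in other versions):

--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ static inline uint16_t vring_avail_idx(VirtQueue *vq)
 {
     VRingMemoryRegionCaches *caches = vring_get_region_caches(vq);
     hwaddr pa = offsetof(VRingAvail, idx);
     vq->shadow_avail_idx = virtio_lduw_phys_cached(vq->vdev, &caches->avail, pa);
+    /* Catch any cached vs. direct divergence at the earliest read. */
+    assert(vq->shadow_avail_idx ==
+           virtio_lduw_phys(vq->vdev,
+                            vq->vring.avail + offsetof(VRingAvail, idx)));
     return vq->shadow_avail_idx;
 }

This is a debugging aid only: the extra uncached read on every avail_idx access is far too slow for production use.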
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-20 17:53 ` Paolo Bonzini @ 2019-02-20 18:56 ` Fernando Casas Schössow 2019-02-21 11:11 ` Stefan Hajnoczi 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-20 18:56 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Stefan Hajnoczi, qemu-devel, qemu-block Hi Paolo, This is Fernando, the one who reported the issue. Regarding the dumps, I have three of them including guest memory, 2 for virtio-scsi and 1 for virtio-blk, in case a comparison may help to confirm which is the problem. I can upload them to a server you indicate, or put them on a server of mine so you can download them as you see fit. Each dump, compressed, is around 500MB. If it's more convenient for you I can try to get the requested information from gdb. But I will need some guidance since I'm not skilled enough with the debugger. Another option, if you provide me with the right patch, is for me to patch, rebuild QEMU and repro the problem again. With virtio-scsi I'm able to repro this in a matter of hours most of the times, with virtio-blk it will take a couple of days. Just let me know how you prefer to move forward. Thanks a lot for helping with this! Kind regards, Fernando On mié, feb 20, 2019 at 6:53 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: On 20/02/19 17:58, Stefan Hajnoczi wrote: On Mon, Feb 18, 2019 at 07:21:25AM +0000, Fernando Casas Schössow wrote: It took a few days but last night the problem was reproduced. This is the information from the log: vdev 0x55f261d940f0 ("virtio-blk") vq 0x55f261d9ee40 (idx 0) inuse 128 vring.num 128 old_shadow_avail_idx 58874 last_avail_idx 58625 avail_idx 58874 avail 0x3d87a800 avail_idx (cache bypassed) 58625 Hi Paolo, Are you aware of any recent MemoryRegionCache issues? The avail_idx value 58874 was read via the cache while a non-cached read produces 58625! 
I suspect that 58625 is correct since the vring is already full and the driver wouldn't bump avail_idx any further until requests complete. Fernando also hits this issue with virtio-scsi so it's not a virtio_blk.ko driver bug or a virtio-blk device emulation issue. No, I am not aware of any issues. How can I get the core dump (and the corresponding executable to get the symbols)? Alternatively, it should be enough to print the vq->vring.caches->avail.mrs from the debugger. Also, one possibility is to add in vring_avail_idx an assertion like assert(vq->shadow_availa_idx == virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); and try to catch the error earlier. Paolo A QEMU core dump is available for debugging. Here is the patch that produced this debug output: diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index a1ff647a66..28d89fcbcb 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) return NULL; } rcu_read_lock(); + uint16_t old_shadow_avail_idx = vq->shadow_avail_idx; if (virtio_queue_empty_rcu(vq)) { goto done; } @@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz) max = vq->vring.num; if (vq->inuse >= vq->vring.num) { + fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name); + fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq)); + fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num); + fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx); + fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx))); + fprintf(stderr, "used_idx %u\n", vq->used_idx); + abort(); /* <--- core dump! 
*/ virtio_error(vdev, "Virtqueue size exceeded"); goto done; } Stefan used_idx 58497 2019-02-18 03:20:08.605+0000: shutting down, reason=crashed The dump file, including guest memory, was generated successfully (after gzip the file is around 492MB). I switched the guest now to virtio-scsi to get the information and dump with this setup as well. How should we proceed? Thanks. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-20 18:56 ` Fernando Casas Schössow @ 2019-02-21 11:11 ` Stefan Hajnoczi 2019-02-21 11:33 ` Fernando Casas Schössow [not found] ` <VI1PR0602MB3245593855B029B427ED544FA47E0@VI1PR0602MB3245.eurprd06.prod.outlook.com> 0 siblings, 2 replies; 55+ messages in thread From: Stefan Hajnoczi @ 2019-02-21 11:11 UTC (permalink / raw) To: Fernando Casas Schössow; +Cc: Paolo Bonzini, qemu-devel, qemu-block [-- Attachment #1: Type: text/plain, Size: 1021 bytes --] On Wed, Feb 20, 2019 at 06:56:04PM +0000, Fernando Casas Schössow wrote: > Regarding the dumps I have three of them including guest memory, 2 for virtio-scsi, 1 for virtio-blk, in case a comparison may help to confirm which is the proble.) I can upload them to a server you indicate me or I can also put them on a server so you can download them as you see fit. Each dump, compressed, is around 500MB. Hi Fernando, It would be great if you could make a compressed coredump and the corresponding QEMU executable (hopefully with debug symbols) available on a server so Paolo and/or I can download them. If you're wondering about the debug symbols, since you built QEMU from source the binary in x86_64-softmmu/qemu-system-x86_64 should have the debug symbols. "gdb path/to/qemu-system-x86_64" will print "Reading symbols from qemu-system-x86_64...done." if symbols are available. Otherwise it will say "Reading symbols from qemu-system-x86_64...(no debugging symbols found)...done.". Thanks, Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-21 11:11 ` Stefan Hajnoczi @ 2019-02-21 11:33 ` Fernando Casas Schössow [not found] ` <VI1PR0602MB3245593855B029B427ED544FA47E0@VI1PR0602MB3245.eurprd06.prod.outlook.com> 1 sibling, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-21 11:33 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, qemu-block Hi Stefan, I can confirm that the symbols are included in the binary using gdb. I will send you and Paolo an email with the link and credentials (if needed) so you can download everything. Thanks! On jue, feb 21, 2019 at 12:11 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: On Wed, Feb 20, 2019 at 06:56:04PM +0000, Fernando Casas Schössow wrote: Regarding the dumps I have three of them including guest memory, 2 for virtio-scsi, 1 for virtio-blk, in case a comparison may help to confirm which is the proble.) I can upload them to a server you indicate me or I can also put them on a server so you can download them as you see fit. Each dump, compressed, is around 500MB. Hi Fernando, It would be great if you could make a compressed coredump and the corresponding QEMU executable (hopefully with debug symbols) available on a server so Paolo and/or I can download them. If you're wondering about the debug symbols, since you built QEMU from source the binary in x86_64-softmmu/qemu-system-x86_64 should have the debug symbols. "gdb path/to/qemu-system-x86_64" will print "Reading symbols from qemu-system-x86_64...done." if symbols are available. Otherwise it will say "Reading symbols from qemu-system-x86_64...(no debugging symbols found)...done.". Thanks, Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error [not found] ` <VI1PR0602MB324578F91F1AF9390D03022FA47F0@VI1PR0602MB3245.eurprd06.prod.outlook.com> @ 2019-02-22 14:04 ` Stefan Hajnoczi 2019-02-22 14:38 ` Paolo Bonzini ` (5 more replies) 0 siblings, 6 replies; 55+ messages in thread From: Stefan Hajnoczi @ 2019-02-22 14:04 UTC (permalink / raw) To: Fernando Casas Schössow Cc: Paolo Bonzini, Natanael Copa, Richard Henderson, qemu-devel On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow <casasfernando@outlook.com> wrote: I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. Fernando: Can you confirm that the bug occurs with an unmodified Alpine Linux qemu binary? Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add host endian unaligned access functions") introduced the unaligned memory access functions in question here. Please see below for details on the bug - basically QEMU code assumes they are atomic, but that is not guaranteed :(. Any ideas for how to fix this? Natanael: It seems likely that the qemu package in Alpine Linux suffers from a compilation issue resulting in a broken QEMU. It may be necessary to leave the compiler optimization flag alone in APKBUILD to work around this problem. Here are the details. QEMU relies on the compiler turning memcpy(&dst, &src, 2) into a load instruction in include/qemu/bswap.h:lduw_he_p() (explanation below): /* Any compiler worth its salt will turn these memcpy into native unaligned operations. Thus we don't need to play games with packed attributes, or inline byte-by-byte stores. 
*/ static inline int lduw_he_p(const void *ptr) { uint16_t r; memcpy(&r, ptr, sizeof(r)); return r; } Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that shows the load instruction: 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax 39816a: 66 89 43 32 mov %ax,0x32(%rbx) Here is the instruction sequence in the Alpine Linux binary: 455562: ba 02 00 00 00 mov $0x2,%edx 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax 455571: 66 41 89 47 32 mov %ax,0x32(%r15) It's calling memcpy instead of using a load instruction. Fernando found that QEMU's virtqueue_pop() function sees bogus values when loading a 16-bit guest RAM location. Paolo figured out that the bogus value can be produced by memcpy() when another thread is updating the 16-bit memory location simultaneously. This is a race condition between one thread loading the 16-bit value and another thread storing it (in this case a guest vcpu thread). Sometimes memcpy() may load one old byte and one new byte, resulting in a bogus value. The symptom that Fernando experienced is a "Virtqueue size exceeded" error message from QEMU and then the virtio-blk or virtio-scsi device stops working. This issue potentially affects other device emulation code in QEMU as well, not just virtio devices. For the time being, I suggest tweaking the APKBUILD so the memcpy() is not generated. Hopefully QEMU can make the code more portable in the future so the compiler always does the expected thing, but this may not be easily possible. Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:04 ` Stefan Hajnoczi @ 2019-02-22 14:38 ` Paolo Bonzini 2019-02-22 14:43 ` Fernando Casas Schössow ` (4 subsequent siblings) 5 siblings, 0 replies; 55+ messages in thread From: Paolo Bonzini @ 2019-02-22 14:38 UTC (permalink / raw) To: Stefan Hajnoczi, Fernando Casas Schössow Cc: Natanael Copa, Richard Henderson, qemu-devel On 22/02/19 15:04, Stefan Hajnoczi wrote: > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow > <casasfernando@outlook.com> wrote: > > I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. > > Fernando: Can you confirm that the bug occurs with an unmodified > Alpine Linux qemu binary? I can confirm that the memcpy calls are there, at least. Paolo > Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add > host endian unaligned access functions") introduced the unaligned > memory access functions in question here. Please see below for > details on the bug - basically QEMU code assumes they are atomic, but > that is not guaranteed :(. Any ideas for how to fix this? > > Natanael: It seems likely that the qemu package in Alpine Linux > suffers from a compilation issue resulting in a broken QEMU. It may > be necessary to leave the compiler optimization flag alone in APKBUILD > to work around this problem. > > Here are the details. QEMU relies on the compiler turning > memcpy(&dst, &src, 2) turning into a load instruction in > include/qemu/bswap.h:lduw_he_p() (explanation below): > > /* Any compiler worth its salt will turn these memcpy into native unaligned > operations. Thus we don't need to play games with packed attributes, or > inline byte-by-byte stores. 
*/ > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r; > memcpy(&r, ptr, sizeof(r)); > return r; > } > > Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that > shows the load instruction: > > 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax > 39816a: 66 89 43 32 mov %ax,0x32(%rbx) > > Here is the instruction sequence in the Alpine Linux binary: > > 455562: ba 02 00 00 00 mov $0x2,%edx > 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> > 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax > 455571: 66 41 89 47 32 mov %ax,0x32(%r15) > > It's calling memcpy instead of using a load instruction. > > Fernando found that QEMU's virtqueue_pop() function sees bogus values > when loading a 16-bit guest RAM location. Paolo figured out that the > bogus value can be produced by memcpy() when another thread is > updating the 16-bit memory location simultaneously. This is a race > condition between one thread loading the 16-bit value and another > thread storing it (in this case a guest vcpu thread). Sometimes > memcpy() may load one old byte and one new byte, resulting in a bogus > value. > > The symptom that Fernando experienced is a "Virtqueue size exceeded" > error message from QEMU and then the virtio-blk or virtio-scsi device > stops working. This issue potentially affects other device emulation > code in QEMU as well, not just virtio devices. > > For the time being, I suggest tweaking the APKBUILD so the memcpy() is > not generated. Hopefully QEMU can make the code more portable in the > future so the compiler always does the expected thing, but this may > not be easily possible. > > Stefan > ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:04 ` Stefan Hajnoczi 2019-02-22 14:38 ` Paolo Bonzini @ 2019-02-22 14:43 ` Fernando Casas Schössow 2019-02-22 14:55 ` Paolo Bonzini 2019-02-22 16:37 ` Dr. David Alan Gilbert ` (3 subsequent siblings) 5 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-22 14:43 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Fernando Casas Schössow, Paolo Bonzini, Natanael Copa, Richard Henderson, qemu-devel Hi all, Indeed I can confirm this issue occurs with the stock, unmodified QEMU binary provided with Alpine since at least Alpine 3.6 up to 3.9. Is there any compiler flag I can tweak, add or remove to rebuild qemu and try to repro to confirm a possible workaround? Thanks. Fernando Sent from my iPhone > On 22 Feb 2019, at 15:04, Stefan Hajnoczi <stefanha@gmail.com> wrote: > > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow > <casasfernando@outlook.com> wrote: > > I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. > > Fernando: Can you confirm that the bug occurs with an unmodified > Alpine Linux qemu binary? > > Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add > host endian unaligned access functions") introduced the unaligned > memory access functions in question here. Please see below for > details on the bug - basically QEMU code assumes they are atomic, but > that is not guaranteed :(. Any ideas for how to fix this? > > Natanael: It seems likely that the qemu package in Alpine Linux > suffers from a compilation issue resulting in a broken QEMU. It may > be necessary to leave the compiler optimization flag alone in APKBUILD > to work around this problem. > > Here are the details. 
QEMU relies on the compiler turning > memcpy(&dst, &src, 2) turning into a load instruction in > include/qemu/bswap.h:lduw_he_p() (explanation below): > > /* Any compiler worth its salt will turn these memcpy into native unaligned > operations. Thus we don't need to play games with packed attributes, or > inline byte-by-byte stores. */ > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r; > memcpy(&r, ptr, sizeof(r)); > return r; > } > > Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that > shows the load instruction: > > 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax > 39816a: 66 89 43 32 mov %ax,0x32(%rbx) > > Here is the instruction sequence in the Alpine Linux binary: > > 455562: ba 02 00 00 00 mov $0x2,%edx > 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> > 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax > 455571: 66 41 89 47 32 mov %ax,0x32(%r15) > > It's calling memcpy instead of using a load instruction. > > Fernando found that QEMU's virtqueue_pop() function sees bogus values > when loading a 16-bit guest RAM location. Paolo figured out that the > bogus value can be produced by memcpy() when another thread is > updating the 16-bit memory location simultaneously. This is a race > condition between one thread loading the 16-bit value and another > thread storing it (in this case a guest vcpu thread). Sometimes > memcpy() may load one old byte and one new byte, resulting in a bogus > value. > > The symptom that Fernando experienced is a "Virtqueue size exceeded" > error message from QEMU and then the virtio-blk or virtio-scsi device > stops working. This issue potentially affects other device emulation > code in QEMU as well, not just virtio devices. > > For the time being, I suggest tweaking the APKBUILD so the memcpy() is > not generated. Hopefully QEMU can make the code more portable in the > future so the compiler always does the expected thing, but this may > not be easily possible. 
> > Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:43 ` Fernando Casas Schössow @ 2019-02-22 14:55 ` Paolo Bonzini 2019-02-22 15:48 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Paolo Bonzini @ 2019-02-22 14:55 UTC (permalink / raw) To: Fernando Casas Schössow, Stefan Hajnoczi Cc: Natanael Copa, Richard Henderson, qemu-devel On 22/02/19 15:43, Fernando Casas Schössow wrote: > Hi all, > > Indeed I can confirm this issue occurs with the stock, unmodified > QEMU binary provided with Alpine since at least Alpine 3.6 up to > 3.9. > > Is there any compiler flag I can tweak, add or remove to rebuild qemu > and try to repro to confirm a possible workaround? Nope, not yet. However, I can try some experiments if you can provide some information on how to rebuild an apk. Paolo > Thanks. > > Fernando > > Sent from my iPhone > >> On 22 Feb 2019, at 15:04, Stefan Hajnoczi <stefanha@gmail.com> wrote: >> >> On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow >> <casasfernando@outlook.com> wrote: >> >> I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. >> >> Fernando: Can you confirm that the bug occurs with an unmodified >> Alpine Linux qemu binary? >> >> Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add >> host endian unaligned access functions") introduced the unaligned >> memory access functions in question here. Please see below for >> details on the bug - basically QEMU code assumes they are atomic, but >> that is not guaranteed :(. Any ideas for how to fix this? >> >> Natanael: It seems likely that the qemu package in Alpine Linux >> suffers from a compilation issue resulting in a broken QEMU. It may >> be necessary to leave the compiler optimization flag alone in APKBUILD >> to work around this problem. >> >> Here are the details. 
QEMU relies on the compiler turning >> memcpy(&dst, &src, 2) turning into a load instruction in >> include/qemu/bswap.h:lduw_he_p() (explanation below): >> >> /* Any compiler worth its salt will turn these memcpy into native unaligned >> operations. Thus we don't need to play games with packed attributes, or >> inline byte-by-byte stores. */ >> >> static inline int lduw_he_p(const void *ptr) >> { >> uint16_t r; >> memcpy(&r, ptr, sizeof(r)); >> return r; >> } >> >> Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that >> shows the load instruction: >> >> 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax >> 39816a: 66 89 43 32 mov %ax,0x32(%rbx) >> >> Here is the instruction sequence in the Alpine Linux binary: >> >> 455562: ba 02 00 00 00 mov $0x2,%edx >> 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> >> 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax >> 455571: 66 41 89 47 32 mov %ax,0x32(%r15) >> >> It's calling memcpy instead of using a load instruction. >> >> Fernando found that QEMU's virtqueue_pop() function sees bogus values >> when loading a 16-bit guest RAM location. Paolo figured out that the >> bogus value can be produced by memcpy() when another thread is >> updating the 16-bit memory location simultaneously. This is a race >> condition between one thread loading the 16-bit value and another >> thread storing it (in this case a guest vcpu thread). Sometimes >> memcpy() may load one old byte and one new byte, resulting in a bogus >> value. >> >> The symptom that Fernando experienced is a "Virtqueue size exceeded" >> error message from QEMU and then the virtio-blk or virtio-scsi device >> stops working. This issue potentially affects other device emulation >> code in QEMU as well, not just virtio devices. >> >> For the time being, I suggest tweaking the APKBUILD so the memcpy() is >> not generated. 
Hopefully QEMU can make the code more portable in the >> future so the compiler always does the expected thing, but this may >> not be easily possible. >> >> Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:55 ` Paolo Bonzini @ 2019-02-22 15:48 ` Fernando Casas Schössow 0 siblings, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-22 15:48 UTC (permalink / raw) To: Paolo Bonzini Cc: Stefan Hajnoczi, Natanael Copa, Richard Henderson, qemu-devel You can find the information here: https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package For your convenience, I can set up an Alpine VM and give you SSH access with the build environment and the aports tree already set up. The summary steps, in case you want to try on your own container/VM, would be: 1- Set up the build environment as described in the wiki. 2- Clone the aports tree: git clone https://github.com/alpinelinux/aports 3- Go into the qemu package directory in the aports tree: cd aports/main/qemu 4- Run: abuild clean checksum unpack prepare deps build This will get you the binary built. If you need to add a patch, create the patch file, add it to the APKBUILD file, and then run "abuild checksum" to compute its checksum and add it to the APKBUILD file. abuild quick ref: abuild clean: Remove temp build and install dirs abuild unpack: Unpack sources to $srcdir abuild checksum: Generate checksum to be included in APKBUILD abuild prepare: Apply patches abuild deps: Install packages listed in makedepends and depends abuild build: Compile and install package into $pkgdir Once finished, to clean up everything run: abuild undeps clean Thanks. Fernando On Fri, Feb 22, 2019 at 3:55 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: On 22/02/19 15:43, Fernando Casas Schössow wrote: Hi all, Indeed I can confirm this issue occurs with the stock, unmodified QEMU binary provided with Alpine since at least Alpine 3.6 up to 3.9. Is there any compiler flag I can tweak, add or remove to rebuild qemu and try to repro to confirm a possible workaround? Nope, not yet.
However, I can try some experiments if you can provide some information on how to rebuild an apk. Paolo ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:04 ` Stefan Hajnoczi 2019-02-22 14:38 ` Paolo Bonzini 2019-02-22 14:43 ` Fernando Casas Schössow @ 2019-02-22 16:37 ` Dr. David Alan Gilbert 2019-02-22 16:39 ` Paolo Bonzini 2019-02-23 11:49 ` Natanael Copa ` (2 subsequent siblings) 5 siblings, 1 reply; 55+ messages in thread From: Dr. David Alan Gilbert @ 2019-02-22 16:37 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Fernando Casas Schössow, Paolo Bonzini, Richard Henderson, qemu-devel, Natanael Copa * Stefan Hajnoczi (stefanha@gmail.com) wrote: > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow > <casasfernando@outlook.com> wrote: > > I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. > > Fernando: Can you confirm that the bug occurs with an unmodified > Alpine Linux qemu binary? > > Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add > host endian unaligned access functions") introduced the unaligned > memory access functions in question here. Please see below for > details on the bug - basically QEMU code assumes they are atomic, but > that is not guaranteed :(. Any ideas for how to fix this? Why does vring_avail_idx use virtio_ld*u*w_phys_cached? (similar for vring_avail_ring). There's no generically safe way to do unaligned atomic loads - don't we know in virtio that these are naturally aligned? Dave > Natanael: It seems likely that the qemu package in Alpine Linux > suffers from a compilation issue resulting in a broken QEMU. It may > be necessary to leave the compiler optimization flag alone in APKBUILD > to work around this problem. > > Here are the details. QEMU relies on the compiler turning > memcpy(&dst, &src, 2) turning into a load instruction in > include/qemu/bswap.h:lduw_he_p() (explanation below): > > /* Any compiler worth its salt will turn these memcpy into native unaligned > operations. 
Thus we don't need to play games with packed attributes, or > inline byte-by-byte stores. */ > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r; > memcpy(&r, ptr, sizeof(r)); > return r; > } > > Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that > shows the load instruction: > > 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax > 39816a: 66 89 43 32 mov %ax,0x32(%rbx) > > Here is the instruction sequence in the Alpine Linux binary: > > 455562: ba 02 00 00 00 mov $0x2,%edx > 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> > 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax > 455571: 66 41 89 47 32 mov %ax,0x32(%r15) > > It's calling memcpy instead of using a load instruction. > > Fernando found that QEMU's virtqueue_pop() function sees bogus values > when loading a 16-bit guest RAM location. Paolo figured out that the > bogus value can be produced by memcpy() when another thread is > updating the 16-bit memory location simultaneously. This is a race > condition between one thread loading the 16-bit value and another > thread storing it (in this case a guest vcpu thread). Sometimes > memcpy() may load one old byte and one new byte, resulting in a bogus > value. > > The symptom that Fernando experienced is a "Virtqueue size exceeded" > error message from QEMU and then the virtio-blk or virtio-scsi device > stops working. This issue potentially affects other device emulation > code in QEMU as well, not just virtio devices. > > For the time being, I suggest tweaking the APKBUILD so the memcpy() is > not generated. Hopefully QEMU can make the code more portable in the > future so the compiler always does the expected thing, but this may > not be easily possible. > > Stefan > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 16:37 ` Dr. David Alan Gilbert @ 2019-02-22 16:39 ` Paolo Bonzini 2019-02-22 16:47 ` Dr. David Alan Gilbert 0 siblings, 1 reply; 55+ messages in thread From: Paolo Bonzini @ 2019-02-22 16:39 UTC (permalink / raw) To: Dr. David Alan Gilbert, Stefan Hajnoczi Cc: Fernando Casas Schössow, Richard Henderson, qemu-devel, Natanael Copa On 22/02/19 17:37, Dr. David Alan Gilbert wrote: > Why does vring_avail_idx use virtio_ld*u*w_phys_cached? > (similar for vring_avail_ring). > There's no generically safe way to do unaligned atomic loads - don't we know > in virtio that these are naturally aligned? u is for unsigned. :) We know that these are aligned and so they will be atomic. Paolo ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 16:39 ` Paolo Bonzini @ 2019-02-22 16:47 ` Dr. David Alan Gilbert 0 siblings, 0 replies; 55+ messages in thread From: Dr. David Alan Gilbert @ 2019-02-22 16:47 UTC (permalink / raw) To: Paolo Bonzini Cc: Stefan Hajnoczi, Fernando Casas Schössow, Richard Henderson, qemu-devel, Natanael Copa * Paolo Bonzini (pbonzini@redhat.com) wrote: > On 22/02/19 17:37, Dr. David Alan Gilbert wrote: > > Why does vring_avail_idx use virtio_ld*u*w_phys_cached? > > (similar for vring_avail_ring). > > There's no generically safe way to do unaligned atomic loads - don't we know > > in virtio that these are naturally aligned? > > u is for unsigned. :) We know that these are aligned and so they will > be atomic. Ah OK! Dave > Paolo -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:04 ` Stefan Hajnoczi ` (2 preceding siblings ...) 2019-02-22 16:37 ` Dr. David Alan Gilbert @ 2019-02-23 11:49 ` Natanael Copa 2019-02-26 13:30 ` Paolo Bonzini 2019-02-23 15:55 ` Natanael Copa 2019-02-23 16:57 ` Peter Maydell 5 siblings, 1 reply; 55+ messages in thread From: Natanael Copa @ 2019-02-23 11:49 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Fernando Casas Schössow, Paolo Bonzini, Richard Henderson, qemu-devel On Fri, 22 Feb 2019 14:04:20 +0000 Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow > <casasfernando@outlook.com> wrote: > > I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. Hi! ... > Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add > host endian unaligned access functions") introduced the unaligned > memory access functions in question here. Please see below for > details on the bug - basically QEMU code assumes they are atomic, but > that is not guaranteed :(. Any ideas for how to fix this? > > Natanael: It seems likely that the qemu package in Alpine Linux > suffers from a compilation issue resulting in a broken QEMU. It may > be necessary to leave the compiler optimization flag alone in APKBUILD > to work around this problem. > > Here are the details. QEMU relies on the compiler turning > memcpy(&dst, &src, 2) turning into a load instruction in > include/qemu/bswap.h:lduw_he_p() (explanation below): > > /* Any compiler worth its salt will turn these memcpy into native unaligned > operations. Thus we don't need to play games with packed attributes, or > inline byte-by-byte stores. 
*/ > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r; > memcpy(&r, ptr, sizeof(r)); > return r; > } > > Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that > shows the load instruction: > > 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax > 39816a: 66 89 43 32 mov %ax,0x32(%rbx) > > Here is the instruction sequence in the Alpine Linux binary: > > 455562: ba 02 00 00 00 mov $0x2,%edx > 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> > 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax > 455571: 66 41 89 47 32 mov %ax,0x32(%r15) > > It's calling memcpy instead of using a load instruction. My first reaction to this is: If the intention is to not actually call memcpy function, then maybe memcpy should not be used in the C code in first place? > Fernando found that QEMU's virtqueue_pop() function sees bogus values > when loading a 16-bit guest RAM location. Paolo figured out that the > bogus value can be produced by memcpy() when another thread is > updating the 16-bit memory location simultaneously. This is a race > condition between one thread loading the 16-bit value and another > thread storing it (in this case a guest vcpu thread). Sometimes > memcpy() may load one old byte and one new byte, resulting in a bogus > value. > > The symptom that Fernando experienced is a "Virtqueue size exceeded" > error message from QEMU and then the virtio-blk or virtio-scsi device > stops working. This issue potentially affects other device emulation > code in QEMU as well, not just virtio devices. > > For the time being, I suggest tweaking the APKBUILD so the memcpy() is > not generated. Hopefully QEMU can make the code more portable in the > future so the compiler always does the expected thing, but this may > not be easily possible. 
I suspect this happens because the Alpine toolchain enables _FORTIFY_SOURCE=2 by default, combined with the way this is implemented via fortify-headers: http://git.2f30.org/fortify-headers/file/include/string.h.html#l39 Try building with -U_FORTIFY_SOURCE > > Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-23 11:49 ` Natanael Copa @ 2019-02-26 13:30 ` Paolo Bonzini 2019-02-28 7:35 ` Fernando Casas Schössow 0 siblings, 1 reply; 55+ messages in thread From: Paolo Bonzini @ 2019-02-26 13:30 UTC (permalink / raw) To: Natanael Copa, Stefan Hajnoczi Cc: Fernando Casas Schössow, Richard Henderson, qemu-devel On 23/02/19 12:49, Natanael Copa wrote: > I suspect this happens due to the Alpine toolchain will enable > _FORTIFY_SOURCE=2 by default and the way this is implemented via fortify-headers: > http://git.2f30.org/fortify-headers/file/include/string.h.html#l39 The call to __orig_memcpy is the culprit there, is there any reason not to do something like _FORTIFY_FN(memcpy) void *__memcpy_chk(void *__od, const void *__os, size_t __n) { size_t __bd = __builtin_object_size(__od, 0); size_t __bs = __builtin_object_size(__os, 0); char *__d = (char *)__od; const char *__s = (const char *)__os; /* trap if pointers are overlapping but not if dst == src. * gcc seems to like to generate code that relies on dst == src */ if ((__d < __s && __d + __n > __s) || (__s < __d && __s + __n > __d)) __builtin_trap(); if (__n > __bd || __n > __bs) __builtin_trap(); return memcpy(__od, __os, __n); } #define memcpy __memcpy_chk ? That is, getting rid of _FORTIFY_ORIG altogether. Paolo ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-26 13:30 ` Paolo Bonzini @ 2019-02-28 7:35 ` Fernando Casas Schössow 0 siblings, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-28 7:35 UTC (permalink / raw) To: Paolo Bonzini Cc: Natanael Copa, Stefan Hajnoczi, Richard Henderson, qemu-devel Just wanted to share a small update on the situation after updating QEMU to the new Alpine package patched with Natanael's patch. So far so good, moreover I switched a few other guests from SATA to VirtIO SCSI and after two days no issues. Unless I find any problem I will report back with an update in a week from now. Thanks everyone for all you did to help find a solution to this issue. On mar, feb 26, 2019 at 2:30 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: On 23/02/19 12:49, Natanael Copa wrote: I suspect this happens due to the Alpine toolchain will enable _FORTIFY_SOURCE=2 by default and the way this is implemented via fortify-headers: http://git.2f30.org/fortify-headers/file/include/string.h.html#l39 The call to __orig_memcpy is the culprit there, is there any reason not to do something like _FORTIFY_FN(memcpy) void *__memcpy_chk(void *__od, const void *__os, size_t __n) { size_t __bd = __builtin_object_size(__od, 0); size_t __bs = __builtin_object_size(__os, 0); char *__d = (char *)__od; const char *__s = (const char *)__os; /* trap if pointers are overlapping but not if dst == src. * gcc seems to like to generate code that relies on dst == src */ if ((__d < __s && __d + __n > __s) || (__s < __d && __s + __n > __d)) __builtin_trap(); if (__n > __bd || __n > __bs) __builtin_trap(); return memcpy(__od, __os, __n); } #define memcpy __memcpy_chk ? That is, getting rid of _FORTIFY_ORIG altogether. Paolo ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-22 14:04 ` Stefan Hajnoczi ` (3 preceding siblings ...) 2019-02-23 11:49 ` Natanael Copa @ 2019-02-23 15:55 ` Natanael Copa 2019-02-23 16:18 ` Peter Maydell ` (2 more replies) 2019-02-23 16:57 ` Peter Maydell 5 siblings, 3 replies; 55+ messages in thread From: Natanael Copa @ 2019-02-23 15:55 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Fernando Casas Schössow, Paolo Bonzini, Richard Henderson, qemu-devel On Fri, 22 Feb 2019 14:04:20 +0000 Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow > <casasfernando@outlook.com> wrote: > > I have CCed Natanael Copa, qemu package maintainer in Alpine Linux. > > Fernando: Can you confirm that the bug occurs with an unmodified > Alpine Linux qemu binary? I wonder exactly which binary it is. What architecture (x86 or x86_64), and which binary? Is it a dynamically or statically linked qemu-system-x86_64? > Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add > host endian unaligned access functions") introduced the unaligned > memory access functions in question here. Please see below for > details on the bug - basically QEMU code assumes they are atomic, but > that is not guaranteed :(. Any ideas for how to fix this? > > Natanael: It seems likely that the qemu package in Alpine Linux > suffers from a compilation issue resulting in a broken QEMU. It may > be necessary to leave the compiler optimization flag alone in APKBUILD > to work around this problem. > > Here are the details. QEMU relies on the compiler turning > memcpy(&dst, &src, 2) turning into a load instruction in > include/qemu/bswap.h:lduw_he_p() (explanation below): > > /* Any compiler worth its salt will turn these memcpy into native unaligned > operations. Thus we don't need to play games with packed attributes, or > inline byte-by-byte stores.
*/ > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r; > memcpy(&r, ptr, sizeof(r)); > return r; > } > > Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that > shows the load instruction: > > 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax > 39816a: 66 89 43 32 mov %ax,0x32(%rbx) > > Here is the instruction sequence in the Alpine Linux binary: > > 455562: ba 02 00 00 00 mov $0x2,%edx > 455567: e8 74 24 f3 ff callq 3879e0 <memcpy@plt> > 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax > 455571: 66 41 89 47 32 mov %ax,0x32(%r15) > > It's calling memcpy instead of using a load instruction. I tried to find this section. How do you get the assembly listing of the relevant section? I tried to do "disas virtio_pop" from `gdb /usr/bin/qemu-system-x86_64` from the binary in alpine edge. I could find 2 memcpy calls but none of them looks like a 16-bit operation after: 0x00000000004551f1 <+353>: mov 0x10(%rsp),%rdi 0x00000000004551f6 <+358>: mov $0x10,%edx 0x00000000004551fb <+363>: callq 0x3879e0 <memcpy@plt> 0x0000000000455200 <+368>: movzwl 0x5c(%rsp),%eax 0x0000000000455205 <+373>: test $0x4,%al 0x0000000000455207 <+375>: je 0x4552aa <virtqueue_pop+538> .... 0x0000000000455291 <+513>: mov 0x10(%rsp),%rdi 0x0000000000455296 <+518>: mov $0x10,%edx 0x000000000045529b <+523>: callq 0x3879e0 <memcpy@plt> 0x00000000004552a0 <+528>: mov %rbp,0x20(%rsp) 0x00000000004552a5 <+533>: movzwl 0x5c(%rsp),%eax 0x00000000004552aa <+538>: lea 0x20e0(%rsp),%rdi 0x00000000004552b2 <+546>: xor %r11d,%r11d 0x00000000004552b5 <+549>: mov %r15,0x38(%rsp) > > Fernando found that QEMU's virtqueue_pop() function sees bogus values > when loading a 16-bit guest RAM location. Paolo figured out that the > bogus value can be produced by memcpy() when another thread is > updating the 16-bit memory location simultaneously. This is a race > condition between one thread loading the 16-bit value and another > thread storing it (in this case a guest vcpu thread).
Sometimes > memcpy() may load one old byte and one new byte, resulting in a bogus > value. > > The symptom that Fernando experienced is a "Virtqueue size exceeded" > error message from QEMU and then the virtio-blk or virtio-scsi device > stops working. This issue potentially affects other device emulation > code in QEMU as well, not just virtio devices. > > For the time being, I suggest tweaking the APKBUILD so the memcpy() is > not generated. Hopefully QEMU can make the code more portable in the > future so the compiler always does the expected thing, but this may > not be easily possible. I was thinking of something along the lines of: typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t; static inline int lduw_he_p(const void *ptr) { volatile_uint16_t r = *(volatile_uint16_t*)ptr; return r; } I can test different CFLAGS with and without _FORTIFY_SOURCE and with different variants of memcpy (like __builtin_memcpy etc.) but I need to find a way to get the correct assembly output so I know if/when I have found something that works. Thanks! -nc > > Stefan ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-23 15:55 ` Natanael Copa @ 2019-02-23 16:18 ` Peter Maydell 2019-02-25 10:24 ` Natanael Copa 2019-02-23 16:21 ` Fernando Casas Schössow 2019-02-25 10:30 ` Stefan Hajnoczi 2 siblings, 1 reply; 55+ messages in thread From: Peter Maydell @ 2019-02-23 16:18 UTC (permalink / raw) To: Natanael Copa Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Sat, 23 Feb 2019 at 16:05, Natanael Copa <ncopa@alpinelinux.org> wrote: > I was thinking of something in the lines of: > > typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t; > static inline int lduw_he_p(const void *ptr) > { > volatile_uint16_t r = *(volatile_uint16_t*)ptr; > return r; > } This won't correctly handle accesses with unaligned pointers, I'm afraid. We rely on these functions correctly working with pointers that are potentially unaligned. thanks -- PMM ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-23 16:18 ` Peter Maydell @ 2019-02-25 10:24 ` Natanael Copa 2019-02-25 10:34 ` Peter Maydell 0 siblings, 1 reply; 55+ messages in thread From: Natanael Copa @ 2019-02-25 10:24 UTC (permalink / raw) To: Peter Maydell Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Sat, 23 Feb 2019 16:18:15 +0000 Peter Maydell <peter.maydell@linaro.org> wrote: > On Sat, 23 Feb 2019 at 16:05, Natanael Copa <ncopa@alpinelinux.org> wrote: > > I was thinking of something in the lines of: > > > > typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t; > > static inline int lduw_he_p(const void *ptr) > > { > > volatile_uint16_t r = *(volatile_uint16_t*)ptr; > > return r; > > } > > This won't correctly handle accesses with unaligned pointers, > I'm afraid. We rely on these functions correctly working > with pointers that are potentially unaligned. Well, current code does not even handle access with aligned pointers, depending on FORTIFY_SOURCE implementation. My thinking here is that we depend on assumption that compiler will remove the memcpy call, so maybe find other way to generate same assembly, while still depend on compiler optimization. I did some tests and compared the assembly output. The compiler will generate same assembly if volatile is not used. The attribute __may_alias__ does not seem to make any difference. So the following function generates exactly same assembly: static inline int lduw_he_p(const void *ptr) { uint16_t r = *(uint16_t*)ptr; return r; } I created a script to compare on different Linux distros and architectures. My conclusion is: - Alpine fortify implementation is non-optimal, but technically correct. - `uint16_t r = *(uint16_t*)ptr;` generates the same assembly code, regardless of fortify implementation, and is clearer on the intention, than use of memcpy. 
test.sh: ---[CUT HERE]---8<-------------------------------------------------- #!/bin/sh -x cat > testcase.c <<EOF #include <stdint.h> #ifdef WITH_MEMCPY #include <string.h> static inline int lduw_he_p(const void *ptr) { uint16_t r; memcpy(&r, ptr, sizeof(r)); return r; } #else static inline int lduw_he_p(const void *ptr) { uint16_t r = *(uint16_t*)ptr; return r; } #endif int main(int argc, char *argv[]) { void *p = argv; int i; p++; // make sure we are unaligned i=lduw_he_p(p); return i; } EOF : ${CC:=gcc} $CC --version uname -m for fortify in "-D_FORTIFY_SOURCE=2" "-U_FORTIFY_SOURCE"; do for cflag in "-DWITH_MEMCPY" "-DWITHOUT_MEMCPY"; do $CC $cflag $fortify $@ -o testcase$cflag testcase.c gdb --batch -ex "disas ${func:-main}" ./testcase$cflag done done ---[CUT HERE]---8<-------------------------------------------------- Output on Alpine: $ sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (Alpine 8.2.0) 8.2.0 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+ uname -m x86_64 + gcc -DWITH_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000000000001070 <+0>: sub $0x18,%rsp 0x0000000000001074 <+4>: mov %fs:0x28,%rax 0x000000000000107d <+13>: mov %rax,0x8(%rsp) 0x0000000000001082 <+18>: xor %eax,%eax 0x0000000000001084 <+20>: lea 0x6(%rsp),%rdi 0x0000000000001089 <+25>: lea 0x1(%rsi),%rax 0x000000000000108d <+29>: lea 0x2(%rdi),%rdx 0x0000000000001091 <+33>: cmp %rdx,%rax 0x0000000000001094 <+36>: jae 0x109b <main+43> 0x0000000000001096 <+38>: cmp %rdi,%rax 0x0000000000001099 <+41>: ja 0x10d0 <main+96> 0x000000000000109b <+43>: cmp %rdi,%rax 0x000000000000109e <+46>: jb 0x10c7 <main+87> 0x00000000000010a0 <+48>: mov $0x2,%edx 0x00000000000010a5 <+53>: mov %rax,%rsi 0x00000000000010a8 <+56>: callq 0x1020 <memcpy@plt> 0x00000000000010ad <+61>: movzwl 0x6(%rsp),%eax 0x00000000000010b2 <+66>: mov 0x8(%rsp),%rcx 0x00000000000010b7 <+71>: xor %fs:0x28,%rcx 0x00000000000010c0 <+80>: jne 0x10d2 <main+98> 0x00000000000010c2 <+82>: add $0x18,%rsp 0x00000000000010c6 <+86>: retq 0x00000000000010c7 <+87>: add $0x3,%rsi 0x00000000000010cb <+91>: cmp %rsi,%rdi 0x00000000000010ce <+94>: jae 0x10a0 <main+48> 0x00000000000010d0 <+96>: ud2 0x00000000000010d2 <+98>: callq 0x1030 <__stack_chk_fail@plt> End of assembler dump. + gcc -DWITHOUT_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000000000001050 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000001054 <+4>: retq End of assembler dump. + gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000000000001050 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000001054 <+4>: retq End of assembler dump. 
+ gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000000000001050 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000001054 <+4>: retq End of assembler dump. Output on Debian: root@19c3f13c6023:~# sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516 Copyright (C) 2016 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + uname -m x86_64 + gcc -DWITH_MEMCPY -D_FORTIFY_SOURCE=2 -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex disas main ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000000000000530 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000000534 <+4>: retq End of assembler dump. + gcc -DWITHOUT_MEMCPY -D_FORTIFY_SOURCE=2 -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex disas main ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000000000000530 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000000534 <+4>: retq End of assembler dump. + gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex disas main ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000000000000530 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000000534 <+4>: retq End of assembler dump. + gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex disas main ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000000000000530 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000000534 <+4>: retq End of assembler dump. root@19c3f13c6023:~# Output on Fedora: [root@ac68847f5bb1 ~]# sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (GCC) 8.2.1 20181215 (Red Hat 8.2.1-6) Copyright (C) 2018 Free Software Foundation, Inc. 
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + uname -m x86_64 + for fortify in "-D_FORTIFY_SOURCE=2" "-U_FORTIFY_SOURCE" + for cflag in "-DWITH_MEMCPY" "-DWITHOUT_MEMCPY" + gcc -DWITH_MEMCPY -D_FORTIFY_SOURCE=2 -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000000000401020 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000401024 <+4>: retq End of assembler dump. + for cflag in "-DWITH_MEMCPY" "-DWITHOUT_MEMCPY" + gcc -DWITHOUT_MEMCPY -D_FORTIFY_SOURCE=2 -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000000000401020 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000401024 <+4>: retq End of assembler dump. + for fortify in "-D_FORTIFY_SOURCE=2" "-U_FORTIFY_SOURCE" + for cflag in "-DWITH_MEMCPY" "-DWITHOUT_MEMCPY" + gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000000000401020 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000401024 <+4>: retq End of assembler dump. + for cflag in "-DWITH_MEMCPY" "-DWITHOUT_MEMCPY" + gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000000000401020 <+0>: movzwl 0x1(%rsi),%eax 0x0000000000401024 <+4>: retq End of assembler dump. 
Output on Alpine with clang: $ CC=clang sh -x test.sh -O2 + cat + : clang + clang --version Alpine clang version 5.0.2 (tags/RELEASE_502/final) (based on LLVM 5.0.2) Target: x86_64-alpine-linux-musl Thread model: posix InstalledDir: /usr/bin + uname -m x86_64 + clang -DWITH_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000011a0 <+0>: movzwl 0x1(%rsi),%eax 0x00000000000011a4 <+4>: retq End of assembler dump. + clang -DWITHOUT_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000011a0 <+0>: movzwl 0x1(%rsi),%eax 0x00000000000011a4 <+4>: retq End of assembler dump. + clang -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000011a0 <+0>: movzwl 0x1(%rsi),%eax 0x00000000000011a4 <+4>: retq End of assembler dump. + clang -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000011a0 <+0>: movzwl 0x1(%rsi),%eax 0x00000000000011a4 <+4>: retq End of assembler dump. Output on alpine armv7: $ sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (Alpine 8.2.0) 8.2.0 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+ uname -m armv8l + gcc -DWITH_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x000003f8 <+0>: push {r4, r5, lr} 0x000003fa <+2>: sub sp, #12 0x000003fc <+4>: adds r3, r1, #1 0x000003fe <+6>: add r4, sp, #4 0x00000400 <+8>: cmp r3, r4 0x00000402 <+10>: add.w r0, sp, #2 0x00000406 <+14>: ite cs 0x00000408 <+16>: movcs r2, #0 0x0000040a <+18>: movcc r2, #1 0x0000040c <+20>: cmp r3, r0 0x0000040e <+22>: it ls 0x00000410 <+24>: movls r2, #0 0x00000412 <+26>: ldr r4, [pc, #52] ; (0x448 <main+80>) 0x00000414 <+28>: ldr r5, [pc, #52] ; (0x44c <main+84>) 0x00000416 <+30>: add r4, pc 0x00000418 <+32>: ldr r4, [r4, r5] 0x0000041a <+34>: ldr r5, [r4, #0] 0x0000041c <+36>: str r5, [sp, #4] 0x0000041e <+38>: cbnz r2, 0x442 <main+74> 0x00000420 <+40>: cmp r3, r0 0x00000422 <+42>: bcc.n 0x43c <main+68> 0x00000424 <+44>: mov r1, r3 0x00000426 <+46>: movs r2, #2 0x00000428 <+48>: blx 0x3b0 <memcpy@plt> 0x0000042c <+52>: ldr r2, [sp, #4] 0x0000042e <+54>: ldr r3, [r4, #0] 0x00000430 <+56>: ldrh.w r0, [sp, #2] 0x00000434 <+60>: cmp r2, r3 0x00000436 <+62>: bne.n 0x444 <main+76> 0x00000438 <+64>: add sp, #12 0x0000043a <+66>: pop {r4, r5, pc} 0x0000043c <+68>: adds r1, #3 0x0000043e <+70>: cmp r0, r1 0x00000440 <+72>: bcs.n 0x424 <main+44> 0x00000442 <+74>: udf #255 ; 0xff 0x00000444 <+76>: blx 0x3c8 <__stack_chk_fail@plt> 0x00000448 <+80>: muleq r1, lr, r11 0x0000044c <+84>: andeq r0, r0, r4, lsr #32 End of assembler dump. + gcc -DWITHOUT_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000036c <+0>: ldrh.w r0, [r1, #1] 0x00000370 <+4>: bx lr End of assembler dump. 
+ gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x0000036c <+0>: ldrh.w r0, [r1, #1] 0x00000370 <+4>: bx lr End of assembler dump. + gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x0000036c <+0>: ldrh.w r0, [r1, #1] 0x00000370 <+4>: bx lr End of assembler dump. Output on Alpine aarch64: $ sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (Alpine 8.2.0) 8.2.0 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + uname -m aarch64 + gcc -DWITH_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000006b0 <+0>: stp x29, x30, [sp, #-48]! 
0x00000000000006b4 <+4>: add x2, x1, #0x1 0x00000000000006b8 <+8>: mov x29, sp 0x00000000000006bc <+12>: str x19, [sp, #16] 0x00000000000006c0 <+16>: adrp x19, 0x10000 0x00000000000006c4 <+20>: add x0, sp, #0x26 0x00000000000006c8 <+24>: ldr x3, [x19, #4024] 0x00000000000006cc <+28>: add x4, x0, #0x2 0x00000000000006d0 <+32>: cmp x4, x2 0x00000000000006d4 <+36>: ldr x4, [x3] 0x00000000000006d8 <+40>: str x4, [sp, #40] 0x00000000000006dc <+44>: mov x4, #0x0 // #0 0x00000000000006e0 <+48>: ccmp x0, x2, #0x2, hi // hi = pmore 0x00000000000006e4 <+52>: b.cc 0x72c <main+124> // b.lo, b.ul, b.last 0x00000000000006e8 <+56>: cmp x2, x0 0x00000000000006ec <+60>: b.cc 0x720 <main+112> // b.lo, b.ul, b.last 0x00000000000006f0 <+64>: mov x1, x2 0x00000000000006f4 <+68>: mov x2, #0x2 // #2 0x00000000000006f8 <+72>: bl 0x650 <memcpy@plt> 0x00000000000006fc <+76>: ldr x19, [x19, #4024] 0x0000000000000700 <+80>: ldrh w0, [sp, #38] 0x0000000000000704 <+84>: ldr x2, [sp, #40] 0x0000000000000708 <+88>: ldr x1, [x19] 0x000000000000070c <+92>: eor x1, x2, x1 0x0000000000000710 <+96>: cbnz x1, 0x730 <main+128> 0x0000000000000714 <+100>: ldr x19, [sp, #16] 0x0000000000000718 <+104>: ldp x29, x30, [sp], #48 0x000000000000071c <+108>: ret 0x0000000000000720 <+112>: add x1, x1, #0x3 0x0000000000000724 <+116>: cmp x0, x1 0x0000000000000728 <+120>: b.cs 0x6f0 <main+64> // b.hs, b.nlast 0x000000000000072c <+124>: brk #0x3e8 0x0000000000000730 <+128>: bl 0x670 <__stack_chk_fail@plt> End of assembler dump. + gcc -DWITHOUT_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000005e0 <+0>: ldurh w0, [x1, #1] 0x00000000000005e4 <+4>: ret End of assembler dump. 
+ gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000005e0 <+0>: ldurh w0, [x1, #1] 0x00000000000005e4 <+4>: ret End of assembler dump. + gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000005e0 <+0>: ldurh w0, [x1, #1] 0x00000000000005e4 <+4>: ret End of assembler dump. Output on Alpine ppc64le: $ sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (Alpine 8.2.0) 8.2.0 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + uname -m ppc64le + gcc -DWITH_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000005c0 <+0>: lhz r3,1(r4) 0x00000000000005c4 <+4>: blr 0x00000000000005c8 <+8>: .long 0x0 0x00000000000005cc <+12>: .long 0x0 0x00000000000005d0 <+16>: .long 0x0 End of assembler dump. + gcc -DWITHOUT_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000005c0 <+0>: lhz r3,1(r4) 0x00000000000005c4 <+4>: blr 0x00000000000005c8 <+8>: .long 0x0 0x00000000000005cc <+12>: .long 0x0 0x00000000000005d0 <+16>: .long 0x0 End of assembler dump. 
+ gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000005c0 <+0>: lhz r3,1(r4) 0x00000000000005c4 <+4>: blr 0x00000000000005c8 <+8>: .long 0x0 0x00000000000005cc <+12>: .long 0x0 0x00000000000005d0 <+16>: .long 0x0 End of assembler dump. + gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000005c0 <+0>: lhz r3,1(r4) 0x00000000000005c4 <+4>: blr 0x00000000000005c8 <+8>: .long 0x0 0x00000000000005cc <+12>: .long 0x0 0x00000000000005d0 <+16>: .long 0x0 End of assembler dump. Output on Alpine s390x: $ sh -x test.sh -O2 + cat + : gcc + gcc --version gcc (Alpine 8.2.0) 8.2.0 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+ uname -m s390x + gcc -DWITH_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000006b0 <+0>: stmg %r14,%r15,112(%r15) 0x00000000000006b6 <+6>: ear %r1,%a0 0x00000000000006ba <+10>: lay %r15,-176(%r15) 0x00000000000006c0 <+16>: sllg %r1,%r1,32 0x00000000000006c6 <+22>: ear %r1,%a1 0x00000000000006ca <+26>: la %r2,166(%r15) 0x00000000000006ce <+30>: mvc 168(8,%r15),40(%r1) 0x00000000000006d4 <+36>: la %r1,1(%r3) 0x00000000000006d8 <+40>: clgr %r1,%r2 0x00000000000006dc <+44>: jle 0x6e8 <main+56> 0x00000000000006e0 <+48>: la %r4,168(%r15) 0x00000000000006e4 <+52>: clgrtl %r1,%r4 0x00000000000006e8 <+56>: clgr %r1,%r2 0x00000000000006ec <+60>: jl 0x724 <main+116> 0x00000000000006f0 <+64>: lgr %r3,%r1 0x00000000000006f4 <+68>: lghi %r4,2 0x00000000000006f8 <+72>: brasl %r14,0x5f0 <memcpy@plt> 0x00000000000006fe <+78>: ear %r1,%a0 0x0000000000000702 <+82>: llgh %r2,166(%r15) 0x0000000000000708 <+88>: sllg %r1,%r1,32 0x000000000000070e <+94>: ear %r1,%a1 0x0000000000000712 <+98>: clc 168(8,%r15),40(%r1) 0x0000000000000718 <+104>: jne 0x730 <main+128> 0x000000000000071c <+108>: lmg %r14,%r15,288(%r15) 0x0000000000000722 <+114>: br %r14 0x0000000000000724 <+116>: la %r3,3(%r3) 0x0000000000000728 <+120>: clgrtl %r2,%r3 0x000000000000072c <+124>: j 0x6f0 <main+64> 0x0000000000000730 <+128>: brasl %r14,0x630 <__stack_chk_fail@plt> End of assembler dump. + gcc -DWITHOUT_MEMCPY '-D_FORTIFY_SOURCE=2' -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000005f8 <+0>: llgh %r2,1(%r3) 0x00000000000005fe <+6>: br %r14 End of assembler dump. 
+ gcc -DWITH_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITH_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITH_MEMCPY Dump of assembler code for function main: 0x00000000000005f8 <+0>: llgh %r2,1(%r3) 0x00000000000005fe <+6>: br %r14 End of assembler dump. + gcc -DWITHOUT_MEMCPY -U_FORTIFY_SOURCE -O2 -o testcase-DWITHOUT_MEMCPY testcase.c + gdb --batch -ex 'disas main' ./testcase-DWITHOUT_MEMCPY Dump of assembler code for function main: 0x00000000000005f8 <+0>: llgh %r2,1(%r3) 0x00000000000005fe <+6>: br %r14 End of assembler dump. ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 10:24 ` Natanael Copa @ 2019-02-25 10:34 ` Peter Maydell 2019-02-25 12:15 ` Fernando Casas Schössow 2019-02-25 12:21 ` Natanael Copa 0 siblings, 2 replies; 55+ messages in thread From: Peter Maydell @ 2019-02-25 10:34 UTC (permalink / raw) To: Natanael Copa Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Mon, 25 Feb 2019 at 10:24, Natanael Copa <ncopa@alpinelinux.org> wrote: > > On Sat, 23 Feb 2019 16:18:15 +0000 > Peter Maydell <peter.maydell@linaro.org> wrote: > > > On Sat, 23 Feb 2019 at 16:05, Natanael Copa <ncopa@alpinelinux.org> wrote: > > > I was thinking of something in the lines of: > > > > > > typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t; > > > static inline int lduw_he_p(const void *ptr) > > > { > > > volatile_uint16_t r = *(volatile_uint16_t*)ptr; > > > return r; > > > } > > > > This won't correctly handle accesses with unaligned pointers, > > I'm afraid. We rely on these functions correctly working > > with pointers that are potentially unaligned. > > Well, current code does not even handle access with aligned pointers, > depending on FORTIFY_SOURCE implementation. It correctly handles aligned and unaligned pointers for the API guarantees that the function in QEMU provides, which is to say "definitely works on any kind of aligned or unaligned pointer, not guaranteed to be atomic". Unfortunately some code in QEMU is implicitly assuming it is atomic, which this QEMU function does not guarantee -- it just happens to provide that most of the time. > My thinking here is that we depend on assumption that compiler will > remove the memcpy call, so maybe find other way to generate same > assembly, while still depend on compiler optimization. > I did some tests and compared the assembly output. The compiler will > generate same assembly if volatile is not used. 
The attribute > __may_alias__ does not seem to make any difference. So the following > function generates exactly same assembly: > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r = *(uint16_t*)ptr; > return r; > } This still is not guaranteed to work on unaligned pointers. (For instance it probably won't work on SPARC hosts.) More generally there is no way to have a single function that is guaranteed to handle unaligned pointers and also guaranteed to produce an atomic access on all host architectures that we support, because not all host architectures allow you to do both with the same instruction sequence. So you can't just change this function to make it provide atomic access without breaking the other guarantee it is providing. The long term fix for this is that we need to separate out our APIs so we can have a family of functions that guarantee to work on unaligned pointers, and a family that guarantee to work atomically, and calling code can use the one that provides the semantics it requires. The short term fix is to fix your toolchain/compilation environment options so that it isn't trying to override the definition of memcpy(). thanks -- PMM ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 10:34 ` Peter Maydell @ 2019-02-25 12:15 ` Fernando Casas Schössow 2019-02-25 12:21 ` Natanael Copa 1 sibling, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-25 12:15 UTC (permalink / raw) To: Peter Maydell Cc: Natanael Copa, Stefan Hajnoczi, Paolo Bonzini, qemu-devel, Richard Henderson I’m running the test guest on another host for the last three days and so far so good. Yet because of the nature of this bug/issue it can take a few hours or a few days more to fail. The failure is unpredictable. Does it make sense to continue running the guest on this different host to try to repro or can I migrate the guest back to the original host? At this point I don’t think it will make any difference. The current host storage is far more slower causing latency issues to the guest so if I can migrate it back it would be great. Sent from my iPhone > On 25 Feb 2019, at 11:34, Peter Maydell <peter.maydell@linaro.org> wrote: > >> On Mon, 25 Feb 2019 at 10:24, Natanael Copa <ncopa@alpinelinux.org> wrote: >> >> On Sat, 23 Feb 2019 16:18:15 +0000 >> Peter Maydell <peter.maydell@linaro.org> wrote: >> >>>> On Sat, 23 Feb 2019 at 16:05, Natanael Copa <ncopa@alpinelinux.org> wrote: >>>> I was thinking of something in the lines of: >>>> >>>> typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t; >>>> static inline int lduw_he_p(const void *ptr) >>>> { >>>> volatile_uint16_t r = *(volatile_uint16_t*)ptr; >>>> return r; >>>> } >>> >>> This won't correctly handle accesses with unaligned pointers, >>> I'm afraid. We rely on these functions correctly working >>> with pointers that are potentially unaligned. >> >> Well, current code does not even handle access with aligned pointers, >> depending on FORTIFY_SOURCE implementation. 
> > It correctly handles aligned and unaligned pointers for the > API guarantees that the function in QEMU provides, which is > to say "definitely works on any kind of aligned or unaligned > pointer, not guaranteed to be atomic". Unfortunately some > code in QEMU is implicitly assuming it is atomic, which this > QEMU function does not guarantee -- it just happens to provide > that most of the time. > >> My thinking here is that we depend on assumption that compiler will >> remove the memcpy call, so maybe find other way to generate same >> assembly, while still depend on compiler optimization. > >> I did some tests and compared the assembly output. The compiler will >> generate same assembly if volatile is not used. The attribute >> __may_alias__ does not seem to make any difference. So the following >> function generates exactly same assembly: >> >> static inline int lduw_he_p(const void *ptr) >> { >> uint16_t r = *(uint16_t*)ptr; >> return r; >> } > > This still is not guaranteed to work on unaligned pointers. > (For instance it probably won't work on SPARC hosts.) > More generally there is no way to have a single function > that is guaranteed to handle unaligned pointers and also > guaranteed to produce an atomic access on all host architectures > that we support, because not all host architectures allow > you to do both with the same instruction sequence. So > you can't just change this function to make it provide > atomic access without breaking the other guarantee it is > providing. > > The long term fix for this is that we need to separate out > our APIs so we can have a family of functions that guarantee > to work on unaligned pointers, and a family that guarantee > to work atomically, and calling code can use the one that > provides the semantics it requires. > > The short term fix is to fix your toolchain/compilation > environment options so that it isn't trying to override > the definition of memcpy(). 
> > thanks > -- PMM ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 10:34 ` Peter Maydell 2019-02-25 12:15 ` Fernando Casas Schössow @ 2019-02-25 12:21 ` Natanael Copa 2019-02-25 13:06 ` Peter Maydell 1 sibling, 1 reply; 55+ messages in thread From: Natanael Copa @ 2019-02-25 12:21 UTC (permalink / raw) To: Peter Maydell Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Mon, 25 Feb 2019 10:34:23 +0000 Peter Maydell <peter.maydell@linaro.org> wrote: > On Mon, 25 Feb 2019 at 10:24, Natanael Copa <ncopa@alpinelinux.org> wrote: > > > > On Sat, 23 Feb 2019 16:18:15 +0000 > > Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > On Sat, 23 Feb 2019 at 16:05, Natanael Copa <ncopa@alpinelinux.org> wrote: > > > > I was thinking of something in the lines of: > > > > > > > > typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t; > > > > static inline int lduw_he_p(const void *ptr) > > > > { > > > > volatile_uint16_t r = *(volatile_uint16_t*)ptr; > > > > return r; > > > > } > > > > > > This won't correctly handle accesses with unaligned pointers, > > > I'm afraid. We rely on these functions correctly working > > > with pointers that are potentially unaligned. > > > > Well, current code does not even handle access with aligned pointers, > > depending on FORTIFY_SOURCE implementation. > > It correctly handles aligned and unaligned pointers for the > API guarantees that the function in QEMU provides, which is > to say "definitely works on any kind of aligned or unaligned > pointer, not guaranteed to be atomic". Unfortunately some > code in QEMU is implicitly assuming it is atomic, which this > QEMU function does not guarantee -- it just happens to provide > that most of the time. > > > My thinking here is that we depend on assumption that compiler will > > remove the memcpy call, so maybe find other way to generate same > > assembly, while still depend on compiler optimization. 
> > > I did some tests and compared the assembly output. The compiler will > > generate same assembly if volatile is not used. The attribute > > __may_alias__ does not seem to make any difference. So the following > > function generates exactly same assembly: > > > > static inline int lduw_he_p(const void *ptr) > > { > > uint16_t r = *(uint16_t*)ptr; > > return r; > > } > > This still is not guaranteed to work on unaligned pointers. > (For instance it probably won't work on SPARC hosts.) > More generally there is no way to have a single function > that is guaranteed to handle unaligned pointers and also > guaranteed to produce an atomic access on all host architectures > that we support, because not all host architectures allow > you to do both with the same instruction sequence. So > you can't just change this function to make it provide > atomic access without breaking the other guarantee it is > providing. Right, so it is possible that there are other architectures (like SPARC) that suffer from the same race here as Alpine, because memcpy is not guaranteed to be atomic. > The long term fix for this is that we need to separate out > our APIs so we can have a family of functions that guarantee > to work on unaligned pointers, and a family that guarantee > to work atomically, and calling code can use the one that > provides the semantics it requires. > > The short term fix is to fix your toolchain/compilation > environment options so that it isn't trying to override > the definition of memcpy(). The easiest workaround is to simply disable FORTIFY_SOURCE, but that will weaken the security for all implemented string functions, strcpy, memmove etc, so I don't want to do that. Is it only lduw_he_p that needs to be atomic or are the other functions in include/qemu/bswap.h using memcpy also required to be atomic? I intend to replace memcpy with __builtin_memcpy there. > > thanks > -- PMM ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 12:21 ` Natanael Copa @ 2019-02-25 13:06 ` Peter Maydell 2019-02-25 13:25 ` Natanael Copa 2019-02-28 9:58 ` Peter Maydell 0 siblings, 2 replies; 55+ messages in thread From: Peter Maydell @ 2019-02-25 13:06 UTC (permalink / raw) To: Natanael Copa Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Mon, 25 Feb 2019 at 12:22, Natanael Copa <ncopa@alpinelinux.org> wrote: > > On Mon, 25 Feb 2019 10:34:23 +0000 > Peter Maydell <peter.maydell@linaro.org> wrote: > > The short term fix is to fix your toolchain/compilation > > environment options so that it isn't trying to override > > the definition of memcpy(). > > The easiest workaround is to simply disable FORTIY_SOURCE, but that > will weaken the security for all implemented string functions, strcpy, > memmove etc, so I don't want to do that. > > Is it only lduw_he_p that needs to be atomic or are the other functions > in include/qemu/bswap.h using memcpy also required to be atomic? Hard to say, since we haven't done the "audit all the callers" step that Stefan mentioned. If you're going to replace memcpy with __builtin_memcpy then the safest thing is to do it for all those uses (this will also give you much better generated code for performance purposes). thanks -- PMM ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 13:06 ` Peter Maydell @ 2019-02-25 13:25 ` Natanael Copa 2019-02-25 13:32 ` Fernando Casas Schössow 2019-02-28 9:58 ` Peter Maydell 1 sibling, 1 reply; 55+ messages in thread From: Natanael Copa @ 2019-02-25 13:25 UTC (permalink / raw) To: Peter Maydell Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Mon, 25 Feb 2019 13:06:16 +0000 Peter Maydell <peter.maydell@linaro.org> wrote: > On Mon, 25 Feb 2019 at 12:22, Natanael Copa <ncopa@alpinelinux.org> wrote: > > > > On Mon, 25 Feb 2019 10:34:23 +0000 > > Peter Maydell <peter.maydell@linaro.org> wrote: > > > The short term fix is to fix your toolchain/compilation > > > environment options so that it isn't trying to override > > > the definition of memcpy(). > > > > The easiest workaround is to simply disable FORTIY_SOURCE, but that > > will weaken the security for all implemented string functions, strcpy, > > memmove etc, so I don't want to do that. > > > > Is it only lduw_he_p that needs to be atomic or are the other functions > > in include/qemu/bswap.h using memcpy also required to be atomic? > > Hard to say, since we haven't done the "audit all the callers" > step that Stefan mentioned. If you're going to replace memcpy > with __builtin_memcpy then the safest thing is to do it for > all those uses (this will also give you much better generated > code for performance purposes). I figured that and that is exactly what I did. Fernando: Can you please test the binary from qemu-system-x86_64-3.1.0-r3 from alpine edge? I will backport the fix if you can confirm it fixes the problem. Thanks! -nc PS. Those issues are pretty hard to track down, so big thanks to everyone who helped find the exact issue here. You have done a great work! ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 13:25 ` Natanael Copa @ 2019-02-25 13:32 ` Fernando Casas Schössow [not found] ` <VI1PR0602MB3245A6B693B23DA2E0E8E500A47A0@VI1PR0602MB3245.eurprd06.prod.outlo ok.com> 0 siblings, 1 reply; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-25 13:32 UTC (permalink / raw) To: Natanael Copa Cc: Peter Maydell, Stefan Hajnoczi, Paolo Bonzini, qemu-devel, Richard Henderson Thanks Natanael. Is the new package ready? I will update as soon as the package is available, try to repro and report back. Thanks everyone for looking into this! On lun, feb 25, 2019 at 2:25 PM, Natanael Copa <ncopa@alpinelinux.org> wrote: On Mon, 25 Feb 2019 13:06:16 +0000 Peter Maydell <peter.maydell@linaro.org<mailto:peter.maydell@linaro.org>> wrote: On Mon, 25 Feb 2019 at 12:22, Natanael Copa <ncopa@alpinelinux.org<mailto:ncopa@alpinelinux.org>> wrote: > > On Mon, 25 Feb 2019 10:34:23 +0000 > Peter Maydell <peter.maydell@linaro.org<mailto:peter.maydell@linaro.org>> wrote: > > The short term fix is to fix your toolchain/compilation > > environment options so that it isn't trying to override > > the definition of memcpy(). > > The easiest workaround is to simply disable FORTIY_SOURCE, but that > will weaken the security for all implemented string functions, strcpy, > memmove etc, so I don't want to do that. > > Is it only lduw_he_p that needs to be atomic or are the other functions > in include/qemu/bswap.h using memcpy also required to be atomic? Hard to say, since we haven't done the "audit all the callers" step that Stefan mentioned. If you're going to replace memcpy with __builtin_memcpy then the safest thing is to do it for all those uses (this will also give you much better generated code for performance purposes). I figured that and that is exactly what I did. Fernando: Can you please test the binary from qemu-system-x86_64-3.1.0-r3 from alpine edge? 
I will backport the fix if you can confirm it fixes the problem. Thanks! -nc PS. Those issues are pretty hard to track down, so big thanks to everyone who helped find the exact issue here. You have done a great work! ^ permalink raw reply [flat|nested] 55+ messages in thread
[parent not found: <VI1PR0602MB3245A6B693B23DA2E0E8E500A47A0@VI1PR0602MB3245.eurprd06.prod.outlook.com>]
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error [not found] ` <VI1PR0602MB3245A6B693B23DA2E0E8E500A47A0@VI1PR0602MB3245.eurprd06.prod.outlo ok.com> @ 2019-02-25 15:41 ` Fernando Casas Schössow 0 siblings, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-02-25 15:41 UTC (permalink / raw) To: Natanael Copa Cc: Peter Maydell, Stefan Hajnoczi, Paolo Bonzini, qemu-devel, Richard Henderson Ok, the new package is deployed. I stopped and started the test guest so it will use the new QEMU binary. Will monitor and report back. On lun, feb 25, 2019 at 2:32 PM, Fernando Casas Schössow <casasfernando@outlook.com> wrote: Thanks Natanael. Is the new package ready? I will update as soon as the package is available, try to repro and report back. Thanks everyone for looking into this! On lun, feb 25, 2019 at 2:25 PM, Natanael Copa <ncopa@alpinelinux.org> wrote: On Mon, 25 Feb 2019 13:06:16 +0000 Peter Maydell <peter.maydell@linaro.org<mailto:peter.maydell@linaro.org>> wrote: On Mon, 25 Feb 2019 at 12:22, Natanael Copa <ncopa@alpinelinux.org<mailto:ncopa@alpinelinux.org>> wrote: > > On Mon, 25 Feb 2019 10:34:23 +0000 > Peter Maydell <peter.maydell@linaro.org<mailto:peter.maydell@linaro.org>> wrote: > > The short term fix is to fix your toolchain/compilation > > environment options so that it isn't trying to override > > the definition of memcpy(). > > The easiest workaround is to simply disable FORTIY_SOURCE, but that > will weaken the security for all implemented string functions, strcpy, > memmove etc, so I don't want to do that. > > Is it only lduw_he_p that needs to be atomic or are the other functions > in include/qemu/bswap.h using memcpy also required to be atomic? Hard to say, since we haven't done the "audit all the callers" step that Stefan mentioned. 
If you're going to replace memcpy with __builtin_memcpy then the safest thing is to do it for all those uses (this will also give you much better generated code for performance purposes). I figured that and that is exactly what I did. Fernando: Can you please test the binary from qemu-system-x86_64-3.1.0-r3 from alpine edge? I will backport the fix if you can confirm it fixes the problem. Thanks! -nc PS. Those issues are pretty hard to track down, so big thanks to everyone who helped find the exact issue here. You have done a great work! ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-25 13:06 ` Peter Maydell 2019-02-25 13:25 ` Natanael Copa @ 2019-02-28 9:58 ` Peter Maydell 2019-03-07 7:14 ` Fernando Casas Schössow 1 sibling, 1 reply; 55+ messages in thread From: Peter Maydell @ 2019-02-28 9:58 UTC (permalink / raw) To: Natanael Copa Cc: Stefan Hajnoczi, Paolo Bonzini, Fernando Casas Schössow, qemu-devel, Richard Henderson On Mon, 25 Feb 2019 at 13:06, Peter Maydell <peter.maydell@linaro.org> wrote: > > On Mon, 25 Feb 2019 at 12:22, Natanael Copa <ncopa@alpinelinux.org> wrote: > > > > On Mon, 25 Feb 2019 10:34:23 +0000 > > Peter Maydell <peter.maydell@linaro.org> wrote: > > > The short term fix is to fix your toolchain/compilation > > > environment options so that it isn't trying to override > > > the definition of memcpy(). > > > > The easiest workaround is to simply disable FORTIY_SOURCE, but that > > will weaken the security for all implemented string functions, strcpy, > > memmove etc, so I don't want to do that. > > > > Is it only lduw_he_p that needs to be atomic or are the other functions > > in include/qemu/bswap.h using memcpy also required to be atomic? > > Hard to say, since we haven't done the "audit all the callers" > step that Stefan mentioned. If you're going to replace memcpy > with __builtin_memcpy then the safest thing is to do it for > all those uses (this will also give you much better generated > code for performance purposes). 
I mentioned this to Richard on IRC the other day, but I think there's a good argument for making upstream QEMU use __builtin_memcpy in these helpers: * it makes us robust against things like this fortify library in the short term (until we fix QEMU to have an equivalent set of functions for doing atomic accesses to AddressSpaces) * in the longer term it will mean that we don't end up with these functions being really badly-performing even if the semantics of the out-of-line memcpy() are correct thanks -- PMM ^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error 2019-02-28 9:58 ` Peter Maydell @ 2019-03-07 7:14 ` Fernando Casas Schössow 0 siblings, 0 replies; 55+ messages in thread From: Fernando Casas Schössow @ 2019-03-07 7:14 UTC (permalink / raw) To: Peter Maydell Cc: Natanael Copa, Stefan Hajnoczi, Paolo Bonzini, qemu-devel, Richard Henderson After two weeks of running the new QEMU package everything is fine. Moreover I migrated the rest of the guests (both Windows and Linux) to virtio-scsi and no issues so far. I will monitor for another week but this issue seems pretty much fixed! Kudos to each and everyone of you that helped finding and solving this issue. Fer On jue, feb 28, 2019 at 10:58 AM, Peter Maydell <peter.maydell@linaro.org> wrote: On Mon, 25 Feb 2019 at 13:06, Peter Maydell <peter.maydell@linaro.org<mailto:peter.maydell@linaro.org>> wrote: On Mon, 25 Feb 2019 at 12:22, Natanael Copa <ncopa@alpinelinux.org<mailto:ncopa@alpinelinux.org>> wrote: > > On Mon, 25 Feb 2019 10:34:23 +0000 > Peter Maydell <peter.maydell@linaro.org<mailto:peter.maydell@linaro.org>> wrote: > > The short term fix is to fix your toolchain/compilation > > environment options so that it isn't trying to override > > the definition of memcpy(). > > The easiest workaround is to simply disable FORTIY_SOURCE, but that > will weaken the security for all implemented string functions, strcpy, > memmove etc, so I don't want to do that. > > Is it only lduw_he_p that needs to be atomic or are the other functions > in include/qemu/bswap.h using memcpy also required to be atomic? Hard to say, since we haven't done the "audit all the callers" step that Stefan mentioned. If you're going to replace memcpy with __builtin_memcpy then the safest thing is to do it for all those uses (this will also give you much better generated code for performance purposes). 
I mentioned this to Richard on IRC the other day, but I think there's a good argument for making upstream QEMU use __builtin_memcpy in these helpers:

* it makes us robust against things like this fortify library in the short term (until we fix QEMU to have an equivalent set of functions for doing atomic accesses to AddressSpaces)
* in the longer term it will mean that we don't end up with these functions being really badly-performing even if the semantics of the out-of-line memcpy() are correct

thanks
-- PMM

^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error
  2019-02-23 15:55 ` Natanael Copa
  2019-02-23 16:18 ` Peter Maydell
@ 2019-02-23 16:21 ` Fernando Casas Schössow
  2019-02-25 10:30 ` Stefan Hajnoczi
  2 siblings, 0 replies; 55+ messages in thread
From: Fernando Casas Schössow @ 2019-02-23 16:21 UTC (permalink / raw)
To: Natanael Copa
Cc: Stefan Hajnoczi, Paolo Bonzini, Richard Henderson, qemu-devel

Hi Natanael,

The package information is the following:
https://pkgs.alpinelinux.org/package/v3.9/main/x86_64/qemu-system-x86_64

The binary is for x86_64 architecture and dynamically linked:

fernando@vmsvr01:~$ file /usr/bin/qemu-system-x86_64
/usr/bin/qemu-system-x86_64: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, stripped

Thanks.

On Sat, Feb 23, 2019 at 4:55 PM, Natanael Copa <ncopa@alpinelinux.org> wrote:

On Fri, 22 Feb 2019 14:04:20 +0000 Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow <casasfernando@outlook.com> wrote:
> I have CCed Natanael Copa, qemu package maintainer in Alpine Linux.
> Fernando: Can you confirm that the bug occurs with an unmodified Alpine Linux qemu binary?

I wonder exactly which binary it is. What architecture (x86 or x86_64) and what binary? Is it a dynamically or statically linked qemu-system-x86_64?

^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error
  2019-02-23 15:55 ` Natanael Copa
  2019-02-23 16:18 ` Peter Maydell
  2019-02-23 16:21 ` Fernando Casas Schössow
@ 2019-02-25 10:30 ` Stefan Hajnoczi
  2019-02-25 10:33 ` Stefan Hajnoczi
  2 siblings, 1 reply; 55+ messages in thread
From: Stefan Hajnoczi @ 2019-02-25 10:30 UTC (permalink / raw)
To: Natanael Copa
Cc: Fernando Casas Schössow, Paolo Bonzini, Richard Henderson, qemu-devel

On Sat, Feb 23, 2019 at 3:55 PM Natanael Copa <ncopa@alpinelinux.org> wrote:
> On Fri, 22 Feb 2019 14:04:20 +0000
> Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow
> > <casasfernando@outlook.com> wrote:
> I tried to find this section. How do you get the assembly listing of
> the relevant section? I tried to do "disas virtqueue_pop" from
> `gdb /usr/bin/qemu-system-x86_64` on the binary in alpine edge. I
> could find 2 memcpy calls but none of them looks like a 16-bit operation after:
>
> 0x00000000004551f1 <+353>:  mov    0x10(%rsp),%rdi
> 0x00000000004551f6 <+358>:  mov    $0x10,%edx
> 0x00000000004551fb <+363>:  callq  0x3879e0 <memcpy@plt>
> 0x0000000000455200 <+368>:  movzwl 0x5c(%rsp),%eax
> 0x0000000000455205 <+373>:  test   $0x4,%al
> 0x0000000000455207 <+375>:  je     0x4552aa <virtqueue_pop+538>
>
> ....
>
> 0x0000000000455291 <+513>:  mov    0x10(%rsp),%rdi
> 0x0000000000455296 <+518>:  mov    $0x10,%edx
> 0x000000000045529b <+523>:  callq  0x3879e0 <memcpy@plt>
> 0x00000000004552a0 <+528>:  mov    %rbp,0x20(%rsp)
> 0x00000000004552a5 <+533>:  movzwl 0x5c(%rsp),%eax
> 0x00000000004552aa <+538>:  lea    0x20e0(%rsp),%rdi
> 0x00000000004552b2 <+546>:  xor    %r11d,%r11d
> 0x00000000004552b5 <+549>:  mov    %r15,0x38(%rsp)

Here is the beginning of the function:

0000000000455090 <virtqueue_pop@@Base>:
  455090:  41 57                 push   %r15
  455092:  49 89 ff              mov    %rdi,%r15
  455095:  b9 0e 00 00 00        mov    $0xe,%ecx
  45509a:  41 56                 push   %r14
  45509c:  41 55                 push   %r13
  45509e:  41 54                 push   %r12
  4550a0:  55                    push   %rbp
  4550a1:  53                    push   %rbx
  4550a2:  48 81 ec f8 60 00 00  sub    $0x60f8,%rsp
  4550a9:  4d 8b 67 58           mov    0x58(%r15),%r12
  4550ad:  48 89 74 24 08        mov    %rsi,0x8(%rsp)
  4550b2:  48 8d 6c 24 70        lea    0x70(%rsp),%rbp
  4550b7:  48 89 ef              mov    %rbp,%rdi
  4550ba:  64 48 8b 04 25 28 00  mov    %fs:0x28,%rax
  4550c1:  00 00
  4550c3:  48 89 84 24 e8 60 00  mov    %rax,0x60e8(%rsp)
  4550ca:  00
  4550cb:  31 c0                 xor    %eax,%eax
  4550cd:  41 80 bc 24 6b 01 00  cmpb   $0x0,0x16b(%r12)
  4550d4:  00 00
  4550d6:  f3 48 ab              rep stos %rax,%es:(%rdi)
  4550d9:  0f 85 c1 04 00 00     jne    4555a0 <virtqueue_pop@@Base+0x510>
  4550df:  48 8b 1d f2 4e 99 00  mov    0x994ef2(%rip),%rbx  # de9fd8 <rcu_reader@@Base+0xde9f20>
  4550e6:  64 8b 43 0c           mov    %fs:0xc(%rbx),%eax
  4550ea:  8d 50 01              lea    0x1(%rax),%edx
  4550ed:  64 89 53 0c           mov    %edx,%fs:0xc(%rbx)
  4550f1:  85 c0                 test   %eax,%eax
  4550f3:  0f 84 d7 03 00 00     je     4554d0 <virtqueue_pop@@Base+0x440>
  4550f9:  49 83 7f 18 00        cmpq   $0x0,0x18(%r15)
  4550fe:  0f 84 7d 03 00 00     je     455481 <virtqueue_pop@@Base+0x3f1>
  455104:  41 0f b7 47 30        movzwl 0x30(%r15),%eax
  455109:  66 41 39 47 32        cmp    %ax,0x32(%r15)
  45510e:  0f 84 0c 04 00 00     je     455520 <virtqueue_pop@@Base+0x490>

This last branch leads to:

  455520:  4c 89 ff              mov    %r15,%rdi
  455523:  e8 78 b4 ff ff        callq  4509a0 <virtio_queue_host_notifier_read@@Base+0x6d0>
  455528:  48 83 b8 90 00 00 00  cmpq   $0x3,0x90(%rax)
  45552f:  03
  455530:  0f 86 98 03 00 00     jbe    4558ce <virtqueue_pop@@Base+0x83e>
  455536:  48 8b 90 80 00 00 00  mov    0x80(%rax),%rdx
  45553d:  48 85 d2              test   %rdx,%rdx
  455540:  0f 84 c2 02 00 00     je     455808 <virtqueue_pop@@Base+0x778>
  455546:  48 8d 72 02           lea    0x2(%rdx),%rsi
  45554a:  48 8d 7c 24 42        lea    0x42(%rsp),%rdi
  45554f:  48 39 fe              cmp    %rdi,%rsi
  455552:  76 09                 jbe    45555d <virtqueue_pop@@Base+0x4cd>
  455554:  48 8d 47 02           lea    0x2(%rdi),%rax
  455558:  48 39 c6              cmp    %rax,%rsi
  45555b:  72 3c                 jb     455599 <virtqueue_pop@@Base+0x509>
  45555d:  48 39 fe              cmp    %rdi,%rsi
  455560:  72 2e                 jb     455590 <virtqueue_pop@@Base+0x500>
  455562:  ba 02 00 00 00        mov    $0x2,%edx
  455567:  e8 74 24 f3 ff        callq  3879e0 <memcpy@plt>

> I can test different CFLAGS with and without the _FORTIFY_SOURCE and
> with different variants of memcpy (like __builtin_memcpy etc) but I
> need to find a way to get the correct assembly output so I know if/when I
> have found something that works.

Good. I think tweaking the compilation flags is the quickest workaround to produce a working QEMU binary for Alpine. Other distros don't seem to hit this problem.

In the longer term QEMU will need to fix these memory accessor functions in the way that Peter described, but doing that properly requires auditing all the code so each instance that relies on atomicity can be converted properly (and the other ones will stay non-atomic because they must support misaligned addresses).

Stefan

^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error
  2019-02-25 10:30 ` Stefan Hajnoczi
@ 2019-02-25 10:33 ` Stefan Hajnoczi
  0 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2019-02-25 10:33 UTC (permalink / raw)
To: Natanael Copa
Cc: Fernando Casas Schössow, Paolo Bonzini, Richard Henderson, qemu-devel

On Mon, Feb 25, 2019 at 10:30 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Sat, Feb 23, 2019 at 3:55 PM Natanael Copa <ncopa@alpinelinux.org> wrote:
> > On Fri, 22 Feb 2019 14:04:20 +0000
> > Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > > On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow
> > > <casasfernando@outlook.com> wrote:
> > I tried to find this section. How do you get the assembly listing of
> > the relevant section? I tried to do "disas virtqueue_pop" from
> > `gdb /usr/bin/qemu-system-x86_64` on the binary in alpine edge. I
> > could find 2 memcpy calls but none of them looks like a 16-bit operation after:
> >
> > 0x00000000004551f1 <+353>:  mov    0x10(%rsp),%rdi
> > 0x00000000004551f6 <+358>:  mov    $0x10,%edx
> > 0x00000000004551fb <+363>:  callq  0x3879e0 <memcpy@plt>
> > 0x0000000000455200 <+368>:  movzwl 0x5c(%rsp),%eax
> > 0x0000000000455205 <+373>:  test   $0x4,%al
> > 0x0000000000455207 <+375>:  je     0x4552aa <virtqueue_pop+538>
> >
> > ....
> >
> > 0x0000000000455291 <+513>:  mov    0x10(%rsp),%rdi
> > 0x0000000000455296 <+518>:  mov    $0x10,%edx
> > 0x000000000045529b <+523>:  callq  0x3879e0 <memcpy@plt>
> > 0x00000000004552a0 <+528>:  mov    %rbp,0x20(%rsp)
> > 0x00000000004552a5 <+533>:  movzwl 0x5c(%rsp),%eax
> > 0x00000000004552aa <+538>:  lea    0x20e0(%rsp),%rdi
> > 0x00000000004552b2 <+546>:  xor    %r11d,%r11d
> > 0x00000000004552b5 <+549>:  mov    %r15,0x38(%rsp)
>
> Here is the beginning of the function:

This was built from the docker alpine image:

REPOSITORY          TAG      IMAGE ID       CREATED       SIZE
docker.io/alpine    latest   caf27325b298   3 weeks ago   5.52 MB

aports git commit: be41538f0061b406a374564a0043a363efcb0293
gcc (Alpine 8.2.0) 8.2.0

Stefan

^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error
  2019-02-22 14:04 ` Stefan Hajnoczi
  ` (4 preceding siblings ...)
  2019-02-23 15:55 ` Natanael Copa
@ 2019-02-23 16:57 ` Peter Maydell
  5 siblings, 0 replies; 55+ messages in thread
From: Peter Maydell @ 2019-02-23 16:57 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Fernando Casas Schössow, Paolo Bonzini, Richard Henderson, qemu-devel, Natanael Copa

On Fri, 22 Feb 2019 at 14:07, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add
> host endian unaligned access functions") introduced the unaligned
> memory access functions in question here. Please see below for
> details on the bug - basically QEMU code assumes they are atomic, but
> that is not guaranteed :(. Any ideas for how to fix this?

I suspect we want a separate family of access functions for "I guarantee this will be an aligned access and I need the atomicity". (The other place where we've talked about needing the atomicity is in emulation of page-table walks, where you need the page table loads to be atomic w.r.t. other CPU threads, especially in the case where you need to emulate a hardware update of a dirty/access bit in the page table entry.)

Mostly this hasn't bitten us before because any sensible compiler will turn the memcpy into a straight load on most common hosts, which will be atomic (but accidentally rather than on purpose).

thanks
-- PMM

^ permalink raw reply [flat|nested] 55+ messages in thread
end of thread, other threads:[~2019-03-07  7:14 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-14 21:56 [Qemu-devel] Guest unresponsive after Virtqueue size exceeded error Fernando Casas Schössow
2017-06-16  6:58 ` Ladi Prosek
2017-06-16 10:11 ` Fernando Casas Schössow
2017-06-16 10:25 ` Ladi Prosek
2017-06-19 22:10 ` Fernando Casas Schössow
2017-06-20  5:59 ` Ladi Prosek
2017-06-20  6:30 ` Fernando Casas Schössow
2017-06-20  7:52 ` Ladi Prosek
2017-06-21 12:19 ` Fernando Casas Schössow
2017-06-22  7:43 ` Ladi Prosek
2017-06-23  6:29 ` Fernando Casas Schössow
[not found] ` <1498199343.2815.0@smtp-mail.outlook.com>
2017-06-24  8:34 ` Fernando Casas Schössow
2019-01-31 11:32 ` Fernando Casas Schössow
2019-02-01  5:48 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2019-02-01  8:17 ` Fernando Casas Schössow
2019-02-04  6:06 ` Stefan Hajnoczi
2019-02-04  7:24 ` Fernando Casas Schössow
[not found] ` <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlook.com>
2019-02-06  7:15 ` Fernando Casas Schössow
[not found] ` <AM5PR0602MB32368CB5ADDEC05F42D8BC8FA46D0@AM5PR0602MB3236.eurprd06.prod.outlo>
[not found] ` <VI1PR0602MB3245032D51A5DF45AF6E1952A46F0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
2019-02-06 16:47 ` Fernando Casas Schössow
2019-02-11  3:17 ` Stefan Hajnoczi
2019-02-11  9:48 ` Fernando Casas Schössow
2019-02-18  7:21 ` Fernando Casas Schössow
[not found] ` <VI1PR0602MB3245424120D151F29884A7E2A4630@VI1PR0602MB3245.eurprd06.prod.outlook.com>
2019-02-19  7:26 ` Fernando Casas Schössow
2019-02-20 16:58 ` Stefan Hajnoczi
2019-02-20 17:53 ` Paolo Bonzini
2019-02-20 18:56 ` Fernando Casas Schössow
2019-02-21 11:11 ` Stefan Hajnoczi
2019-02-21 11:33 ` Fernando Casas Schössow
[not found] ` <VI1PR0602MB3245593855B029B427ED544FA47E0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
[not found] ` <CAJSP0QUs9Yz2-k1KyVMwpgx6RwY9cK7qdQRCQ74xmgXJPJR-qw@mail.gmail.com>
[not found] ` <VI1PR0602MB32453A8B5CBC0308C7D18F1DA47E0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
[not found] ` <CAJSP0QVxaW3tezjBN9owJHsxzE9h8_qcaeRr5zHHKxKJOeFnkQ@mail.gmail.com>
[not found] ` <CAJSP0QVXoZJ9MJ0qp4RM_m2fGJ8iFSyJMAU_X7mdiQvpOK59KA@mail.gmail.com>
[not found] ` <VI1PR0602MB324516419266A934FE7759C6A47E0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
[not found] ` <VI1PR0602MB324516419266A934FE7759C6A47E0@VI1PR0602MB3245.eurprd06.prod.outlo>
[not found] ` <VI1PR0602MB32454C17192EFA863E29CC49A47E0@VI1PR0602MB3245.eurprd06.prod.outlo>
[not found] ` <VI1PR0602MB324547F72DA9EDEB1613C888A47E0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
[not found] ` <CAJSP0QUg=cq3tCSLidQ9BR2hxAo3K6gA6LKtpx5Rjb=_6XgJ6Q@mail.gmail.com>
[not found] ` <28e6b4ed-9afd-3a79-6267-86c7385c23ce@redhat.com>
[not found] ` <VI1PR0602MB324578F91F1AF9390D03022FA47F0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
2019-02-22 14:04 ` Stefan Hajnoczi
2019-02-22 14:38 ` Paolo Bonzini
2019-02-22 14:43 ` Fernando Casas Schössow
2019-02-22 14:55 ` Paolo Bonzini
2019-02-22 15:48 ` Fernando Casas Schössow
2019-02-22 16:37 ` Dr. David Alan Gilbert
2019-02-22 16:39 ` Paolo Bonzini
2019-02-22 16:47 ` Dr. David Alan Gilbert
2019-02-23 11:49 ` Natanael Copa
2019-02-26 13:30 ` Paolo Bonzini
2019-02-28  7:35 ` Fernando Casas Schössow
2019-02-23 15:55 ` Natanael Copa
2019-02-23 16:18 ` Peter Maydell
2019-02-25 10:24 ` Natanael Copa
2019-02-25 10:34 ` Peter Maydell
2019-02-25 12:15 ` Fernando Casas Schössow
2019-02-25 12:21 ` Natanael Copa
2019-02-25 13:06 ` Peter Maydell
2019-02-25 13:25 ` Natanael Copa
2019-02-25 13:32 ` Fernando Casas Schössow
[not found] ` <VI1PR0602MB3245A6B693B23DA2E0E8E500A47A0@VI1PR0602MB3245.eurprd06.prod.outlook.com>
2019-02-25 15:41 ` Fernando Casas Schössow
2019-02-28  9:58 ` Peter Maydell
2019-03-07  7:14 ` Fernando Casas Schössow
2019-02-23 16:21 ` Fernando Casas Schössow
2019-02-25 10:30 ` Stefan Hajnoczi
2019-02-25 10:33 ` Stefan Hajnoczi
2019-02-23 16:57 ` Peter Maydell