* [Qemu-devel] q35 migration broken
@ 2016-03-31 19:03 Dr. David Alan Gilbert
       [not found] ` <20160401100623.GE2242@work-vm>
  0 siblings, 1 reply; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2016-03-31 19:03 UTC (permalink / raw)
  To: marcel, jsnow, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3458 bytes --]

Hi,
  I'm seeing a breakage on q35 migration on head (and possibly older
but certainly head; it's also on a 2.5.0 world I've got with a bunch
of patches but I've not tried a clean 2.5.0 yet).

It looks like some type of interrupt screwup; with a virtio-net device
I get a:
  BUG: soft lockup - CPU#0 stuck for 22s!
  ...  virtnet_config_changed_work 

but if I swap that out for an e1000 I get:
  Disabling IRQ #22

  and various timeouts on e1000 and cdrom (scsi).
The guest kind of limps along with an existing terminal scrolling dmesg -w output.

This is an f23 guest on a rhel7.2-ish host, with the guest sitting at an idle
(MATE) GUI.

i440fx works.

qemu     30823     1 15 14:51 ?        00:00:07 /opt/qemu-head/bin/qemu-system-x86_64 -name f23-q35 -S -machine pc-q35-2.6,accel=kvm,usb=off,vmport=off -cpu SandyBridge -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 3cc93d9b-9b87-4472-847c-25cea2bfc51f -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-12-f23-q35/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.2,addr=0x3 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x4 -drive file=/home/vms/f23-q35.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-scsi0-0-0-0,readonly=on -device scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 -netdev tap,fd=25,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:98:12:7d,bus=pci.2,addr=0x1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-12-f23-q35/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pcie.0,addr=0x1 -device intel-hda,id=sound0,bus=pci.2,addr=0x2 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x6 -msg timestamp=on

(Attaching libvirt xml that generated that lot).

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

[-- Attachment #2: f23-q35.xml --]
[-- Type: text/xml, Size: 4768 bytes --]

<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
  virsh edit f23-q35
or other application using the libvirt API.
-->

<domain type='kvm'>
  <name>f23-q35</name>
  <uuid>3cc93d9b-9b87-4472-847c-25cea2bfc51f</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-2.6'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>SandyBridge</model>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/opt/qemu-head/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/home/vms/f23-q35.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='sda' bus='scsi'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1d' function='0x2'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <model name='i82801b11-bridge'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='2'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </controller>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x04' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:98:12:7d'/>
      <source network='default'/>
      <model type='e1000'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes'/>
    <sound model='ich6'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>


* Re: [Qemu-devel] q35 migration broken
       [not found] ` <20160401100623.GE2242@work-vm>
@ 2016-04-01 15:54   ` Dr. David Alan Gilbert
  2016-04-01 16:59     ` Marcel Apfelbaum
  0 siblings, 1 reply; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-01 15:54 UTC (permalink / raw)
  To: marcel, jsnow, qemu-devel

* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > Hi,
> >   I'm seeing a breakage on q35 migration on head (and possibly older
> > but certainly head; it's also on a 2.5.0 world I've got with a bunch
> > of patches but I've not tried a clean 2.5.0 yet).
> > 
> > It looks like some type of interrupt screwup; with a virtio-net device
> > I get a:
> >   BUG: soft lockup - CPU#0 stuck for 22s!
> >   ...  virtnet_config_changed_work 
> > 
> > but if I swap that out for an e1000 I get:
> >   Disabling IRQ #22
> > 
> >   and various timeouts on e1000 and cdrom (scsi).
> > The guest kind of limps along with an existing terminal scrolling dmesg -w output.
> > 
> > This is an f23 guest on a rhel7.2-ish host; with the guest sitting an idle
> > (MATE) Gui.
> 
> Also broken with 2.4.1 and 2.5.1 (with pc-q35-2.4 machine type);
> see a screen shot attached; note:
>    a) The large count on irq 22 (enp2s1) on cpu1
>    b) The large count on virtio2-config on cpu1
>    c) The count of 'Deferred Error APIC interrupts'.

OK, this seems to be the i82801b11-bridge; if I remove it from the config
it all works.

My minimum config that fails so far is:
/opt/qemu-head/bin/qemu-system-x86_64 -nographic -machine pc-q35-2.6,accel=kvm,usb=off,vmport=off -cpu SandyBridge -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
 -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \
 -drive id=image,file=/home/vms/f23-serial.qcow2,if=none,cache=none \
 -device virtio-blk-pci,scsi=off,bus=pci.1,addr=0x5,drive=image,id=virtio-disk0,bootindex=1 \
 $*

If I swap the i82801b11-bridge for a pci-bridge, then it works.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] q35 migration broken
  2016-04-01 15:54   ` Dr. David Alan Gilbert
@ 2016-04-01 16:59     ` Marcel Apfelbaum
  2016-04-01 17:01       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 5+ messages in thread
From: Marcel Apfelbaum @ 2016-04-01 16:59 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, jsnow, qemu-devel

On 04/01/2016 06:54 PM, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
>> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
>>> Hi,
>>>    I'm seeing a breakage on q35 migration on head (and possibly older
>>> but certainly head; it's also on a 2.5.0 world I've got with a bunch
>>> of patches but I've not tried a clean 2.5.0 yet).
>>>
>>> It looks like some type of interrupt screwup; with a virtio-net device
>>> I get a:
>>>    BUG: soft lockup - CPU#0 stuck for 22s!
>>>    ...  virtnet_config_changed_work
>>>
>>> but if I swap that out for an e1000 I get:
>>>    Disabling IRQ #22
>>>
>>>    and various timeouts on e1000 and cdrom (scsi).
>>> The guest kind of limps along with an existing terminal scrolling dmesg -w output.
>>>
>>> This is an f23 guest on a rhel7.2-ish host; with the guest sitting an idle
>>> (MATE) Gui.
>>
>> Also broken with 2.4.1 and 2.5.1 (with pc-q35-2.4 machine type);
>> see a screen shot attached; note:
>>     a) The large count on irq 22 (enp2s1) on cpu1
>>     b) The large count on virtio2-config on cpu1
>>     c) The count of 'Deferred Error APIC interrupts'.
>
> OK, this seems to be the i82801b11-bridge; if I remove it from the config
> it all works.
>
> My minimum config that fails so far is:
> /opt/qemu-head/bin/qemu-system-x86_64 -nographic -machine pc-q35-2.6,accel=kvm,usb=off,vmport=off -cpu SandyBridge -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
>   -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \
>   -drive id=image,file=/home/vms/f23-serial.qcow2,if=none,cache=none \
>   -device virtio-blk-pci,scsi=off,bus=pci.1,addr=0x5,drive=image,id=virtio-disk0,bootindex=1 \
>   $*
>
> if I flip the i82801b11-bridge for a pci-bridge  then it works.

Hi Dave,
That's good news. I see we don't have a vmstate for this bridge, but we do have one for the regular pci-bridge.
Maybe this is the problem?

Thanks,
Marcel

>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>


* Re: [Qemu-devel] q35 migration broken
  2016-04-01 16:59     ` Marcel Apfelbaum
@ 2016-04-01 17:01       ` Dr. David Alan Gilbert
  2016-04-01 17:05         ` Marcel Apfelbaum
  0 siblings, 1 reply; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-01 17:01 UTC (permalink / raw)
  To: Marcel Apfelbaum; +Cc: jsnow, qemu-devel

* Marcel Apfelbaum (marcel@redhat.com) wrote:
> On 04/01/2016 06:54 PM, Dr. David Alan Gilbert wrote:
> >* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> >>* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> >>>Hi,
> >>>   I'm seeing a breakage on q35 migration on head (and possibly older
> >>>but certainly head; it's also on a 2.5.0 world I've got with a bunch
> >>>of patches but I've not tried a clean 2.5.0 yet).
> >>>
> >>>It looks like some type of interrupt screwup; with a virtio-net device
> >>>I get a:
> >>>   BUG: soft lockup - CPU#0 stuck for 22s!
> >>>   ...  virtnet_config_changed_work
> >>>
> >>>but if I swap that out for an e1000 I get:
> >>>   Disabling IRQ #22
> >>>
> >>>   and various timeouts on e1000 and cdrom (scsi).
> >>>The guest kind of limps along with an existing terminal scrolling dmesg -w output.
> >>>
> >>>This is an f23 guest on a rhel7.2-ish host; with the guest sitting an idle
> >>>(MATE) Gui.
> >>
> >>Also broken with 2.4.1 and 2.5.1 (with pc-q35-2.4 machine type);
> >>see a screen shot attached; note:
> >>    a) The large count on irq 22 (enp2s1) on cpu1
> >>    b) The large count on virtio2-config on cpu1
> >>    c) The count of 'Deferred Error APIC interrupts'.
> >
> >OK, this seems to be the i82801b11-bridge; if I remove it from the config
> >it all works.
> >
> >My minimum config that fails so far is:
> >/opt/qemu-head/bin/qemu-system-x86_64 -nographic -machine pc-q35-2.6,accel=kvm,usb=off,vmport=off -cpu SandyBridge -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
> >  -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \
> >  -drive id=image,file=/home/vms/f23-serial.qcow2,if=none,cache=none \
> >  -device virtio-blk-pci,scsi=off,bus=pci.1,addr=0x5,drive=image,id=virtio-disk0,bootindex=1 \
> >  $*
> >
> >if I flip the i82801b11-bridge for a pci-bridge  then it works.
> 
> Hi Dave,
> That's good news. I see we don't have a vmstate for this bridge, but we do have one for the regular pci-bridge.
> Maybe this is the problem?

Yeh that's one of my suspicions; but how do device classes work?
If an i82801b11-bridge is a subclass of pci-bridge, should it just
pick up the vmstate of its parent class magically?

Dave
P.S. I've filed this as RH bz 1323273
> 
> Thanks,
> Marcel
> 
> >
> >Dave
> >
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] q35 migration broken
  2016-04-01 17:01       ` Dr. David Alan Gilbert
@ 2016-04-01 17:05         ` Marcel Apfelbaum
  0 siblings, 0 replies; 5+ messages in thread
From: Marcel Apfelbaum @ 2016-04-01 17:05 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: jsnow, qemu-devel

On 04/01/2016 08:01 PM, Dr. David Alan Gilbert wrote:
> * Marcel Apfelbaum (marcel@redhat.com) wrote:
>> On 04/01/2016 06:54 PM, Dr. David Alan Gilbert wrote:
>>> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
>>>> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
>>>>> Hi,
>>>>>    I'm seeing a breakage on q35 migration on head (and possibly older
>>>>> but certainly head; it's also on a 2.5.0 world I've got with a bunch
>>>>> of patches but I've not tried a clean 2.5.0 yet).
>>>>>
>>>>> It looks like some type of interrupt screwup; with a virtio-net device
>>>>> I get a:
>>>>>    BUG: soft lockup - CPU#0 stuck for 22s!
>>>>>    ...  virtnet_config_changed_work
>>>>>
>>>>> but if I swap that out for an e1000 I get:
>>>>>    Disabling IRQ #22
>>>>>
>>>>>    and various timeouts on e1000 and cdrom (scsi).
>>>>> The guest kind of limps along with an existing terminal scrolling dmesg -w output.
>>>>>
>>>>> This is an f23 guest on a rhel7.2-ish host; with the guest sitting an idle
>>>>> (MATE) Gui.
>>>>
>>>> Also broken with 2.4.1 and 2.5.1 (with pc-q35-2.4 machine type);
>>>> see a screen shot attached; note:
>>>>     a) The large count on irq 22 (enp2s1) on cpu1
>>>>     b) The large count on virtio2-config on cpu1
>>>>     c) The count of 'Deferred Error APIC interrupts'.
>>>
>>> OK, this seems to be the i82801b11-bridge; if I remove it from the config
>>> it all works.
>>>
>>> My minimum config that fails so far is:
>>> /opt/qemu-head/bin/qemu-system-x86_64 -nographic -machine pc-q35-2.6,accel=kvm,usb=off,vmport=off -cpu SandyBridge -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
>>>   -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \
>>>   -drive id=image,file=/home/vms/f23-serial.qcow2,if=none,cache=none \
>>>   -device virtio-blk-pci,scsi=off,bus=pci.1,addr=0x5,drive=image,id=virtio-disk0,bootindex=1 \
>>>   $*
>>>
>>> if I flip the i82801b11-bridge for a pci-bridge  then it works.
>>
>> Hi Dave,
>> That's good news. I see we don't have a vmstate for this bridge, but we do have one for the regular pci-bridge.
>> Maybe this is the problem?
>
> Yeh that's one of my suspicions; but how do device classes work?
> If a i82801b11-bridge is a subclass of pci-bridge should it just
> pick up the vmstate of the subclass magically?

I am not sure, I think you need something like:

/* Sketch: migrate the generic PCI bridge state held in the parent object. */
static const VMStateDescription i82801b11_bridge_dev_vmstate = {
    .name = "i82801b11_bridge",
    .fields = (VMStateField[]) {
        VMSTATE_PCI_DEVICE(parent_obj, PCIBridge),
        VMSTATE_END_OF_LIST()
    }
};
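
On the "does it pick it up magically" question, here is a minimal toy model, outside QEMU, of one way the vmstate could end up missing. It assumes (not confirmed in this thread) that a class starts as a copy of its parent's class data, and that i82801b11-bridge and pci-bridge are siblings under a common base bridge type rather than parent and child. Under those assumptions a vmstate set by pci-bridge's own class init never reaches i82801b11-bridge. All names here (`ToyClass`, `ToyVMState`, `toy_class_new`, the `toy_*` init functions) are hypothetical stand-ins, not QEMU APIs.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a migration state description (hypothetical type). */
typedef struct ToyVMState {
    const char *name;
} ToyVMState;

/* Toy stand-in for a device class; only the field relevant here. */
typedef struct ToyClass {
    const char *type_name;
    const ToyVMState *vmsd;   /* NULL models "no device state is migrated" */
} ToyClass;

typedef void (*ToyClassInit)(ToyClass *klass);

/*
 * Build a class: start from a copy of the parent's class data (if any),
 * then run the type's own class-init hook (if any).
 */
static ToyClass toy_class_new(const char *type_name,
                              const ToyClass *parent,
                              ToyClassInit init)
{
    ToyClass klass;

    memset(&klass, 0, sizeof(klass));
    if (parent) {
        klass = *parent;          /* inherit whatever the parent set */
    }
    klass.type_name = type_name;
    if (init) {
        init(&klass);             /* type-specific overrides */
    }
    return klass;
}

static const ToyVMState toy_pci_bridge_dev_vmstate = { .name = "pci_bridge" };
static const ToyVMState toy_i82801b11_vmstate = { .name = "i82801b11_bridge" };

/* The regular bridge sets its own vmstate in its class init. */
static void toy_pci_bridge_dev_class_init(ToyClass *klass)
{
    klass->vmsd = &toy_pci_bridge_dev_vmstate;
}

/* What a fixed i82801b11 class init would do in this toy model. */
static void toy_i82801b11_class_init_fixed(ToyClass *klass)
{
    klass->vmsd = &toy_i82801b11_vmstate;
}
```

In this model, building "i82801b11-bridge" from the base type with no init hook leaves `vmsd` as NULL, while passing `toy_i82801b11_class_init_fixed` sets it. If the real types are related the same way, the analogous change would be to point the device class's vmstate at a description like the one above from the bridge's class init.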



>
> Dave
> P.S. I've filed this as RH bz 1323273

good, we can track this now.

Thanks,
Marcel

>>
>> Thanks,
>> Marcel
>>
>>>
>>> Dave
>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
