All of lore.kernel.org
 help / color / mirror / Atom feed
From: Cornelia Huck <cornelia.huck@de.ibm.com>
To: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>,
	mst@redhat.com, qemu-devel@nongnu.org, marcel@redhat.com,
	eblake@redhat.com, qemu-stable@nongnu.org, stefanha@redhat.com,
	dgilbert@redhat.com
Subject: Re: [Qemu-devel] [Qemu-stable] [PATCH v2] virtio-bus: Plug devices after features are negotiated
Date: Wed, 14 Dec 2016 08:44:51 +0100	[thread overview]
Message-ID: <20161214084451.2a5236d2.cornelia.huck@de.ibm.com> (raw)
In-Reply-To: <20161213212745.1976.91005@loki>

On Tue, 13 Dec 2016 15:27:45 -0600
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:

> Quoting Maxime Coquelin (2016-09-13 08:30:30)
> > Currently, devices are plugged before features are negotiated.
> > If the backend doesn't support VIRTIO_F_VERSION_1, the transport
> > needs to rewind some settings.
> > 
> > This is the case for CCW, for which a post_plugged callback had
> > been introduced, where max_rev field is just updated if
> > VIRTIO_F_VERSION_1 is not supported by the backend.
> > For PCI, implementing post_plugged would be much more
> > complicated, so it needs to know whether the backend supports
> > VIRTIO_F_VERSION_1 at plug time.
> > 
> > Currently, nothing is done for PCI. Modern capabilities get
> > exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
> > by the backend, which confuses the guest.
> > 
> > This patch replaces existing post_plugged solution with an
> > approach that fits with both transports.
> > Features negotiation is performed before ->device_plugged() call.
> > A pre_plugged callback is introduced so that the transports can
> > set their supported features.
> > 
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: qemu-stable@nongnu.org
> > Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
> > Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> > Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> 
> It looks like this patch breaks migration under certain circumstances.
> One such scenario is migrating 2.7 -> 2.8.0-rc3 as detailed below on a
> host that doesn't have support for virtio-1 in vhost (which was
> introduced via 41e3e42 in kernel 3.18. In my case, I'm using Ubuntu
> 14.04, kernel 3.13):
> 
>  source (2.7.0):
>  
>    sudo build/v2.7.0/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L
>    build/v2.7.0-bios -M pc-i440fx-2.7 -m 512M -drive
>    file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
>    cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
>    tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
>    unix:/tmp/vm-hmp.sock,server,nowait -qmp
>    unix:/tmp/vm-qmp.sock,server,nowait -vnc :200
>   
>  target (2.8.0-rc3):
>    
>    sudo build/v2.8.0-rc3/x86_64-softmmu/qemu-system-x86_64 -enable-kvm
>    -L build/v2.8.0-rc3-bios -M pc-i440fx-2.7 -m 512M -drive
>    file=/home/mdroth/vm/win7_home_sp1_virtio_testing.qcow2,if=virtio -vga
>    cirrus -smp 4 -device virtio-net-pci,netdev=net0 -netdev
>    tap,id=net0,vhost=on,script=/etc/qemu-ifup -monitor
>    unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp
>    unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :201
>    -incoming unix:/tmp/migrate.sock
>     
>  error on target after migration:
>      
>    qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x34
>    read: 98 device: 40 cmask: ff wmask: 0 w1cmask:0
>    qemu-system-x86_64: Failed to load PCIDevice:config
>    qemu-system-x86_64: Failed to load virtio-net:virtio
>    qemu-system-x86_64: error while loading state for instance 0x0 of
>    device '0000:00:03.0/virtio-net'
>    qemu-system-x86_64: load of migration failed: Invalid argument
> 

Ugh. Let me double-check what happens for ccw. I could have sworn I
tested this...

> 
> Based on discussion on IRC (excerpts below), I think the new handling needs
> to be controllable via a machine option that can be disabled by default
> for older machine types. This is being considered a release blocker for
> 2.8 since there are still pre-3.18 LTS kernels in the wild.
> 
> 
> root-cause:
> 
> 14:35 < stefanha> v3.13 will not work
> 14:35 < stefanha>         VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> 14:35 < stefanha>                          (1ULL << VHOST_F_LOG_ALL),
> 14:35 < stefanha> The kernel side only knows about these guys
> 14:35 < davidgiluk> our downstream 3.10.0-524 seems to work - but that's probably got all the vhost stuff backported
> 14:35 < stefanha> plus these guys:
> 14:35 < stefanha> F        VHOST_NET_FEATURES = VHOST_FEATURES |
> 14:35 < stefanha>                          (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
> 14:35 < stefanha>                          (1ULL << VIRTIO_NET_F_MRG_RXBUF),
> 14:36 < stefanha> mdroth: So QEMU is going to check a list of feature bits including VERSION_1
> 14:36 < stefanha> and it will see that vhost doesn't support them.
> 14:36 < stefanha> So we're kind of doing the right thing now.
> 14:36 < stefanha> Before userspace was overriding the fact that vhost cannot handle VERSION_1.
> 14:36 < stefanha> ...except we broke migration
> 14:36 < davidgiluk> stefanha: But shouldn't it refuse to startup ?
> 14:36 < stefanha> Everything is perfect*
> 14:36 < stefanha> * except we broke migration
> 14:36 < stefanha> :)
> 
> suggestions on how to fix this:
> 
> 14:40 < stefanha> My understanding is this bug is vhost-specific.  If you have an old kernel that doesn't support VERSION_1 vhost then migration will break.
> 14:41 < stefanha> The actual bug is in QEMU though, not vhost
> 14:42 < stefanha> vhost is reporting the truth.  It's QEMU that has changed behavior.
> 14:44 < stefanha> mdroth: I think a lame fix would be to make virtio_pci_device_plugged() dependent on a flag that says: use old broken behavior instead of reporting the truth to the guest.
> 14:44 < stefanha> The flag could be enabled for all old machine types
> 14:44 < stefanha> and new machine types will report the truth.
> 14:44 < stefanha> That way migration continues to work.

I'll check whether we would need something for ccw as well.

> 14:44 < stefanha> Not sure if anyone can think of a nicer solution.
> 14:45 < stefanha> But we're going to have to keep lying to the guest if we want to preserve migration compatibility
> 14:45 < stefanha> The key change in behavior with the patch you identified is:
> 14:46 < stefanha> if (!virtio_has_feature(vdev->host_features, VIRTIO_F_VERSION_1)) {
> 14:46 < stefanha> virtio_pci_disable_modern(proxy);
> 14:46 < stefanha> Previously it didn't care about vdev->host_features.  It simply allowed VERSION_1 when proxy's disable_modern boolean was false.
> 14:47 < mdroth> stefanha: ok, that explains why it seems to work with disable-modern=true
> 14:48 < stefanha> mdroth: Your Ubuntu kernel old but 14.04 LTS is definitely still around so I don't think we can ship QEMU 2.8 like this.
> 14:49 < stefanha> mdroth: Let's summarize it on the mailing list and see what Michael Tsirkin and Maxime Coquelin think.
> 14:49 < mdroth> stefanha: i suppose a potential workaround would be to tell users to set disable-modern= to match their vhost capabilities, but it's hard for them to apply that retroactively if they're looking to migrate

Another thought: Maybe this bug only surfaced right now because older
qemus defaulted virtio-pci to legacy?

(I think modern virtio-pci with old vhost resulted in a config that was
rejected at least by Linux guests. Because pci defaulted to legacy, we
only had the post-plugged workaround for ccw before.)

> 14:49 < stefanha> We can delay the release in the meantime.
> 14:50 < stefanha> mdroth: I don't think a workaround is going to be smooth in this case
> 14:50 < stefanha> People will only notice once migration fails,
> 14:50 < stefanha> and that's when you lose your VM state!

  reply	other threads:[~2016-12-14  7:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-13 13:30 [Qemu-devel] [PATCH v2] virtio-bus: Plug devices after features are negotiated Maxime Coquelin
2016-12-13 21:27 ` [Qemu-devel] [Qemu-stable] " Michael Roth
2016-12-14  7:44   ` Cornelia Huck [this message]
2016-12-14  8:28     ` Maxime Coquelin
2016-12-14  8:41       ` Stefan Hajnoczi
2016-12-14  8:59         ` Cornelia Huck
2016-12-14  9:44           ` Maxime Coquelin
2016-12-14  9:50             ` Dr. David Alan Gilbert
2016-12-14 10:02               ` Maxime Coquelin
2016-12-14 10:08             ` Cornelia Huck
2016-12-14 11:48               ` Marcel Apfelbaum
2016-12-14  8:20   ` Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161214084451.2a5236d2.cornelia.huck@de.ibm.com \
    --to=cornelia.huck@de.ibm.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=marcel@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-stable@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.