From: "Daniel P. Berrangé" <berrange@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Peter Xu <peterx@redhat.com>,
	qemu-devel@nongnu.org, Eric Auger <eric.auger@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH 0/4] vl: Prioritize device realizations
Date: Wed, 20 Oct 2021 14:48:09 +0100
Message-ID: <YXAeGdkCPh5h+kHg@redhat.com>
In-Reply-To: <2817620d-facb-eeee-b854-64193fa4da33@redhat.com>

On Wed, Oct 20, 2021 at 03:44:08PM +0200, David Hildenbrand wrote:
> On 18.08.21 21:42, Peter Xu wrote:
> > This is a long pending issue that we haven't fixed.  The issue is in QEMU we
> > have implicit device ordering requirement when realizing, otherwise some of the
> > devices may not work properly.
> > 
> > The initial requirement comes from when vfio-pci starts to work with vIOMMUs.
> > To make sure vfio-pci will get the correct DMA address space, the vIOMMU device
> > needs to be created before vfio-pci otherwise vfio-pci will stop working when
> > the guest enables the vIOMMU and the device at the same time.
> > 
> > AFAIU Libvirt should have code that guarantees that.  For QEMU cmdline users,
> > they need to pay attention or things will stop working at some point.
> > 
> > Recently there's a growing and similar requirement on vDPA.  It's not a hard
> > requirement so far but vDPA has patches that try to workaround this issue.
> > 
> > This patchset allows us to realize the devices in the order that e.g. platform
> > devices will be created first (bus device, IOMMU, etc.), then the rest of
> > normal devices.  It's done simply by ordering the QemuOptsList of "device"
> > entries before realization.  The priority so far comes from migration
> > priorities which could be a little bit odd, but that's really about the same
> > problem and we can clean that part up in the future.
> > 
> > Libvirt can still keep its ordering for sure so old QEMU will still work,
> > however that won't be needed for new qemus after this patchset, so with the new
> > binary we should be able to specify qemu cmdline as wish on '-device'.
> > 
> > Logically this should also work for vDPA and the workaround code can be done
> > with more straightforward approaches.
> > 
> > Please review, thanks.
> 
> Hi Peter, looks like I have another use case:
> 
> vhost devices can heavily restrict the number of available memslots:
> e.g., upstream KVM ~64k, vhost-user usually 32 (!). With virtio-mem
> intending to make use of multiple memslots [1] and auto-detecting how
> many to use based on currently available memslots when plugging and
> realizing the virtio-mem device, this implies that realizing vhost
> devices (especially vhost-user device) after virtio-mem devices can
> similarly result in issues: when trying realization of the vhost device
> with restricted memslots, QEMU will bail out.
> 
> So similarly, we'd want to realize any vhost-* before any virtio-mem device.

Ordering virtio-mem vs vhost-* devices doesn't feel like a good
solution to this problem. E.g., if you start a guest with several
vhost-* devices, and virtio-mem then auto-decides to use all/most
remaining memslots, we've surely broken the ability to hotplug
more vhost-* devices at runtime, because no memslots are left
for them.

I think virtio-mem configuration needs to be stable in its memslot
usage regardless of how many other types of devices are present,
and not auto-adjust how many it consumes.
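
To illustrate the concern, here is a deliberately simplified model. It
is not QEMU code: every function name and number below is hypothetical,
and it assumes each vhost device simply imposes an upper bound on the
total number of memslots in use.

```python
# Simplified, hypothetical model of memslot accounting; not QEMU code.

def auto_detect_memslots(limit, used):
    """Auto-detect policy: claim every memslot still free under `limit`."""
    return max(limit - used, 0)


def can_realize_vhost(device_limit, used):
    """Assumed rule: a vhost device can be realized only if the memslots
    already in use fit within the limit it supports."""
    return used <= device_limit


def simulate(virtio_mem_slots, base_used, hotplug_limit):
    """Return True if a vhost device with `hotplug_limit` can still be hotplugged."""
    used = base_used + virtio_mem_slots
    return can_realize_vhost(hotplug_limit, used)


# Assumed numbers: 8 memslots used by boot memory, a limit of 509 while
# virtio-mem is realized, and a later hotplugged device limited to 32.
BASE_USED, BOOT_LIMIT, HOTPLUG_LIMIT = 8, 509, 32

auto = auto_detect_memslots(BOOT_LIMIT, BASE_USED)  # depends on what else is present
fixed = 4                                           # stable, configured up front

print(simulate(auto, BASE_USED, HOTPLUG_LIMIT))   # False: hotplug would fail
print(simulate(fixed, BASE_USED, HOTPLUG_LIMIT))  # True: hotplug still fits
```

With the auto-detected count, the outcome of a later hotplug depends on
which devices happened to be present at realize time; with a fixed count
it does not.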

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



