All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Daniel P . Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	qemu-devel@nongnu.org, "Markus Armbruster" <armbru@redhat.com>,
	"Eric Auger" <eric.auger@redhat.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH 4/4] vl: Prioritize realizations of devices
Date: Thu, 26 Aug 2021 06:57:26 +0200	[thread overview]
Message-ID: <87lf4o3irt.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <YSa7H3wGUHgccCrU@t490s> (Peter Xu's message of "Wed, 25 Aug 2021 17:50:23 -0400")

Peter Xu <peterx@redhat.com> writes:

> On Wed, Aug 25, 2021 at 02:28:55PM +0200, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Peter Xu <peterx@redhat.com> writes:
>> >
>> >> On Mon, Aug 23, 2021 at 05:56:23PM -0400, Eduardo Habkost wrote:
>> >>> I don't have any other example, but I assume address assignment
>> >>> based on ordering is a common pattern in device code.
>> >>> 
>> >>> I would take a very close and careful look at the devices with
>> >>> non-default vmsd priority.  If you can prove that the 13 device
>> >>> types with non-default priority are all order-insensitive, a
>> >>> custom sort function as you describe might be safe.
>> >>
>> >> Besides virtio-mem-pci, there'll also similar devfn issue with all
>> >> MIG_PRI_PCI_BUS, as they'll be allocated just like other pci devices.  Say,
>> >> below two cmdlines will generate different pci topology too:
>> >>
>> >>   $ qemu-system-x86_64 -device pcie-root-port,chassis=0 \
>> >>                        -device pcie-root-port,chassis=1 \
>> >>                        -device virtio-net-pci
>> >>
>> >> And:
>> >>
>> >>   $ qemu-system-x86_64 -device pcie-root-port,chassis=0 \
>> >>                        -device virtio-net-pci
>> >>                        -device pcie-root-port,chassis=1 \
>> >>
>> >> This cannot be solved by keeping priority==0 ordering.
>> >>
>> >> After a second thought, I think I was initially wrong on seeing migration
>> >> priority and device realization the same problem.
>> >>
>> >> For example, for live migration we have a requirement on PCI_BUS being migrated
>> >> earlier than MIG_PRI_IOMMU because there's bus number information required
>> >> because IOMMU relies on the bus number to find address spaces.  However that's
>> >> definitely not a requirement for device realizations, say, realizing vIOMMU
>> >> after pci buses are fine (bus assigned during bios).
>> >>
>> >> I've probably messed up with the ideas (though they really look alike!).  Sorry
>> >> about that.
>> >>
>> >> Since the only ordering constraint so far is IOMMU vs all the rest of devices,
>> >> I'll introduce a new priority mechanism and only make sure vIOMMUs are realized
>> >> earlier.  That'll also avoid other implications on pci devfn allocations.
>> >>
>> >> Will rework a new version tomorrow.  Thanks a lot for all the comments,
>> >
>> > Is it really a good idea to magically reorder device realization just to
>> > make a non-working command line work?  Why can't we just fail the
>> > non-working command line in a way that tells users how to get a working
>> > one?  We have way too much ordering magic already...
>> >
>> > If we decide we want more magic, then I'd argue for *dependencies*
>> > instead of priorities.  Dependencies are specific and local: $this needs
>> > to go after $that because $reasons.  Priorities are unspecific and
>> > global.
>> 
>> Having thought about this a bit more...
>> 
>> Constraints on realize order are nothing new.  For instance, when a
>> device plugs into a bus, it needs to be realized after the device
>> providing the bus.
>> 
>> We ensure this by having the device refer to the bus, e.g. bus=pci.0.
>> The reference may be implicit, but it's there.  It must resolve for
>> device creation to succeed, and if it resolves, the device providing the
>> bus will be realized in time.
>> 
>> I believe what's getting us into trouble with IOMMU is not having such a
>> reference.  Or in other words, keeping the dependence between the IOMMU
>> and the devices relying on it *implicit*, and thus hidden from the
>> existing realize-ordering machinery.
>> 
>> Instead of inventing another such machinery, let's try to use the one we
>> already have.
>
> Hmm... I just found that we don't have such machinery, do we?
>
> This does not really work:
>
> $ ./qemu-system-x86_64 -M q35 -device virtio-net-pci,bus=pcie.1 \
>                        -device pcie-root-port,id=pcie.1,bus=pcie.0
> qemu-system-x86_64: -device virtio-net-pci,bus=pcie.1: Bus 'pcie.1' not found
>
> While this will:
>
> $ ./qemu-system-x86_64 -M q35 -device pcie-root-port,id=pcie.1,bus=pcie.0 \
>                        -device virtio-net-pci,bus=pcie.1

This is exactly what I described.  bus=pcie.0 is the explicit reference.
It must resolve for device creation to succeed, and if it resolves, the
device providing the bus will be realized in time.  It resolves in the
second example, but not the first.

Look ma, no magic!  Instead, stupid & predictable.



  parent reply	other threads:[~2021-08-26  4:58 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-18 19:42 [PATCH 0/4] vl: Prioritize device realizations Peter Xu
2021-08-18 19:42 ` [PATCH 1/4] qdev-monitor: Trace qdev creation Peter Xu
2021-08-18 19:43 ` [PATCH 2/4] qemu-config: Allow in-place sorting of QemuOptsList Peter Xu
2021-08-18 19:43 ` [PATCH 3/4] qdev: Export qdev_get_device_class() Peter Xu
2021-08-18 19:43 ` [PATCH 4/4] vl: Prioritize realizations of devices Peter Xu
2021-08-23 18:49   ` Eduardo Habkost
2021-08-23 19:18     ` Peter Xu
2021-08-23 21:07       ` Eduardo Habkost
2021-08-23 21:31         ` Peter Xu
2021-08-23 21:54           ` Michael S. Tsirkin
2021-08-23 22:51             ` Peter Xu
2021-08-23 21:56           ` Eduardo Habkost
2021-08-23 23:05             ` Peter Xu
2021-08-25  9:39               ` Markus Armbruster
2021-08-25 12:28                 ` Markus Armbruster
2021-08-25 21:50                   ` Peter Xu
2021-08-26  3:50                     ` Peter Xu
2021-08-26  8:01                       ` Markus Armbruster
2021-08-26 11:36                         ` Igor Mammedov
2021-08-26 13:43                           ` Peter Xu
2021-08-30 19:02                             ` Peter Xu
2021-08-31 11:35                               ` Markus Armbruster
2021-09-02  8:26                               ` Igor Mammedov
2021-09-02 13:45                                 ` Peter Xu
2021-09-02 13:53                                   ` Daniel P. Berrangé
2021-09-02 14:21                                     ` Peter Xu
2021-09-02 14:57                                       ` Markus Armbruster
2021-09-03 15:48                                         ` Peter Xu
2021-09-02 15:06                                       ` Daniel P. Berrangé
2021-09-02 15:26                                   ` Markus Armbruster
2021-09-03 13:00                                   ` Igor Mammedov
2021-09-03 16:03                                     ` Peter Xu
2021-09-06  8:49                                       ` Igor Mammedov
2021-09-02  7:46                             ` Igor Mammedov
2021-08-26  4:57                     ` Markus Armbruster [this message]
2021-08-23 22:05       ` Michael S. Tsirkin
2021-08-23 22:36         ` Peter Xu
2021-08-24  2:52           ` Jason Wang
2021-08-24 15:50             ` Peter Xu
2021-08-25  4:23               ` Jason Wang
2021-09-06  9:22                 ` Eric Auger
2021-08-24 16:24         ` David Hildenbrand
2021-08-24 19:52           ` Peter Xu
2021-08-25  8:08             ` David Hildenbrand
2021-08-24  2:51       ` Jason Wang
2021-10-20 13:44 ` [PATCH 0/4] vl: Prioritize device realizations David Hildenbrand
2021-10-20 13:48   ` Daniel P. Berrangé
2021-10-20 13:58     ` David Hildenbrand
2021-10-21  4:20   ` Peter Xu
2021-10-21  7:17     ` David Hildenbrand
2021-10-21  8:00       ` Peter Xu
2021-10-21 16:54         ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lf4o3irt.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.