From: Laszlo Ersek <lersek@redhat.com>
To: Laine Stump <laine@redhat.com>, qemu-devel@nongnu.org
Cc: "Daniel P. Berrange" <berrange@redhat.com>,
	Marcel Apfelbaum <marcel@redhat.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Drew Jones <drjones@redhat.com>,
	mst@redhat.com, Andrea Bolognani <abologna@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [Qemu-devel] [PATCH RFC] docs: add PCIe devices placement guidelines
Date: Tue, 4 Oct 2016 20:56:26 +0200
Message-ID: <926a9d57-9af3-034c-5bed-c1084b541f19@redhat.com>
In-Reply-To: <547407ca-e8f8-3327-d1c8-8a5c6317f28f@redhat.com>

On 10/04/16 20:08, Laine Stump wrote:
> On 10/04/2016 12:43 PM, Laszlo Ersek wrote:
>> On 10/04/16 18:10, Laine Stump wrote:
>>> On 10/04/2016 11:40 AM, Laszlo Ersek wrote:
>>>> On 10/04/16 16:59, Daniel P. Berrange wrote:
>>>>> On Mon, Sep 05, 2016 at 06:24:48PM +0200, Laszlo Ersek wrote:
>>>> All valid *high-level* topology goals should be permitted / covered one
>>>> way or another by this document, but in as few ways as possible --
>>>> hopefully only one way. For example, if you read the rest of the
>>>> thread,
>>>> flat hierarchies are preferred to deeply nested hierarchies, because
>>>> flat ones save on bus numbers
>>>
>>> Do they?
>>
>> Yes. Nesting implies bridges, and bridges take up bus numbers. For
>> example, in a PCI Express switch, the upstream port of the switch
>> consumes a bus number, with no practical usefulness.
> 
> It's all just idle number games, but what I was thinking of was the
> difference between plugging a bunch of root-port+upstream+downstreamxN
> combos directly into pcie-root (flat), vs. plugging the first into
> pcie-root, and then subsequent ones into e.g. the last downstream port
> of the previous set. Take the simplest case of needing 63 hotpluggable
> slots. In the "flat" case, you have:
> 
>    2 x pcie-root-port
>    2 x pcie-switch-upstream-port
>    63 x pcie-switch-downstream-port
> 
> In the "nested" or "chained" case you have:
> 
>    1 x pcie-root-port
>    1 x pcie-switch-upstream-port
>    32 x pcie-switch-downstream-port
>    1 x pcie-switch-upstream-port
>    32 x pcie-switch-downstream-port
> 
> so you use the same number of PCI controllers.
> 
> Of course if you're talking about the difference between using
> upstream+downstream vs. just having a bunch of pcie-root-ports directly
> on pcie-root then you're correct, but only marginally - for 63
> hotpluggable ports, you would need 63 x pcie-root-port, so a savings of
> 4 controllers - about 6.5%.

We aim at 200+ ports.
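
To make the counting in the quoted example concrete, here is a minimal
sketch in Python. It is not taken from QEMU or libvirt: the function
names are made up for illustration, and the limit of 32 downstream ports
per switch is simply the figure used in the example above. Each
controller counted here is a bridge, so each one also takes up one bus
number.

```python
import math

# Assumption: 32 downstream ports per switch, as in the quoted example.
PORTS_PER_SWITCH = 32


def root_ports_only(slots):
    """One pcie-root-port per hotpluggable slot, no switches."""
    return {"pcie-root-port": slots}


def flat_switches(slots, per_switch=PORTS_PER_SWITCH):
    """Every switch sits behind its own root port on pcie-root."""
    switches = math.ceil(slots / per_switch)
    return {
        "pcie-root-port": switches,
        "pcie-switch-upstream-port": switches,
        "pcie-switch-downstream-port": slots,
    }


def chained_switches(slots, per_switch=PORTS_PER_SWITCH):
    """Each further switch hangs off the last downstream port of the
    previous one, so every switch but the last loses one usable slot."""
    switches = max(1, math.ceil((slots - 1) / (per_switch - 1)))
    return {
        "pcie-root-port": 1,
        "pcie-switch-upstream-port": switches,
        "pcie-switch-downstream-port": slots + (switches - 1),
    }


def total(layout):
    """Total number of controllers (bridges) in a layout."""
    return sum(layout.values())


if __name__ == "__main__":
    for slots in (63, 200):
        print(slots,
              total(root_ports_only(slots)),
              total(flat_switches(slots)),
              total(chained_switches(slots)))
```

Under these assumptions, 63 slots cost 67 controllers in both switch
layouts versus 63 with root ports only, and 200 slots cost 214 versus
200.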

Also, nesting causes recursion in any guest code that traverses the
hierarchy. I think it has some performance impact, plus, for me at
least, interpreting PCI enumeration logs with deep recursion is way
harder than the flat stuff. The bus number space is flat, and for me
it's easier to "map back" to the topology if the topology is also mostly
flat.
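
As an illustration of the recursion, here is a rough sketch of
depth-first bus numbering in Python. It is a simplified model, not guest
or firmware code: the Bridge class and enumerate_bus() are hypothetical
names, and the only thing modeled is that each bridge gets the next free
number as its secondary bus and covers everything below it up to its
subordinate bus.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Bridge:
    # Hypothetical model of a bridge (root port, upstream or downstream port).
    name: str
    children: List["Bridge"] = field(default_factory=list)
    secondary: int = -1    # bus number directly behind this bridge
    subordinate: int = -1  # highest bus number reachable through it


def enumerate_bus(bridges: List[Bridge], next_bus: int, depth: int = 0,
                  log: Optional[List[Tuple[int, str, int]]] = None) -> int:
    """Number the bridges depth-first and return the next free bus number.

    If log is given, (depth, name, secondary) is appended for each bridge,
    in the order the bridges are visited.
    """
    for bridge in bridges:
        bridge.secondary = next_bus
        next_bus += 1
        if log is not None:
            log.append((depth, bridge.name, bridge.secondary))
        # Recurse into whatever is plugged in behind this bridge.
        next_bus = enumerate_bus(bridge.children, next_bus, depth + 1, log)
        bridge.subordinate = next_bus - 1
    return next_bus


if __name__ == "__main__":
    # Flat: three root ports on bus 0, recursion never goes below depth 0.
    flat = [Bridge("rp1"), Bridge("rp2"), Bridge("rp3")]
    # Nested: root port -> switch upstream port -> two downstream ports.
    nested = [Bridge("rp", [Bridge("up", [Bridge("dp1"), Bridge("dp2")])])]
    for topology in (flat, nested):
        log = []
        enumerate_bus(topology, 1, log=log)
        for depth, name, bus in log:
            print("  " * depth + f"{name}: bus {bus}")
```

In the flat case the bus numbers line up one-to-one with the ports on
the root bus; in the nested case the same flat number range has to be
read back through several levels of indentation.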

> (Of course this is all moot since you run
> out of ioport space after, what, 7 controllers needing it anyway? :-P)

No, it's not moot. The idea is that PCI Express devices must not require
IO space for correct operation (I believe this is actually mandated by
the PCI Express spec), so in the PCI Express hierarchy we wouldn't
reserve IO space at all. We discussed this earlier up-thread, please see:

http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg00672.html

    * Finally, this is the spot where we should design and explain our
      resource reservation for hotplug: [...]

>> IIRC we collectively devised a flat pattern elsewhere in the thread
>> where you could exhaust the 0..255 bus number space such that almost
>> every bridge (= taking up a bus number) would also be capable of
>> accepting a hot-plugged or cold-plugged PCI Express device. That is,
>> practically no wasted bus numbers.
>>
>> Hm.... search this message for "population algorithm":
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg394730.html
>>
>> and then Gerd's big improvement / simplification on it, with
>> multifunction:
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg395437.html
>>
>> In Gerd's scheme, you'd need only one or two (I'm too lazy to count
>> exactly :)) PCI Express switches, to exhaust all bus numbers. Minimal
>> waste due to upstream ports.
> 
> Yep. And in response to his message, that's what I'm implementing as the
> default strategy in libvirt :-)

Sounds great, thanks!
Laszlo
