From: Alexey G <x1917x@gmail.com>
To: Paul Durrant <Paul.Durrant@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	Jan Beulich <JBeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Anthony Perard <anthony.perard@citrix.com>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring
Date: Sat, 24 Mar 2018 08:32:44 +1000
Message-ID: <20180324083244.00003d8e@gmail.com>
In-Reply-To: <f6a0950911bb464cb48aafd67da2fd8f@AMSPEX02CL03.citrite.net>

On Fri, 23 Mar 2018 13:57:11 +0000
Paul Durrant <Paul.Durrant@citrix.com> wrote:
[...]
>> Few related thoughts:
>> 
>> 1. MMCONFIG address is chipset-specific. On Q35 it's a PCIEXBAR, on
>> other x86 systems it may be HECBASE or else. So we can assume it is
>> bound to the emulated machine
>
>Xen emulates the machine so it should be emulating PCIEXBAR. 

Actually, Xen currently emulates only a few devices. The others are
provided by QEMU; that's the problem.

>> 2. We rely on QEMU to emulate different machines for us.
>We should not be. It's a historical artefact that we rely on QEMU for
>any part of machine emulation.

HVM guests need to see something more or less close to real hardware to
run. Even if we later install PV drivers for network/storage/etc. usage,
we still need to support system firmware (SeaBIOS/OVMF) and be able to
install (ideally) any OS which expects to be installed only on some
real x86 hw. We also need to be ready to fall back to the emulated hw
if, e.g., the user boots the OS in safe mode.

It all depends on what you mean by not relying on QEMU for any part
of machine emulation.

There are a number of mandatory devices which must be provided for a
typical x86 system. Xen emulates some of them, but there are a number
which it doesn't. Apart from "classic" devices like RTC, PIT, KBC, etc.,
we need to provide at least storage and network interfaces.

The Windows installer won't be happy to boot from a PV storage device;
it expects to find something like AHCI (Windows 7+), ATA (for older
OSes) or ATAPI if it is an ISO CD.
Providing emulation for the AHCI+ATA+ATAPI trio alone is a non-trivial
task. QEMU itself provides only a partial implementation of these; many
features are unsupported. Another very useful thing to emulate is USB.
Depending on the controller version and device classes required, this
may be far more complex to emulate than AHCI+ATA+ATAPI combined.

So, if you suggest dropping QEMU completely, it means that all this
functionality must be replaced by our own implementation. Not that
hard, but still a lot of effort.


OTOH, if you mean stripping QEMU of general PCI bus control and
replacing its emulated NB/SB with Xen-owned ones -- well, in theory it
should be possible, with patches on the QEMU side.

In fact, the emulated chipset (NB+SB combo without supplemental devices)
itself is a small part of the required emulation. It's relatively easy
to provide our own analogs of, e.g., the 'mch' and 'ICH9-LPC' QEMU
PCIDevice's; the problem is to glue all the remaining parts together.

I assume the final goal in this case is to have only a set of necessary
QEMU PCIDevice's for which we will be providing I/O, MMIO and PCI conf
trapping facilities: only devices such as rtl8139, ich9-ahci and a few
others.

Basically, this means a new, chipset-less QEMU machine type.
Well, in theory it is possible with a bit of effort, I think. The main
question is where the NB/SB/PCI bus emulating part will reside in
this case. As this part must still have some privileges, it's basically
the same decision problem as with QEMU's dwelling place -- stubdomain,
Dom0 or something else.

>> 3. There are users which touch chipset-specific PCIEXBAR directly if
>> they see a Q35 system (OVMF so far)
>
>And we should squash such accesses.
>

Yes, we have that privilege (i.e. allocating all IO/MMIO bases) for
hvmloader. OVMF should not differ from SeaBIOS in this respect.

>The toolstack should be in sole
>control of the guest memory map. It should be the only thing building
>MCFG, so it should decide where the MMCONFIG regions go, not the
>firmware running in guest context.

HVM memory layout is another problem which needs a solution, BTW. I had
to implement one for my PT goals, but it's very radical, I'm afraid.

Right now there are wicked issues in how the memory layout is handled
between hvmloader and QEMU. They may see different memory maps, in some
cases (depending on MMIO hole size and content) even with overlaps --
such as an attempt to place an MMIO BAR over memory which QEMU uses as
vram backing storage, causing a variety of issues like emulated I/O
errors (with a storage device) during a guest boot attempt.
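
To illustrate the kind of clash meant here, a minimal sketch: two
independently built views of the guest address space are checked for
intersecting ranges. The Range type, the first_overlap() helper and all
addresses are made up for illustration; they are not taken from
hvmloader or QEMU.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Range(NamedTuple):
    name: str
    start: int   # first byte of the range
    size: int    # length in bytes

    @property
    def end(self) -> int:
        """First byte past the range (exclusive end)."""
        return self.start + self.size


def first_overlap(view_a: Iterable[Range],
                  view_b: Iterable[Range]) -> Optional[Tuple[Range, Range]]:
    """Return the first pair of intersecting ranges, or None."""
    view_b = list(view_b)
    for a in view_a:
        for b in view_b:
            # Two half-open intervals intersect iff each starts
            # before the other one ends.
            if a.start < b.end and b.start < a.end:
                return a, b
    return None


# Hypothetical example: the firmware placed a BAR where the device
# model keeps its (guest-invisible) vram backing storage.
firmware_view = [Range("passthrough BAR", 0xF0000000, 0x01000000)]
device_model_view = [Range("vram backing", 0xF0800000, 0x00800000)]

clash = first_overlap(firmware_view, device_model_view)
if clash:
    print("overlap: %s vs %s" % (clash[0].name, clash[1].name))
```

The point is only that neither side can run such a check today, since
each one knows just its own list of ranges.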

Regarding control of the guest memory map in the toolstack only... The
problem is, only the firmware can see the final memory map at the moment.
And only the device model knows about invisible "service" ranges for
emulated devices, like the LFB content (aka "VRAM") when it is not
mapped to a guest.

In order to calculate the final memory/MMIO hole split, we need to know:

1) all PCI devices on a PCI bus. At the moment Xen contributes only
devices like PT ones to the final PCI bus (via QMP device_add). The
others are QEMU ones. Even the Xen platform PCI device relies on QEMU
emulation. Non-QEMU device emulators are another source of virtual PCI
devices, I guess.

2) all chipset-specific emulated MMIO ranges. MMCONFIG is one of them
and the largest (up to 256MB for a segment). There are a few other
smaller ranges, e.g. Root Complex registers. All these ranges depend on
the emulated chipset.

3) all reserved memory ranges (this is the one the toolstack already
knows)

4) all "service" guest memory ranges like backing storage for VRAM in
QEMU. Emulated Option ROMs should belong here too, but IIRC xen-hvm.c
either intentionally or by mistake handles them as emulated ranges
currently.
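
For item 2, the 256MB figure follows directly from the standard PCIe
ECAM layout (4KB of config space per function, 8 functions per device,
32 devices per bus, up to 256 buses per segment). A small sketch of the
arithmetic; the function names and the base address are assumptions
picked for illustration only:

```python
# ECAM (MMCONFIG) layout: every function gets 4 KiB of config space.
CFG_SPACE_SIZE = 4096
FUNCS_PER_DEV = 8
DEVS_PER_BUS = 32


def mmconfig_size(num_buses: int = 256) -> int:
    """Bytes of MMIO needed to cover num_buses buses of one segment."""
    return num_buses * DEVS_PER_BUS * FUNCS_PER_DEV * CFG_SPACE_SIZE


def ecam_address(base: int, bus: int, dev: int, fn: int, reg: int = 0) -> int:
    """Address of a config register inside the MMCONFIG window."""
    assert 0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8
    assert 0 <= reg < CFG_SPACE_SIZE
    return base + ((bus << 20) | (dev << 15) | (fn << 12) | reg)


# Hypothetical MMCONFIG base, chosen only for this example.
EXAMPLE_BASE = 0xE0000000

print(hex(mmconfig_size()))                        # 0x10000000, i.e. 256 MiB
print(hex(ecam_address(EXAMPLE_BASE, 0, 31, 0)))   # 0xe00f8000
```

A window covering fewer buses is proportionally smaller, e.g.
mmconfig_size(64) gives 64MB, but whatever size is chosen has to be
accounted for when the MMIO hole is sized.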

If we miss any of these (like what the chipset-specific ranges are and
their size and alignment requirements) -- we're in trouble. But if we
know *all* of these, we can pre-calculate the MMIO hole size. This is a
bit fragile to do from the toolstack, though: the sizing algo in the
toolstack and the MMIO BAR allocation code in the firmware (hvmloader)
must be kept in sync, because it is possible to stuff BARs into the
MMIO hole in different ways, especially once PCI-PCI bridges appear on
the scene. Both need to do it in a consistent way (resulting in a
similar set of gaps between allocated BARs), otherwise the expected
MMIO hole sizes won't match, which means we may need to relocate MMIO
BARs to the high MMIO hole, and this in turn may lead to those overlaps
with QEMU memories.
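
A rough sketch of why the two sides must agree on the packing policy:
with naturally aligned power-of-two BARs, the same set of BARs occupies
a different amount of the hole depending on the placement order. The
BAR sizes, the start address and the helper names below are made up;
this is not the actual hvmloader or toolstack algorithm.

```python
from typing import Sequence


def align_up(addr: int, align: int) -> int:
    """Round addr up to a multiple of align (align is a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def packed_end(start: int, bar_sizes: Sequence[int]) -> int:
    """Place BARs in the given order, each aligned to its own size,
    and return the first address past the last BAR."""
    addr = start
    for size in bar_sizes:
        addr = align_up(addr, size) + size
    return addr


# Made-up BAR sizes: 4 KiB, 16 MiB, 16 KiB, 1 MiB.
bars = [0x1000, 0x1000000, 0x4000, 0x100000]
start = 0xE0000000   # hypothetical start of the low MMIO hole

as_listed = packed_end(start, bars) - start
largest_first = packed_end(start, sorted(bars, reverse=True)) - start

print(hex(as_listed))       # 0x2200000: alignment gaps roughly double the usage
print(hex(largest_first))   # 0x1105000: no gaps, equals the sum of BAR sizes
```

If the toolstack sized the hole assuming one order while the firmware
allocated in the other, the pre-calculated hole would be either too
small or needlessly large.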

>> Seems like we're pretty limited in freedom of choice in this
>> conditions, I'm afraid.  
>
>I don't think so. We're only limited if we use QEMU's Q35 emulation
>and what I'm saying is that we should not be doing that (nor should
>we be using it to emulate any part of the PIIX today).
>
>  Paul


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
