On Wed, 15 Sep 2021, Oleksandr Andrushchenko wrote: > On 15.09.21 03:36, Stefano Stabellini wrote: > > On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote: > >> With the patch above I have the following log in Domain-0: > >> > >> generic-armv8-xt-dom0 login: root > >> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input) > >> (XEN) ==== PCI devices ==== > >> (XEN) ==== segment 0000 ==== > >> (XEN) 0000:03:00.0 - d0 - node -1 > >> (XEN) 0000:02:02.0 - d0 - node -1 > >> (XEN) 0000:02:01.0 - d0 - node -1 > >> (XEN) 0000:02:00.0 - d0 - node -1 > >> (XEN) 0000:01:00.0 - d0 - node -1 > >> (XEN) 0000:00:00.0 - d0 - node -1 > >> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input) > >> > >> root@generic-armv8-xt-dom0:~# modprobe e1000e > >> [   46.104729] e1000e: Intel(R) PRO/1000 Network Driver > >> [   46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation. > >> [   46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002) > >> (XEN) map [e0000, e001f] -> 0xe0000 for d0 > >> (XEN) map [e0020, e003f] -> 0xe0020 for d0 > >> [   46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode > >> [   46.189668] pci_msi_setup_msi_irqs > >> [   46.191016] nwl_compose_msi_msg msg at fe440000 > >> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000 > >> [   46.200455] Unhandled fault at 0xffff800010fa5000 > >> > >> [snip] > >> > >> [   46.233079] Call trace: > >> [   46.233559]  __pci_write_msi_msg+0x70/0x180 > >> [   46.234149]  pci_msi_domain_write_msg+0x28/0x30 > >> [   46.234869]  msi_domain_activate+0x5c/0x88 > >> > >> From the above you can see that BARs are mapped for Domain-0 now > >> > >> only when an assigned PCI device gets enabled in Domain-0. > >> > >> Another thing to note is that we crash on MSI-X access as BARs are mapped > >> > >> to the domain while enabling memory decoding in the COMMAND register, > >> > >> but MSI-X are handled differently, e.g. we have MSI-X holes in the mappings. > >> > >> This is because before this change the whole PCI aperture was mapped into > >> > >> Domain-0 and it is not. Thus, MSI-X holes are left unmapped now and there > >> > >> was no need to do so, e.g. they were always mapped into Domain-0 and > >> > >> emulated for guests. > >> > >> Please note that one cannot use xl pci-attach in this case to attach the PCI device > >> > >> in question to Domain-0 as (please see the log) that device is already attached. > >> > >> Neither it can be detached and re-attached. So, without mapping MSI-X holes for > >> > >> Domain-0 the device becomes unusable in Domain-0. At the same time the device > >> > >> can be successfully passed to DomU. > >> > >> > >> Julien, Stefano! Please let me know how can we proceed with this. > > What was the plan for MSI-X in Dom0? > It just worked because we mapped everything > > > > Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than > > a physical-ITS and physical-GIC, I imagine that it wasn't correct for > > Dom0 to write to the real MSI-X table directly? > > > > Shouldn't Dom0 get emulated MSI-X tables like any DomU? > > > > Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then > > I would suggest to map them at the same time as the BARs. But I am > > thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X > > tables. > > Yes, it seems more than reasonable to enable emulation for Domain-0 > > as well. Other than that, Stefano, do you think we are good to go with > > the changes I did in order to unmap everything for Domain-0? It might be better to resend the series with the patch in it, because it is difficult to review the patch like this. Nonetheless I tried, but I might have missed something. Previously the whole PCIe bridge aperture was mapped to Dom0, and it was done by map_range_to_domain, is that correct? Now this patch, to avoid mapping the entire aperture to Dom0, is skipping any operations for PCIe devices in map_range_to_domain by setting need_mapping = false. The idea is that instead, we'll only map things when needed and not the whole aperture. However, looking at the changes to pci_host_bridge_mappings (formerly known as pci_host_bridge_need_p2m_mapping), it is still going through the full list of address ranges of the PCIe bridge and map each range one by one using map_range_to_domain. Also, pci_host_bridge_mappings is still called unconditionally at boot for Dom0. So I am missing the part where the aperture is actually *not* mapped to Dom0. What's the difference between the loop in pci_host_bridge_mappings: for ( i = 0; i < dt_number_of_address(dev); i++ ) map_range_to_domain... and the previous code in map_range_to_domain? I think I am missing something but as mentioned it is difficult to review the patch like this out of order. Also, and this is minor, even if currently unused, it might be good to keep a length parameter to pci_host_bridge_need_p2m_mapping.