On Wed, 15 Sep 2021, Stefano Stabellini wrote: > On Wed, 15 Sep 2021, Oleksandr Andrushchenko wrote: > > On 15.09.21 03:36, Stefano Stabellini wrote: > > > On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote: > > >> With the patch above I have the following log in Domain-0: > > >> > > >> generic-armv8-xt-dom0 login: root > > >> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input) > > >> (XEN) ==== PCI devices ==== > > >> (XEN) ==== segment 0000 ==== > > >> (XEN) 0000:03:00.0 - d0 - node -1 > > >> (XEN) 0000:02:02.0 - d0 - node -1 > > >> (XEN) 0000:02:01.0 - d0 - node -1 > > >> (XEN) 0000:02:00.0 - d0 - node -1 > > >> (XEN) 0000:01:00.0 - d0 - node -1 > > >> (XEN) 0000:00:00.0 - d0 - node -1 > > >> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input) > > >> > > >> root@generic-armv8-xt-dom0:~# modprobe e1000e > > >> [   46.104729] e1000e: Intel(R) PRO/1000 Network Driver > > >> [   46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation. > > >> [   46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002) > > >> (XEN) map [e0000, e001f] -> 0xe0000 for d0 > > >> (XEN) map [e0020, e003f] -> 0xe0020 for d0 > > >> [   46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode > > >> [   46.189668] pci_msi_setup_msi_irqs > > >> [   46.191016] nwl_compose_msi_msg msg at fe440000 > > >> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000 > > >> [   46.200455] Unhandled fault at 0xffff800010fa5000 > > >> > > >> [snip] > > >> > > >> [   46.233079] Call trace: > > >> [   46.233559]  __pci_write_msi_msg+0x70/0x180 > > >> [   46.234149]  pci_msi_domain_write_msg+0x28/0x30 > > >> [   46.234869]  msi_domain_activate+0x5c/0x88 > > >> > > >> From the above you can see that BARs are mapped for Domain-0 now > > >> > > >> only when an assigned PCI device gets enabled in Domain-0. > > >> > > >> Another thing to note is that we crash on MSI-X access as BARs are mapped > > >> > > >> to the domain while enabling memory decoding in the COMMAND register, > > >> > > >> but MSI-X are handled differently, e.g. we have MSI-X holes in the mappings. > > >> > > >> This is because before this change the whole PCI aperture was mapped into > > >> > > >> Domain-0 and it is not. Thus, MSI-X holes are left unmapped now and there > > >> > > >> was no need to do so, e.g. they were always mapped into Domain-0 and > > >> > > >> emulated for guests. > > >> > > >> Please note that one cannot use xl pci-attach in this case to attach the PCI device > > >> > > >> in question to Domain-0 as (please see the log) that device is already attached. > > >> > > >> Neither it can be detached and re-attached. So, without mapping MSI-X holes for > > >> > > >> Domain-0 the device becomes unusable in Domain-0. At the same time the device > > >> > > >> can be successfully passed to DomU. > > >> > > >> > > >> Julien, Stefano! Please let me know how can we proceed with this. > > > What was the plan for MSI-X in Dom0? > > It just worked because we mapped everything > > > > > > Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than > > > a physical-ITS and physical-GIC, I imagine that it wasn't correct for > > > Dom0 to write to the real MSI-X table directly? > > > > > > Shouldn't Dom0 get emulated MSI-X tables like any DomU? > > > > > > Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then > > > I would suggest to map them at the same time as the BARs. But I am > > > thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X > > > tables. > > > > Yes, it seems more than reasonable to enable emulation for Domain-0 > > > > as well. Other than that, Stefano, do you think we are good to go with > > > > the changes I did in order to unmap everything for Domain-0? > > > It might be better to resend the series with the patch in it, because it > is difficult to review the patch like this. Nonetheless I tried, but I > might have missed something. > > Previously the whole PCIe bridge aperture was mapped to Dom0, and > it was done by map_range_to_domain, is that correct? > > Now this patch, to avoid mapping the entire aperture to Dom0, is > skipping any operations for PCIe devices in map_range_to_domain by > setting need_mapping = false. > > The idea is that instead, we'll only map things when needed and not the > whole aperture. However, looking at the changes to > pci_host_bridge_mappings (formerly known as > pci_host_bridge_need_p2m_mapping), it is still going through the full > list of address ranges of the PCIe bridge and map each range one by one > using map_range_to_domain. Also, pci_host_bridge_mappings is still > called unconditionally at boot for Dom0. > > So I am missing the part where the aperture is actually *not* mapped to > Dom0. What's the difference between the loop in > pci_host_bridge_mappings: > > for ( i = 0; i < dt_number_of_address(dev); i++ ) > map_range_to_domain... > > and the previous code in map_range_to_domain? I think I am missing > something but as mentioned it is difficult to review the patch like this > out of order. > > Also, and this is minor, even if currently unused, it might be good to > keep a length parameter to pci_host_bridge_need_p2m_mapping. It looks like the filtering is done based on: return cfg->phys_addr != addr in pci_ecam_need_p2m_mapping that is expected to filter out the address ranges we don't want to map because it comes from xen/arch/arm/pci/pci-host-common.c:gen_pci_init: /* Parse our PCI ecam register address*/ err = dt_device_get_address(dev, ecam_reg_idx, &addr, &size); if ( err ) goto err_exit; In pci_host_bridge_mappings, instead of parsing device tree again, can't we just fetch the address and length we need to map straight from bridge->sysdata->phys_addr/size ? At the point when pci_host_bridge_mappings is called in your new patch, we have already initialized all the PCIe-related data structures. It seems a bit useless to have to go via device tree again.