All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/7] PCI devices passthrough on Arm, part 2
@ 2021-11-05  6:33 Oleksandr Andrushchenko
  2021-11-05  6:33 ` [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE Oleksandr Andrushchenko
                   ` (6 more replies)
  0 siblings, 7 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Hi, all!

This is an assorted series of patches which aim is to make some further
basis for PCI passthrough on Arm support. The series continues the work
published earlier by Arm [1] and adds new helpers and clears the way for
vPCI changes which will follow.

RFC is at [2], [3]. Design presentation can be found at [4].

Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/list/?series=558681
[2] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01184.html
[3] https://lists.xenproject.org/archives/html/xen-devel/2020-07/threads.html#01184
[4] https://static.sched.com/hosted_files/xen2021/e4/PCI_Device_Passthrough_On_Arm.pdf

Oleksandr Andrushchenko (7):
  xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE
  xen/arm: add pci-domain for disabled devices
  xen/arm: setup MMIO range trap handlers for hardware domain
  xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m
  xen/arm: do not map IRQs and memory for disabled devices
  xen/arm: process pending vPCI map/unmap operations
  xen/arm: do not use void pointer in pci_host_common_probe

 xen/arch/arm/domain.c               |  2 +
 xen/arch/arm/domain_build.c         | 83 ++++++++++++++++++----------
 xen/arch/arm/pci/ecam.c             | 18 +++++-
 xen/arch/arm/pci/pci-host-common.c  | 85 +++++++++++++++++++++++++++--
 xen/arch/arm/pci/pci-host-generic.c |  2 +-
 xen/arch/arm/pci/pci-host-zynqmp.c  |  3 +-
 xen/arch/arm/pci/pci.c              |  2 +-
 xen/arch/arm/traps.c                | 13 +++++
 xen/arch/arm/vpci.c                 | 66 +++++++++++++++++++---
 xen/arch/arm/vpci.h                 |  6 ++
 xen/arch/x86/hvm/hvm.c              |  6 ++
 xen/common/ioreq.c                  |  9 ---
 xen/include/asm-arm/device.h        |  2 +-
 xen/include/asm-arm/pci.h           | 28 +++++++++-
 xen/include/asm-arm/setup.h         | 13 +++++
 15 files changed, 278 insertions(+), 60 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-16 18:26   ` Julien Grall
  2021-11-05  6:33 ` [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices Oleksandr Andrushchenko
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

To better reflect the nature of the device type and not to make any
confusion rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE which it
really is.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Suggested-by: Julien Grall <julien@xen.org>

---
New in v6
---
 xen/arch/arm/pci/pci-host-generic.c | 2 +-
 xen/arch/arm/pci/pci-host-zynqmp.c  | 2 +-
 xen/arch/arm/pci/pci.c              | 2 +-
 xen/include/asm-arm/device.h        | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/pci/pci-host-generic.c b/xen/arch/arm/pci/pci-host-generic.c
index 33457fbe9615..46de6e43cc72 100644
--- a/xen/arch/arm/pci/pci-host-generic.c
+++ b/xen/arch/arm/pci/pci-host-generic.c
@@ -32,7 +32,7 @@ static int __init pci_host_generic_probe(struct dt_device_node *dev,
     return pci_host_common_probe(dev, &pci_generic_ecam_ops);
 }
 
-DT_DEVICE_START(pci_gen, "PCI HOST GENERIC", DEVICE_PCI)
+DT_DEVICE_START(pci_gen, "PCI HOST GENERIC", DEVICE_PCI_HOSTBRIDGE)
 .dt_match = gen_pci_dt_match,
 .init = pci_host_generic_probe,
 DT_DEVICE_END
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
index 61a9807d3d58..516982bca833 100644
--- a/xen/arch/arm/pci/pci-host-zynqmp.c
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -49,7 +49,7 @@ static int __init pci_host_generic_probe(struct dt_device_node *dev,
     return pci_host_common_probe(dev, &nwl_pcie_ops);
 }
 
-DT_DEVICE_START(pci_gen, "PCI HOST ZYNQMP", DEVICE_PCI)
+DT_DEVICE_START(pci_gen, "PCI HOST ZYNQMP", DEVICE_PCI_HOSTBRIDGE)
 .dt_match = nwl_pcie_dt_match,
 .init = pci_host_generic_probe,
 DT_DEVICE_END
diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index 082c14e127a8..78b97beaef12 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -46,7 +46,7 @@ static int __init dt_pci_init(void)
 
     dt_for_each_device_node(dt_host, np)
     {
-        rc = device_init(np, DEVICE_PCI, NULL);
+        rc = device_init(np, DEVICE_PCI_HOSTBRIDGE, NULL);
         /*
          * Ignore the following error codes:
          *   - EBADF: Indicate the current device is not a pci device.
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 3782660751b6..086dde13eb6b 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -37,7 +37,7 @@ enum device_class
     DEVICE_SERIAL,
     DEVICE_IOMMU,
     DEVICE_GIC,
-    DEVICE_PCI,
+    DEVICE_PCI_HOSTBRIDGE,
     /* Use for error */
     DEVICE_UNKNOWN,
 };
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
  2021-11-05  6:33 ` [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-16 18:48   ` Julien Grall
  2021-11-05  6:33 ` [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

If a PCI host bridge device is present in the device tree, but is
disabled, then its PCI host bridge driver was not instantiated.
This results in the failure of the pci_get_host_bridge_segment()
and the following panic during Xen start:

(XEN) Device tree generation failed (-22).
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Could not set up DOM0 guest OS
(XEN) ****************************************

Fix this by adding "linux,pci-domain" property for all device tree nodes
which have "pci" device type, so we know which segments will be used by
the guest for which bridges.

Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
New in v6
---
 xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
 xen/arch/arm/pci/pci-host-common.c |  2 +-
 xen/include/asm-arm/pci.h          |  8 ++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 491f5e2c316e..f7fcb1400c19 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
         {
             uint16_t segment;
 
+            /*
+             * The node doesn't have "linux,pci-domain" property and it is
+             * possible that:
+             *  - Xen only has drivers for a part of the host bridges
+             *  - some host bridges are disabled
+             * Make sure we insert the correct "linux,pci-domain" property
+             * in any case, so we know which segments will be used
+             * by Linux for which bridges.
+             */
             res = pci_get_host_bridge_segment(node, &segment);
             if ( res < 0 )
-                return res;
+            {
+                segment = pci_get_new_domain_nr();
+                printk(XENLOG_DEBUG "Assigned segment %d to %s\n",
+                       segment, node->full_name);
+            }
 
             res = fdt_property_cell(kinfo->fdt, "linux,pci-domain", segment);
             if ( res )
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index d8cbaaaba654..47104b22b221 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -137,7 +137,7 @@ void pci_add_host_bridge(struct pci_host_bridge *bridge)
     list_add_tail(&bridge->node, &pci_host_bridges);
 }
 
-static int pci_get_new_domain_nr(void)
+int pci_get_new_domain_nr(void)
 {
     return atomic_inc_return(&domain_nr);
 }
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 81273e0d87ac..c20eba643d86 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -108,6 +108,8 @@ static always_inline bool is_pci_passthrough_enabled(void)
 
 void arch_pci_init_pdev(struct pci_dev *pdev);
 
+int pci_get_new_domain_nr(void);
+
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
@@ -128,5 +130,11 @@ static inline int pci_get_host_bridge_segment(const struct dt_device_node *node,
     return -EINVAL;
 }
 
+static inline int pci_get_new_domain_nr(void)
+{
+    ASSERT_UNREACHABLE();
+    return -1;
+}
+
 #endif  /*!CONFIG_HAS_PCI*/
 #endif /* __ARM_PCI_H__ */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
  2021-11-05  6:33 ` [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE Oleksandr Andrushchenko
  2021-11-05  6:33 ` [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-09  9:20   ` Oleksandr Andrushchenko
  2021-11-16 19:12   ` Julien Grall
  2021-11-05  6:33 ` [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

In order for vPCI to work it needs to maintain guest and hardware
domain's views of the configuration space. For example, BARs and
COMMAND registers require emulation for guests and the guest view
of the registers needs to be in sync with the real contents of the
relevant registers. For that ECAM address space needs to also be
trapped for the hardware domain, so we need to implement PCI host
bridge specific callbacks to properly setup MMIO handlers for those
ranges depending on particular host bridge implementation.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
Since v5:
- add vpci_sbdf_from_gpa helper for gpa to SBDF translation
- take bridge's bus start into account while calculating SBDF
Since v4:
- unsigned int for functions working with count
- gate number of MMIO handlers needed for CONFIG_HAS_PCI_MSI
  and fix their number, e.g. single handler for PBA and
  MSI-X tables (Roger)
- re-work code for assigning MMIO handlers to be simpler
  and account on the fact that there could multiple host bridges
  exist for the hwdom
Since v3:
- fixed comment formatting
Since v2:
- removed unneeded assignment (count = 0)
- removed unneeded header inclusion
- update commit message
Since v1:
 - Dynamically calculate the number of MMIO handlers required for vPCI
   and update the total number accordingly
 - s/clb/cb
 - Do not introduce a new callback for MMIO handler setup
---
 xen/arch/arm/domain.c              |  2 +
 xen/arch/arm/pci/pci-host-common.c | 27 ++++++++++++
 xen/arch/arm/vpci.c                | 66 ++++++++++++++++++++++++++----
 xen/arch/arm/vpci.h                |  6 +++
 xen/include/asm-arm/pci.h          |  5 +++
 5 files changed, 98 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 96e1b235501d..92a6c509e5c5 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -739,6 +739,8 @@ int arch_domain_create(struct domain *d,
     if ( (rc = domain_vgic_register(d, &count)) != 0 )
         goto fail;
 
+    count += domain_vpci_get_num_mmio_handlers(d);
+
     if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
         goto fail;
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 47104b22b221..0d271a6e8881 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -289,6 +289,33 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
     return -EINVAL;
 }
 
+int pci_host_iterate_bridges(struct domain *d,
+                             int (*cb)(struct domain *d,
+                                       struct pci_host_bridge *bridge))
+{
+    struct pci_host_bridge *bridge;
+    int err;
+
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+    {
+        err = cb(d, bridge);
+        if ( err )
+            return err;
+    }
+    return 0;
+}
+
+unsigned int pci_host_get_num_bridges(void)
+{
+    struct pci_host_bridge *bridge;
+    unsigned int count = 0;
+
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+        count++;
+
+    return count;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index 23f45386f4b3..5a6ebd8b9868 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -16,16 +16,31 @@
 
 #include <asm/mmio.h>
 
+static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
+                                     paddr_t gpa)
+{
+    pci_sbdf_t sbdf;
+
+    if ( bridge )
+    {
+        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
+        sbdf.seg = bridge->segment;
+        sbdf.bus += bridge->cfg->busn_start;
+    }
+    else
+        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);
+
+    return sbdf;
+}
+
 static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
                           register_t *r, void *p)
 {
-    pci_sbdf_t sbdf;
+    struct pci_host_bridge *bridge = p;
+    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
     /* data is needed to prevent a pointer cast on 32bit */
     unsigned long data;
 
-    /* We ignore segment part and always handle segment 0 */
-    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
-
     if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
                         1U << info->dabt.size, &data) )
     {
@@ -41,10 +56,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
 static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
                            register_t r, void *p)
 {
-    pci_sbdf_t sbdf;
-
-    /* We ignore segment part and always handle segment 0 */
-    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
+    struct pci_host_bridge *bridge = p;
+    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
 
     return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
                            1U << info->dabt.size, r);
@@ -55,17 +68,54 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
     .write = vpci_mmio_write,
 };
 
+static int vpci_setup_mmio_handler_cb(struct domain *d,
+                                      struct pci_host_bridge *bridge)
+{
+    struct pci_config_window *cfg = bridge->cfg;
+
+    register_mmio_handler(d, &vpci_mmio_handler,
+                          cfg->phys_addr, cfg->size, bridge);
+    return 0;
+}
+
 int domain_vpci_init(struct domain *d)
 {
     if ( !has_vpci(d) )
         return 0;
 
+    if ( is_hardware_domain(d) )
+        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler_cb);
+
+    /* Guest domains use what is programmed in their device tree. */
     register_mmio_handler(d, &vpci_mmio_handler,
                           GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
 
     return 0;
 }
 
+unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    unsigned int count;
+
+    if ( is_hardware_domain(d) )
+        /* For each PCI host bridge's configuration space. */
+        count = pci_host_get_num_bridges();
+    else
+        /*
+         * There's a single MSI-X MMIO handler that deals with both PBA
+         * and MSI-X tables per each PCI device being passed through.
+         * Maximum number of supported devices is 32 as virtual bus
+         * topology emulates the devices as embedded endpoints.
+         * +1 for a single emulated host bridge's configuration space.
+         */
+        count = 1;
+#ifdef CONFIG_HAS_PCI_MSI
+        count += 32;
+#endif
+
+    return count;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
index d8a7b0e3e802..3c713f3fcdb5 100644
--- a/xen/arch/arm/vpci.h
+++ b/xen/arch/arm/vpci.h
@@ -17,11 +17,17 @@
 
 #ifdef CONFIG_HAS_VPCI
 int domain_vpci_init(struct domain *d);
+unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d);
 #else
 static inline int domain_vpci_init(struct domain *d)
 {
     return 0;
 }
+
+static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    return 0;
+}
 #endif
 
 #endif /* __ARCH_ARM_VPCI_H__ */
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index c20eba643d86..969333043431 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -110,6 +110,11 @@ void arch_pci_init_pdev(struct pci_dev *pdev);
 
 int pci_get_new_domain_nr(void);
 
+int pci_host_iterate_bridges(struct domain *d,
+                             int (*clb)(struct domain *d,
+                                        struct pci_host_bridge *bridge));
+unsigned int pci_host_get_num_bridges(void);
+
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (2 preceding siblings ...)
  2021-11-05  6:33 ` [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-23 16:42   ` Julien Grall
  2021-11-05  6:33 ` [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices Oleksandr Andrushchenko
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

PCI host bridges are special devices in terms of implementing PCI
passthrough. According to [1] the current implementation depends on
Domain-0 to perform the initialization of the relevant PCI host
bridge hardware and perform PCI device enumeration. In order to
achieve that one of the required changes is to not map all the memory
ranges in map_range_to_domain as we traverse the device tree on startup
and perform some additional checks if the range needs to be mapped to
Domain-0.

The generic PCI host controller device tree binding says [2]:
- ranges: As described in IEEE Std 1275-1994, but must provide
          at least a definition of non-prefetchable memory. One
          or both of prefetchable Memory and IO Space may also
          be provided.

- reg   : The Configuration Space base address and size, as accessed
          from the parent bus.  The base address corresponds to
          the first bus in the "bus-range" property.  If no
          "bus-range" is specified, this will be bus 0 (the default).

From the above none of the memory ranges from the "ranges" property
needs to be mapped to Domain-0 at startup as MMIO mapping is going to
be handled dynamically by vPCI as we assign PCI devices, e.g. each
device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
as needed by Xen.

The "reg" property covers not only ECAM space, but may also have other
then the configuration memory ranges described, for example [3]:
- reg: Should contain rc_dbi, config registers location and length.
- reg-names: Must include the following entries:
   "rc_dbi": controller configuration registers;
   "config": PCIe configuration space registers.

This patch makes it possible to not map all the ranges from the
"ranges" property and also ECAM from the "reg". All the rest from the
"reg" property still needs to be mapped to Domain-0, so the PCI
host bridge remains functional in Domain-0.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
[2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
Since v5:
- remove some need_mapping local variables
- use own_device in handle_device
- add __init for pci_ecam_need_p2m_hwdom_mapping
- make pci_host_bridge_mappings use p2m_mmio_direct_dev directly
Since v4:
- update skip_mapping comment
- add comment why we need to map interrupts to Dom0
Since v3:
 - pass struct map_range_data to map_dt_irq_to_domain
 - remove redundant check from map_range_to_domain
 - fix handle_device's .skip_mapping
Since v2:
 - removed check in map_range_to_domain for PCI_DEV
   and moved it to handle_device, so the code is
   simpler
 - s/map_pci_bridge/skip_mapping
 - extended comment in pci_host_bridge_mappings
 - minor code restructure in construct_dom0
 - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
   callbacks
 - unsigned int i; in pci_host_bridge_mappings
Since v1:
 - Added better description of why and what needs to be mapped into
   Domain-0's p2m and what doesn't
 - Do not do any mappings for PCI devices while traversing the DT
 - Walk all the bridges and make required mappings in one go
---
 xen/arch/arm/domain_build.c        | 67 +++++++++++++++++-------------
 xen/arch/arm/pci/ecam.c            | 14 +++++++
 xen/arch/arm/pci/pci-host-common.c | 50 ++++++++++++++++++++++
 xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
 xen/include/asm-arm/pci.h          | 10 +++++
 xen/include/asm-arm/setup.h        | 13 ++++++
 6 files changed, 126 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index f7fcb1400c19..c7d992456ca7 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -10,7 +10,6 @@
 #include <asm/regs.h>
 #include <xen/errno.h>
 #include <xen/err.h>
-#include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/guest_access.h>
 #include <xen/iocap.h>
@@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
 }
 custom_param("dom0_mem", parse_dom0_mem);
 
-struct map_range_data
-{
-    struct domain *d;
-    p2m_type_t p2mt;
-};
-
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef virt_to_mfn
 #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
@@ -1676,10 +1669,10 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
                                        const struct dt_irq *dt_irq,
                                        void *data)
 {
-    struct domain *d = data;
+    struct map_range_data *mr_data = data;
+    struct domain *d = mr_data->d;
     unsigned int irq = dt_irq->irq;
     int res;
-    bool need_mapping = !dt_device_for_passthrough(dev);
 
     if ( irq < NR_LOCAL_IRQS )
     {
@@ -1698,18 +1691,16 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
         return res;
     }
 
-    res = map_irq_to_domain(d, irq, need_mapping, dt_node_name(dev));
+    res = map_irq_to_domain(d, irq, !mr_data->skip_mapping, dt_node_name(dev));
 
     return 0;
 }
 
-static int __init map_range_to_domain(const struct dt_device_node *dev,
-                                      u64 addr, u64 len,
-                                      void *data)
+int __init map_range_to_domain(const struct dt_device_node *dev,
+                               u64 addr, u64 len, void *data)
 {
     struct map_range_data *mr_data = data;
     struct domain *d = mr_data->d;
-    bool need_mapping = !dt_device_for_passthrough(dev);
     int res;
 
     res = iomem_permit_access(d, paddr_to_pfn(addr),
@@ -1723,7 +1714,7 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
         return res;
     }
 
-    if ( need_mapping )
+    if ( !mr_data->skip_mapping )
     {
         res = map_regions_p2mt(d,
                                gaddr_to_gfn(addr),
@@ -1752,23 +1743,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
  * then we may need to perform additional mappings in order to make
  * the child resources available to domain 0.
  */
-static int __init map_device_children(struct domain *d,
-                                      const struct dt_device_node *dev,
-                                      p2m_type_t p2mt)
+static int __init map_device_children(const struct dt_device_node *dev,
+                                      struct map_range_data *mr_data)
 {
-    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
-    int ret;
-
     if ( dt_device_type_is_equal(dev, "pci") )
     {
+        int ret;
+
         dt_dprintk("Mapping children of %s to guest\n",
                    dt_node_full_name(dev));
 
-        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
+        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
         if ( ret < 0 )
             return ret;
 
-        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
+        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
         if ( ret < 0 )
             return ret;
     }
@@ -1848,14 +1837,28 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     unsigned int i;
     int res;
     u64 addr, size;
-    bool need_mapping = !dt_device_for_passthrough(dev);
+    bool own_device = !dt_device_for_passthrough(dev);
+    /*
+     * For PCI passthrough we only need to remap to Dom0 the interrupts
+     * and memory ranges from "reg" property which cover controller's
+     * configuration registers and such. PCIe configuration space registers
+     * of the PCIe Root Complex and PCIe aperture should not be mapped
+     * automatically to Dom0.
+     */
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .skip_mapping = !own_device ||
+                        (is_pci_passthrough_enabled() &&
+                        (device_get_class(dev) == DEVICE_PCI_HOSTBRIDGE))
+    };
 
     naddr = dt_number_of_address(dev);
 
     dt_dprintk("%s passthrough = %d naddr = %u\n",
-               dt_node_full_name(dev), need_mapping, naddr);
+               dt_node_full_name(dev), own_device, naddr);
 
-    if ( need_mapping )
+    if ( own_device )
     {
         dt_dprintk("Check if %s is behind the IOMMU and add it\n",
                    dt_node_full_name(dev));
@@ -1881,14 +1884,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
         }
     }
 
-    res = handle_device_interrupts(d, dev, need_mapping);
+    res = handle_device_interrupts(d, dev, own_device);
     if ( res < 0 )
         return res;
 
     /* Give permission and map MMIOs */
     for ( i = 0; i < naddr; i++ )
     {
-        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
         res = dt_device_get_address(dev, i, &addr, &size);
         if ( res )
         {
@@ -1902,7 +1904,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
             return res;
     }
 
-    res = map_device_children(d, dev, p2mt);
+    res = map_device_children(dev, &mr_data);
     if ( res )
         return res;
 
@@ -3060,7 +3062,14 @@ static int __init construct_dom0(struct domain *d)
         return rc;
 
     if ( acpi_disabled )
+    {
         rc = prepare_dtb_hwdom(d, &kinfo);
+        if ( rc < 0 )
+            return rc;
+#ifdef CONFIG_HAS_PCI
+        rc = pci_host_bridge_mappings(d);
+#endif
+    }
     else
         rc = prepare_acpi(d, &kinfo);
 
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index 602d00799c8d..4f71b11c3057 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
     return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
 }
 
+bool __init pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
+                                            struct pci_host_bridge *bridge,
+                                            uint64_t addr)
+{
+    struct pci_config_window *cfg = bridge->cfg;
+
+    /*
+     * We do not want ECAM address space to be mapped in Domain-0's p2m,
+     * so we can trap access to it.
+     */
+    return cfg->phys_addr != addr;
+}
+
 /* ECAM ops */
 const struct pci_ecam_ops pci_generic_ecam_ops = {
     .bus_shift  = 20,
@@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
+        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
     }
 };
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 0d271a6e8881..6af845ab9d6c 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -22,6 +22,8 @@
 #include <xen/sched.h>
 #include <xen/vmap.h>
 
+#include <asm/setup.h>
+
 /*
  * List for all the pci host bridges.
  */
@@ -316,6 +318,54 @@ unsigned int pci_host_get_num_bridges(void)
     return count;
 }
 
+int __init pci_host_bridge_mappings(struct domain *d)
+{
+    struct pci_host_bridge *bridge;
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2m_mmio_direct_dev,
+        .skip_mapping = false
+    };
+
+    /*
+     * For each PCI host bridge we need to only map those ranges
+     * which are used by Domain-0 to properly initialize the bridge,
+     * e.g. we do not want to map ECAM configuration space which lives in
+     * "reg" device tree property, but we want to map other regions of
+     * the host bridge. The PCI aperture defined by the "ranges" device
+     * tree property should also be skipped.
+     */
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+    {
+        const struct dt_device_node *dev = bridge->dt_node;
+        unsigned int i;
+
+        for ( i = 0; i < dt_number_of_address(dev); i++ )
+        {
+            uint64_t addr, size;
+            int err;
+
+            err = dt_device_get_address(dev, i, &addr, &size);
+            if ( err )
+            {
+                printk(XENLOG_ERR
+                       "Unable to retrieve address range index=%u for %s\n",
+                       i, dt_node_full_name(dev));
+                return err;
+            }
+
+            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
+            {
+                err = map_range_to_domain(dev, addr, size, &mr_data);
+                if ( err )
+                    return err;
+            }
+        }
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
index 516982bca833..101edb8593c1 100644
--- a/xen/arch/arm/pci/pci-host-zynqmp.c
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
+        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
     }
 };
 
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 969333043431..3d706fdd1d88 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -17,6 +17,8 @@
 
 #ifdef CONFIG_HAS_PCI
 
+#include <asm/p2m.h>
+
 #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
 
 extern bool pci_passthrough_enabled;
@@ -73,6 +75,9 @@ struct pci_ops {
                 uint32_t reg, uint32_t len, uint32_t *value);
     int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
                  uint32_t reg, uint32_t len, uint32_t value);
+    bool (*need_p2m_hwdom_mapping)(struct domain *d,
+                                   struct pci_host_bridge *bridge,
+                                   uint64_t addr);
 };
 
 /*
@@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
                              uint32_t reg, uint32_t len, uint32_t value);
 void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
                                pci_sbdf_t sbdf, uint32_t where);
+bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
+                                     struct pci_host_bridge *bridge,
+                                     uint64_t addr);
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
 int pci_get_host_bridge_segment(const struct dt_device_node *node,
@@ -115,6 +123,8 @@ int pci_host_iterate_bridges(struct domain *d,
                                         struct pci_host_bridge *bridge));
 unsigned int pci_host_get_num_bridges(void);
 
+int pci_host_bridge_mappings(struct domain *d);
+
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index 95da0b7ab9cd..88d9673db817 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -2,6 +2,8 @@
 #define __ARM_SETUP_H_
 
 #include <public/version.h>
+#include <asm/p2m.h>
+#include <xen/device_tree.h>
 
 #define MIN_FDT_ALIGN 8
 #define MAX_FDT_SIZE SZ_2M
@@ -77,6 +79,14 @@ struct bootinfo {
 #endif
 };
 
+struct map_range_data
+{
+    struct domain *d;
+    p2m_type_t p2mt;
+    /* Set if mapping of the memory ranges must be skipped. */
+    bool skip_mapping;
+};
+
 extern struct bootinfo bootinfo;
 
 extern domid_t max_init_domid;
@@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
 u32 device_tree_get_u32(const void *fdt, int node,
                         const char *prop_name, u32 dflt);
 
+int map_range_to_domain(const struct dt_device_node *dev,
+                        u64 addr, u64 len, void *data);
+
 #endif
 /*
  * Local variables:
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (3 preceding siblings ...)
  2021-11-05  6:33 ` [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-16 19:22   ` Julien Grall
  2021-11-05  6:33 ` [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations Oleksandr Andrushchenko
  2021-11-05  6:33 ` [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe Oleksandr Andrushchenko
  6 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Currently Xen maps all IRQs and memory ranges for all devices except
those marked for passthrough, e.g. it doesn't pay attention to the
"status" property of the node.

According to the device tree specification [1]:
 - "okay"     Indicates the device is operational.
 - "disabled" Indicates that the device is not presently operational,
              but it might become operational in the future (for example,
	      something is not plugged in, or switched off).
	      Refer to the device binding for details on what disabled means
	      for a given device.

So, "disabled" status is device dependent and mapping should be taken by
case-by-case approach with that respect. Although in general Xen should map
IRQs and memory ranges as the disabled devices might become operational it
makes it impossible for the other devices, which are not operational in
any case, to skip the mappings.

This patch disables mapping for the devices with status = "disabled".

[1] https://github.com/devicetree-org/devicetree-specification/releases/download/v0.3/devicetree-specification-v0.3.pdf

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
New in v6
---
 xen/arch/arm/domain_build.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c7d992456ca7..d3a4c0a173b8 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1837,7 +1837,8 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     unsigned int i;
     int res;
     u64 addr, size;
-    bool own_device = !dt_device_for_passthrough(dev);
+    bool own_device = !dt_device_for_passthrough(dev) &&
+                      dt_device_is_available(dev);
     /*
      * For PCI passthrough we only need to remap to Dom0 the interrupts
      * and memory ranges from "reg" property which cover controller's
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (4 preceding siblings ...)
  2021-11-05  6:33 ` [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-05  7:40   ` Jan Beulich
  2021-11-17 21:26   ` Julien Grall
  2021-11-05  6:33 ` [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe Oleksandr Andrushchenko
  6 siblings, 2 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

vPCI may map and unmap PCI device memory (BARs) being passed through which
may take a lot of time. For this those operations may be deferred to be
performed later, so that they can be safely preempted.

Currently this deferred processing is happening in common IOREQ code
which doesn't seem to be the right place for x86 and is even more
doubtful because IOREQ may not be enabled for Arm at all.
So, for Arm the pending vPCI work may have no chance to be executed
if the processing is left as is in the common IOREQ code only.
For that reason make vPCI processing happen in arch specific code.

Please be aware that there are a few outstanding TODOs affecting this
code path, see xen/drivers/vpci/header.c:map_range and
xen/drivers/vpci/header.c:vpci_process_pending.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Paul Durrant <paul@xen.org>

Since v5:
 - check_for_vcpu_work: vPCI addition is moved before the
   vcpu_ioreq__handle_completion(v). This is to avoid differences
   with the x86 version. (Julien)
Since v2:
 - update commit message with more insight on x86, IOREQ and Arm
 - restored order of invocation for IOREQ and vPCI processing (Jan)
Since v1:
 - Moved the check for pending vpci work from the common IOREQ code
   to hvm_do_resume on x86
 - Re-worked the code for Arm to ensure we don't miss pending vPCI work
---
 xen/arch/arm/traps.c   | 13 +++++++++++++
 xen/arch/x86/hvm/hvm.c |  6 ++++++
 xen/common/ioreq.c     |  9 ---------
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 219ab3c3fbde..8757210a798b 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -34,6 +34,7 @@
 #include <xen/symbols.h>
 #include <xen/version.h>
 #include <xen/virtual_region.h>
+#include <xen/vpci.h>
 
 #include <public/sched.h>
 #include <public/xen.h>
@@ -2290,6 +2291,18 @@ static bool check_for_vcpu_work(void)
 {
     struct vcpu *v = current;
 
+    if ( has_vpci(v->domain) )
+    {
+        bool pending;
+
+        local_irq_enable();
+        pending = vpci_process_pending(v);
+        local_irq_disable();
+
+        if ( pending )
+            return true;
+    }
+
 #ifdef CONFIG_IOREQ_SERVER
     if ( domain_has_ioreq_server(v->domain) )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index eee365711d63..096a61b7ea02 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -546,6 +546,12 @@ void hvm_do_resume(struct vcpu *v)
 
     pt_restore_timer(v);
 
+    if ( has_vpci(v->domain) && vpci_process_pending(v) )
+    {
+        raise_softirq(SCHEDULE_SOFTIRQ);
+        return;
+    }
+
     if ( !vcpu_ioreq_handle_completion(v) )
         return;
 
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index d732dc045df9..689d256544c8 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -25,9 +25,7 @@
 #include <xen/lib.h>
 #include <xen/paging.h>
 #include <xen/sched.h>
-#include <xen/softirq.h>
 #include <xen/trace.h>
-#include <xen/vpci.h>
 
 #include <asm/guest_atomics.h>
 #include <asm/ioreq.h>
@@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
 
 bool vcpu_ioreq_handle_completion(struct vcpu *v)
 {
-    struct domain *d = v->domain;
     struct vcpu_io *vio = &v->io;
     struct ioreq_server *s;
     struct ioreq_vcpu *sv;
     enum vio_completion completion;
     bool res = true;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-    {
-        raise_softirq(SCHEDULE_SOFTIRQ);
-        return false;
-    }
-
     while ( (sv = get_pending_vcpu(v, &s)) != NULL )
         if ( !wait_for_io(sv, get_ioreq(s, v)) )
             return false;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe
  2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (5 preceding siblings ...)
  2021-11-05  6:33 ` [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations Oleksandr Andrushchenko
@ 2021-11-05  6:33 ` Oleksandr Andrushchenko
  2021-11-17 11:12   ` Rahul Singh
  2021-11-17 21:45   ` Julien Grall
  6 siblings, 2 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-05  6:33 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

There is no reason to use void pointer while passing ECAM ops to the
pci_host_common_probe function as it is anyway casted to struct pci_ecam_ops
inside. For that reason remove the void pointer and pass struct pci_ecam_ops
pointer as is.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
New in v4
---
 xen/arch/arm/pci/ecam.c            | 4 ++--
 xen/arch/arm/pci/pci-host-common.c | 6 ++----
 xen/include/asm-arm/pci.h          | 5 +++--
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index 4f71b11c3057..6aeea12a68bf 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -24,8 +24,8 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
                                pci_sbdf_t sbdf, uint32_t where)
 {
     const struct pci_config_window *cfg = bridge->cfg;
-    struct pci_ecam_ops *ops =
-        container_of(bridge->ops, struct pci_ecam_ops, pci_ops);
+    const struct pci_ecam_ops *ops =
+        container_of(bridge->ops, const struct pci_ecam_ops, pci_ops);
     unsigned int devfn_shift = ops->bus_shift - 8;
     void __iomem *base;
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 6af845ab9d6c..1aad664b213e 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -194,15 +194,13 @@ static int pci_bus_find_domain_nr(struct dt_device_node *dev)
     return domain;
 }
 
-int pci_host_common_probe(struct dt_device_node *dev, const void *data)
+int pci_host_common_probe(struct dt_device_node *dev,
+                          const struct pci_ecam_ops *ops)
 {
     struct pci_host_bridge *bridge;
     struct pci_config_window *cfg;
-    struct pci_ecam_ops *ops;
     int err;
 
-    ops = (struct pci_ecam_ops *) data;
-
     bridge = pci_alloc_host_bridge();
     if ( !bridge )
         return -ENOMEM;
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 3d706fdd1d88..4199e0267d24 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -65,7 +65,7 @@ struct pci_host_bridge {
     struct list_head node;           /* Node in list of host bridges */
     uint16_t segment;                /* Segment number */
     struct pci_config_window* cfg;   /* Pointer to the bridge config window */
-    struct pci_ops *ops;
+    const struct pci_ops *ops;
 };
 
 struct pci_ops {
@@ -94,7 +94,8 @@ struct pci_ecam_ops {
 /* Default ECAM ops */
 extern const struct pci_ecam_ops pci_generic_ecam_ops;
 
-int pci_host_common_probe(struct dt_device_node *dev, const void *data);
+int pci_host_common_probe(struct dt_device_node *dev,
+                          const struct pci_ecam_ops *ops);
 int pci_generic_config_read(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
                             uint32_t reg, uint32_t len, uint32_t *value);
 int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations
  2021-11-05  6:33 ` [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations Oleksandr Andrushchenko
@ 2021-11-05  7:40   ` Jan Beulich
  2021-11-17 21:26   ` Julien Grall
  1 sibling, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2021-11-05  7:40 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, andrew.cooper3, george.dunlap, paul,
	bertrand.marquis, rahul.singh, Oleksandr Andrushchenko,
	xen-devel

On 05.11.2021 07:33, Oleksandr Andrushchenko wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -546,6 +546,12 @@ void hvm_do_resume(struct vcpu *v)
>  
>      pt_restore_timer(v);
>  
> +    if ( has_vpci(v->domain) && vpci_process_pending(v) )
> +    {
> +        raise_softirq(SCHEDULE_SOFTIRQ);
> +        return;
> +    }
> +
>      if ( !vcpu_ioreq_handle_completion(v) )
>          return;
>  

For this part of the change:
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-05  6:33 ` [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-11-09  9:20   ` Oleksandr Andrushchenko
  2021-11-16 19:12   ` Julien Grall
  1 sibling, 0 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-09  9:20 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko



On 05.11.21 08:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> [snip]
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*cb)(struct domain *d,
> +                                       struct pci_host_bridge *bridge))
> +{
> +    struct pci_host_bridge *bridge;
> +    int err;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        err = cb(d, bridge);
> +        if ( err )
> +            return err;
> +    }
> +    return 0;
> +}
> +
> +unsigned int pci_host_get_num_bridges(void)
> +{
> +    struct pci_host_bridge *bridge;
> +    unsigned int count = 0;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +        count++;
> +
> +    return count;
> +}
> +
>
This can be even simpler if pci_host_iterate_bridges returns the count:

-int pci_host_iterate_bridges(struct domain *d,
-                             int (*cb)(struct domain *d,
-                                       struct pci_host_bridge *bridge))
+int pci_host_iterate_bridges_and_count(struct domain *d,
+                                       int (*cb)(struct domain *d,
+                                                 struct pci_host_bridge *bridge))
  {
      struct pci_host_bridge *bridge;
-    int err;
+    int err, count = 0;

      list_for_each_entry( bridge, &pci_host_bridges, node )
      {
          err = cb(d, bridge);
          if ( err )
              return err;
+        count += err;
      }
-    return 0;
-}

Then pci_host_get_num_bridges goes away and we can count
different entities with the same iterator and a simple callback.
This becomes possible as there is a single user for pci_host_iterate_bridges
now which sets up MMIOs, so the change above seems to be reasonable
I will include this change in v7

Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE
  2021-11-05  6:33 ` [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE Oleksandr Andrushchenko
@ 2021-11-16 18:26   ` Julien Grall
  0 siblings, 0 replies; 43+ messages in thread
From: Julien Grall @ 2021-11-16 18:26 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> To better reflect the nature of the device type and not to make any
> confusion rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE which it
> really is.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Suggested-by: Julien Grall <julien@xen.org>

We usually put the Suggested-by tag before the Signed-off-by. Other than 
that:

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-05  6:33 ` [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices Oleksandr Andrushchenko
@ 2021-11-16 18:48   ` Julien Grall
  2021-11-17  6:56     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-16 18:48 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksander,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> If a PCI host bridge device is present in the device tree, but is
> disabled, then its PCI host bridge driver was not instantiated.
> This results in the failure of the pci_get_host_bridge_segment()
> and the following panic during Xen start:
> 
> (XEN) Device tree generation failed (-22).
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Could not set up DOM0 guest OS
> (XEN) ****************************************
> 
> Fix this by adding "linux,pci-domain" property for all device tree nodes
> which have "pci" device type, so we know which segments will be used by
> the guest for which bridges.
> 
> Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> ---
> New in v6
> ---
>   xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
>   xen/arch/arm/pci/pci-host-common.c |  2 +-
>   xen/include/asm-arm/pci.h          |  8 ++++++++
>   3 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 491f5e2c316e..f7fcb1400c19 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>           {
>               uint16_t segment;
>   
> +            /*
> +             * The node doesn't have "linux,pci-domain" property and it is
> +             * possible that:
> +             *  - Xen only has drivers for a part of the host bridges
> +             *  - some host bridges are disabled
> +             * Make sure we insert the correct "linux,pci-domain" property
> +             * in any case, so we know which segments will be used
> +             * by Linux for which bridges.

The check above will check the node type is "pci". AFAICT, this would 
also cover PCI devices. I am not aware of any issue to add 
"linux,pci-domain" for them. However, this feels a bit odd.

 From my understanding, a PCI device would always be described as a 
child of the hostbridges. So I would rework the 'if' to also check if 
the parent type is not "pci".

> +             */
>               res = pci_get_host_bridge_segment(node, &segment);
>               if ( res < 0 )
> -                return res;
> +            {
> +                segment = pci_get_new_domain_nr();
> +                printk(XENLOG_DEBUG "Assigned segment %d to %s\n",
> +                       segment, node->full_name);
> +            }
>   
>               res = fdt_property_cell(kinfo->fdt, "linux,pci-domain", segment);
>               if ( res )
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index d8cbaaaba654..47104b22b221 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -137,7 +137,7 @@ void pci_add_host_bridge(struct pci_host_bridge *bridge)
>       list_add_tail(&bridge->node, &pci_host_bridges);
>   }
>   
> -static int pci_get_new_domain_nr(void)
> +int pci_get_new_domain_nr(void)
>   {
>       return atomic_inc_return(&domain_nr);

We may have a DT where only the nodes used by Xen have 
"linux,pci-domain". In this case, we would end up to return 0, 1... 
which may have already been used.

This will probably make Linux unhappy. So I would return -1 here if 
use_dt_domains == 1. The caller would also need to bail out if -1 is 
returned.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-05  6:33 ` [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
  2021-11-09  9:20   ` Oleksandr Andrushchenko
@ 2021-11-16 19:12   ` Julien Grall
  2021-11-18  7:27     ` Oleksandr Andrushchenko
  1 sibling, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-16 19:12 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> In order for vPCI to work it needs to maintain guest and hardware
> domain's views of the configuration space. For example, BARs and
> COMMAND registers require emulation for guests and the guest view
> of the registers needs to be in sync with the real contents of the
> relevant registers. For that ECAM address space needs to also be
> trapped for the hardware domain, so we need to implement PCI host
> bridge specific callbacks to properly setup MMIO handlers for those
> ranges depending on particular host bridge implementation.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> Since v5:
> - add vpci_sbdf_from_gpa helper for gpa to SBDF translation
> - take bridge's bus start into account while calculating SBDF
> Since v4:
> - unsigned int for functions working with count
> - gate number of MMIO handlers needed for CONFIG_HAS_PCI_MSI
>    and fix their number, e.g. single handler for PBA and
>    MSI-X tables (Roger)
> - re-work code for assigning MMIO handlers to be simpler
>    and account on the fact that there could multiple host bridges
>    exist for the hwdom
> Since v3:
> - fixed comment formatting
> Since v2:
> - removed unneeded assignment (count = 0)
> - removed unneeded header inclusion
> - update commit message
> Since v1:
>   - Dynamically calculate the number of MMIO handlers required for vPCI
>     and update the total number accordingly
>   - s/clb/cb
>   - Do not introduce a new callback for MMIO handler setup
> ---
>   xen/arch/arm/domain.c              |  2 +
>   xen/arch/arm/pci/pci-host-common.c | 27 ++++++++++++
>   xen/arch/arm/vpci.c                | 66 ++++++++++++++++++++++++++----
>   xen/arch/arm/vpci.h                |  6 +++
>   xen/include/asm-arm/pci.h          |  5 +++
>   5 files changed, 98 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 96e1b235501d..92a6c509e5c5 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -739,6 +739,8 @@ int arch_domain_create(struct domain *d,
>       if ( (rc = domain_vgic_register(d, &count)) != 0 )
>           goto fail;
>   
> +    count += domain_vpci_get_num_mmio_handlers(d);
> +
>       if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>           goto fail;
>   
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 47104b22b221..0d271a6e8881 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -289,6 +289,33 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>       return -EINVAL;
>   }
>   
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*cb)(struct domain *d,
> +                                       struct pci_host_bridge *bridge))
> +{
> +    struct pci_host_bridge *bridge;
> +    int err;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        err = cb(d, bridge);
> +        if ( err )
> +            return err;
> +    }
> +    return 0;
> +}
> +
> +unsigned int pci_host_get_num_bridges(void)
> +{
> +    struct pci_host_bridge *bridge;
> +    unsigned int count = 0;

How about making this static and...

> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +        count++;

... only call list_for_each_entry() when count is -1? So we would only 
go through the list once.

This should be fine given hostbridge can only be added during boot (we 
would need to protect pci_host_bridges with a lock otherwise).

> +
> +    return count;
> +}
> +
>   /*
>    * Local variables:
>    * mode: C
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> index 23f45386f4b3..5a6ebd8b9868 100644
> --- a/xen/arch/arm/vpci.c
> +++ b/xen/arch/arm/vpci.c
> @@ -16,16 +16,31 @@
>   
>   #include <asm/mmio.h>
>   
> +static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
> +                                     paddr_t gpa)
> +{
> +    pci_sbdf_t sbdf;
> +
> +    if ( bridge )
> +    {
> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
> +        sbdf.seg = bridge->segment;
> +        sbdf.bus += bridge->cfg->busn_start;
> +    }
> +    else
> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);
> +
> +    return sbdf;
> +}
> +
>   static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>                             register_t *r, void *p)
>   {
> -    pci_sbdf_t sbdf;
> +    struct pci_host_bridge *bridge = p;
> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>       /* data is needed to prevent a pointer cast on 32bit */
>       unsigned long data;
>   
> -    /* We ignore segment part and always handle segment 0 */
> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
> -
>       if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>                           1U << info->dabt.size, &data) )
>       {
> @@ -41,10 +56,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>   static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>                              register_t r, void *p)
>   {
> -    pci_sbdf_t sbdf;
> -
> -    /* We ignore segment part and always handle segment 0 */
> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
> +    struct pci_host_bridge *bridge = p;
> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>   
>       return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>                              1U << info->dabt.size, r);
> @@ -55,17 +68,54 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>       .write = vpci_mmio_write,
>   };
>   
> +static int vpci_setup_mmio_handler_cb(struct domain *d,
> +                                      struct pci_host_bridge *bridge)
> +{
> +    struct pci_config_window *cfg = bridge->cfg;
> +
> +    register_mmio_handler(d, &vpci_mmio_handler,
> +                          cfg->phys_addr, cfg->size, bridge);
> +    return 0;
> +}
> +
>   int domain_vpci_init(struct domain *d)
>   {
>       if ( !has_vpci(d) )
>           return 0;
>   
> +    if ( is_hardware_domain(d) )
> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler_cb);
> +
> +    /* Guest domains use what is programmed in their device tree. */

I would rather not make the assumption that the guest is using a 
Device-Tree. So how about:

/*
  * The hardware domain gets one virtual hostbridge by "real"
  * hostbridges.
  * Guests get the virtual platform layout (one virtual host bridge for
  * now).
  */

The comment would have to be moved before if ( is_hardware_domain(d) ).

>       register_mmio_handler(d, &vpci_mmio_handler,
>                             GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>   
>       return 0;
>   }
>   
> +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)

AFAICT, this function would also be called even if vPCI is not enabled 
for the domain. So we should add:

if ( !has_vpci(d) )
   return 0;

> +{
> +    unsigned int count;
> +
> +    if ( is_hardware_domain(d) )
> +        /* For each PCI host bridge's configuration space. */
> +        count = pci_host_get_num_bridges();

This first part makes sense to me. But...

> +    else

... I don't understand how the else is related to this commit. Can you 
clarify it?

> +        /*
> +         * There's a single MSI-X MMIO handler that deals with both PBA
> +         * and MSI-X tables per each PCI device being passed through.
> +         * Maximum number of supported devices is 32 as virtual bus
> +         * topology emulates the devices as embedded endpoints.
> +         * +1 for a single emulated host bridge's configuration space.
> +         */
> +        count = 1;
> +#ifdef CONFIG_HAS_PCI_MSI
> +        count += 32;

Surely, this is a decision that is based on other factor in the vPCI 
code. So can use a define and avoid hardcoding the number?

> +#endif


> +
> +    return count;
> +}
> +
>   /*
>    * Local variables:
>    * mode: C
> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
> index d8a7b0e3e802..3c713f3fcdb5 100644
> --- a/xen/arch/arm/vpci.h
> +++ b/xen/arch/arm/vpci.h
> @@ -17,11 +17,17 @@
>   
>   #ifdef CONFIG_HAS_VPCI
>   int domain_vpci_init(struct domain *d);
> +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d);
>   #else
>   static inline int domain_vpci_init(struct domain *d)
>   {
>       return 0;
>   }
> +
> +static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
> +{
> +    return 0;
> +}
>   #endif
>   
>   #endif /* __ARCH_ARM_VPCI_H__ */
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index c20eba643d86..969333043431 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -110,6 +110,11 @@ void arch_pci_init_pdev(struct pci_dev *pdev);
>   
>   int pci_get_new_domain_nr(void);
>   
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*clb)(struct domain *d,

NIT: This is more common to call a callback 'cb'. In any case, I would 
prefer if the names matches the one used in the implementation.

> +                                        struct pci_host_bridge *bridge));
> +unsigned int pci_host_get_num_bridges(void);
> +
>   #else   /*!CONFIG_HAS_PCI*/
>   
>   struct arch_pci_dev { };
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
  2021-11-05  6:33 ` [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices Oleksandr Andrushchenko
@ 2021-11-16 19:22   ` Julien Grall
  2021-11-18  6:59     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-16 19:22 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Currently Xen maps all IRQs and memory ranges for all devices except
> those marked for passthrough, e.g. it doesn't pay attention to the
> "status" property of the node.
> 
> According to the device tree specification [1]:
>   - "okay"     Indicates the device is operational.
>   - "disabled" Indicates that the device is not presently operational,
>                but it might become operational in the future (for example,
> 	      something is not plugged in, or switched off).
> 	      Refer to the device binding for details on what disabled means
> 	      for a given device.
> 
> So, "disabled" status is device dependent and mapping should be taken by
> case-by-case approach with that respect. Although in general Xen should map
> IRQs and memory ranges as the disabled devices might become operational

Right, this change effectively prevent dom0 to use such device if it 
becomes operational in the future. So this sounds like a big regression.

How do you plan to handle it?

> it
> makes it impossible for the other devices, which are not operational in
> any case, to skip the mappings.

You wrote "makes it impossible for the other devices", but it is not 
clear to me what's would go wrong when we map a disabled device (Dom0 
should not touch it). Do you have an example?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-16 18:48   ` Julien Grall
@ 2021-11-17  6:56     ` Oleksandr Andrushchenko
  2021-11-17 21:33       ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-17  6:56 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 16.11.21 20:48, Julien Grall wrote:
> Hi Oleksander,
>
> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> If a PCI host bridge device is present in the device tree, but is
>> disabled, then its PCI host bridge driver was not instantiated.
>> This results in the failure of the pci_get_host_bridge_segment()
>> and the following panic during Xen start:
>>
>> (XEN) Device tree generation failed (-22).
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Could not set up DOM0 guest OS
>> (XEN) ****************************************
>>
>> Fix this by adding "linux,pci-domain" property for all device tree nodes
>> which have "pci" device type, so we know which segments will be used by
>> the guest for which bridges.
>>
>> Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> ---
>> New in v6
>> ---
>>   xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
>>   xen/arch/arm/pci/pci-host-common.c |  2 +-
>>   xen/include/asm-arm/pci.h          |  8 ++++++++
>>   3 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 491f5e2c316e..f7fcb1400c19 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>>           {
>>               uint16_t segment;
>>   +            /*
>> +             * The node doesn't have "linux,pci-domain" property and it is
>> +             * possible that:
>> +             *  - Xen only has drivers for a part of the host bridges
>> +             *  - some host bridges are disabled
>> +             * Make sure we insert the correct "linux,pci-domain" property
>> +             * in any case, so we know which segments will be used
>> +             * by Linux for which bridges.
>
> The check above will check the node type is "pci". AFAICT, this would also cover PCI devices. I am not aware of any issue to add "linux,pci-domain" for them. However, this feels a bit odd.
>
> From my understanding, a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci".
>
We may have "bridge -> bridge -> device" topology as well.
So, I prefer to have the check as it is.
>> +             */
>>               res = pci_get_host_bridge_segment(node, &segment);
>>               if ( res < 0 )
>> -                return res;
>> +            {
>> +                segment = pci_get_new_domain_nr();
>> +                printk(XENLOG_DEBUG "Assigned segment %d to %s\n",
>> +                       segment, node->full_name);
>> +            }
>>                 res = fdt_property_cell(kinfo->fdt, "linux,pci-domain", segment);
>>               if ( res )
>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index d8cbaaaba654..47104b22b221 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -137,7 +137,7 @@ void pci_add_host_bridge(struct pci_host_bridge *bridge)
>>       list_add_tail(&bridge->node, &pci_host_bridges);
>>   }
>>   -static int pci_get_new_domain_nr(void)
>> +int pci_get_new_domain_nr(void)
>>   {
>>       return atomic_inc_return(&domain_nr);
>
> We may have a DT where only the nodes used by Xen have "linux,pci-domain". In this case, we would end up to return 0, 1... which may have already been used.
>
> This will probably make Linux unhappy. So I would return -1 here if use_dt_domains == 1. The caller would also need to bail out if -1 is returned.
Yes, this sounds reasonable, I will add this change and print an error message so it is
easier to understand what Xen doesn't like (it took me a while to debug and understand
why I have "(XEN) Device tree generation failed (-22).")
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe
  2021-11-05  6:33 ` [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe Oleksandr Andrushchenko
@ 2021-11-17 11:12   ` Rahul Singh
  2021-11-17 21:45   ` Julien Grall
  1 sibling, 0 replies; 43+ messages in thread
From: Rahul Singh @ 2021-11-17 11:12 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 5 Nov 2021, at 6:33 am, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> There is no reason to use void pointer while passing ECAM ops to the
> pci_host_common_probe function as it is anyway casted to struct pci_ecam_ops
> inside. For that reason remove the void pointer and pass struct pci_ecam_ops
> pointer as is.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> 
> ---
> New in v4
> ---
> xen/arch/arm/pci/ecam.c            | 4 ++--
> xen/arch/arm/pci/pci-host-common.c | 6 ++----
> xen/include/asm-arm/pci.h          | 5 +++--
> 3 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 4f71b11c3057..6aeea12a68bf 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -24,8 +24,8 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>                                pci_sbdf_t sbdf, uint32_t where)
> {
>     const struct pci_config_window *cfg = bridge->cfg;
> -    struct pci_ecam_ops *ops =
> -        container_of(bridge->ops, struct pci_ecam_ops, pci_ops);
> +    const struct pci_ecam_ops *ops =
> +        container_of(bridge->ops, const struct pci_ecam_ops, pci_ops);
>     unsigned int devfn_shift = ops->bus_shift - 8;
>     void __iomem *base;
> 
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 6af845ab9d6c..1aad664b213e 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -194,15 +194,13 @@ static int pci_bus_find_domain_nr(struct dt_device_node *dev)
>     return domain;
> }
> 
> -int pci_host_common_probe(struct dt_device_node *dev, const void *data)
> +int pci_host_common_probe(struct dt_device_node *dev,
> +                          const struct pci_ecam_ops *ops)
> {
>     struct pci_host_bridge *bridge;
>     struct pci_config_window *cfg;
> -    struct pci_ecam_ops *ops;
>     int err;
> 
> -    ops = (struct pci_ecam_ops *) data;
> -
>     bridge = pci_alloc_host_bridge();
>     if ( !bridge )
>         return -ENOMEM;
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 3d706fdd1d88..4199e0267d24 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -65,7 +65,7 @@ struct pci_host_bridge {
>     struct list_head node;           /* Node in list of host bridges */
>     uint16_t segment;                /* Segment number */
>     struct pci_config_window* cfg;   /* Pointer to the bridge config window */
> -    struct pci_ops *ops;
> +    const struct pci_ops *ops;
> };
> 
> struct pci_ops {
> @@ -94,7 +94,8 @@ struct pci_ecam_ops {
> /* Default ECAM ops */
> extern const struct pci_ecam_ops pci_generic_ecam_ops;
> 
> -int pci_host_common_probe(struct dt_device_node *dev, const void *data);
> +int pci_host_common_probe(struct dt_device_node *dev,
> +                          const struct pci_ecam_ops *ops);
> int pci_generic_config_read(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>                             uint32_t reg, uint32_t len, uint32_t *value);
> int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
> -- 
> 2.25.1
> 
> 



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations
  2021-11-05  6:33 ` [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations Oleksandr Andrushchenko
  2021-11-05  7:40   ` Jan Beulich
@ 2021-11-17 21:26   ` Julien Grall
  1 sibling, 0 replies; 43+ messages in thread
From: Julien Grall @ 2021-11-17 21:26 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> vPCI may map and unmap PCI device memory (BARs) being passed through which
> may take a lot of time. For this those operations may be deferred to be
> performed later, so that they can be safely preempted.
> 
> Currently this deferred processing is happening in common IOREQ code
> which doesn't seem to be the right place for x86 and is even more
> doubtful because IOREQ may not be enabled for Arm at all.
> So, for Arm the pending vPCI work may have no chance to be executed
> if the processing is left as is in the common IOREQ code only.
> For that reason make vPCI processing happen in arch specific code.
> 
> Please be aware that there are a few outstanding TODOs affecting this
> code path, see xen/drivers/vpci/header.c:map_range and
> xen/drivers/vpci/header.c:vpci_process_pending.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-17  6:56     ` Oleksandr Andrushchenko
@ 2021-11-17 21:33       ` Julien Grall
  2021-11-18  7:13         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-17 21:33 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi Oleksandr,

On 17/11/2021 06:56, Oleksandr Andrushchenko wrote:
> Hi, Julien!
> 
> On 16.11.21 20:48, Julien Grall wrote:
>> Hi Oleksander,
>>
>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> If a PCI host bridge device is present in the device tree, but is
>>> disabled, then its PCI host bridge driver was not instantiated.
>>> This results in the failure of the pci_get_host_bridge_segment()
>>> and the following panic during Xen start:
>>>
>>> (XEN) Device tree generation failed (-22).
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) Could not set up DOM0 guest OS
>>> (XEN) ****************************************
>>>
>>> Fix this by adding "linux,pci-domain" property for all device tree nodes
>>> which have "pci" device type, so we know which segments will be used by
>>> the guest for which bridges.
>>>
>>> Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> ---
>>> New in v6
>>> ---
>>>    xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
>>>    xen/arch/arm/pci/pci-host-common.c |  2 +-
>>>    xen/include/asm-arm/pci.h          |  8 ++++++++
>>>    3 files changed, 23 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index 491f5e2c316e..f7fcb1400c19 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>>>            {
>>>                uint16_t segment;
>>>    +            /*
>>> +             * The node doesn't have "linux,pci-domain" property and it is
>>> +             * possible that:
>>> +             *  - Xen only has drivers for a part of the host bridges
>>> +             *  - some host bridges are disabled
>>> +             * Make sure we insert the correct "linux,pci-domain" property
>>> +             * in any case, so we know which segments will be used
>>> +             * by Linux for which bridges.
>>
>> The check above will check the node type is "pci". AFAICT, this would also cover PCI devices. I am not aware of any issue to add "linux,pci-domain" for them. However, this feels a bit odd.
>>
>>  From my understanding, a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci".
>>
> We may have "bridge -> bridge -> device" topology as well.

Do you have an example of Device-Tree?

> So, I prefer to have the check as it is.

I don't really like the idea to spuriously add "linux,pci-domain" to PCI 
DT node. But if there are no other solution, then this should at least 
be mentionned in the commit message and code.

[...]

> Yes, this sounds reasonable, I will add this change and print an error message so it is
> easier to understand what Xen doesn't like (it took me a while to debug and understand
> why I have "(XEN) Device tree generation failed (-22).")


Sounds good to me. The current error is indeed confusing.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe
  2021-11-05  6:33 ` [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe Oleksandr Andrushchenko
  2021-11-17 11:12   ` Rahul Singh
@ 2021-11-17 21:45   ` Julien Grall
  2021-11-18  7:34     ` Oleksandr Andrushchenko
  1 sibling, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-17 21:45 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> There is no reason to use void pointer while passing ECAM ops to the
> pci_host_common_probe function as it is anyway casted to struct pci_ecam_ops
> inside. For that reason remove the void pointer and pass struct pci_ecam_ops
> pointer as is.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

I was going to ack and push the patch. But then I couldn't apply the 
patch...

> 
> ---
> New in v4
> ---
>   xen/arch/arm/pci/ecam.c            | 4 ++--
>   xen/arch/arm/pci/pci-host-common.c | 6 ++----
>   xen/include/asm-arm/pci.h          | 5 +++--
>   3 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 4f71b11c3057..6aeea12a68bf 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -24,8 +24,8 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>                                  pci_sbdf_t sbdf, uint32_t where)
>   {
>       const struct pci_config_window *cfg = bridge->cfg;
> -    struct pci_ecam_ops *ops =
> -        container_of(bridge->ops, struct pci_ecam_ops, pci_ops);
> +    const struct pci_ecam_ops *ops =
> +        container_of(bridge->ops, const struct pci_ecam_ops, pci_ops);
>       unsigned int devfn_shift = ops->bus_shift - 8;
>       void __iomem *base;
>   
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 6af845ab9d6c..1aad664b213e 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -194,15 +194,13 @@ static int pci_bus_find_domain_nr(struct dt_device_node *dev)
>       return domain;
>   }
>   
> -int pci_host_common_probe(struct dt_device_node *dev, const void *data)
> +int pci_host_common_probe(struct dt_device_node *dev,
> +                          const struct pci_ecam_ops *ops)
>   {
>       struct pci_host_bridge *bridge;
>       struct pci_config_window *cfg;
> -    struct pci_ecam_ops *ops;
>       int err;

... in staging, the code has an two additional lines here:

     if ( dt_device_for_passthrough(dev) )
         return 0;

Is this series relying on patch that are not yet upstreamed?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
  2021-11-16 19:22   ` Julien Grall
@ 2021-11-18  6:59     ` Oleksandr Andrushchenko
  2021-11-22 19:31       ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-18  6:59 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 16.11.21 21:22, Julien Grall wrote:
> Hi Oleksandr,
>
> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> Currently Xen maps all IRQs and memory ranges for all devices except
>> those marked for passthrough, e.g. it doesn't pay attention to the
>> "status" property of the node.
>>
>> According to the device tree specification [1]:
>>   - "okay"     Indicates the device is operational.
>>   - "disabled" Indicates that the device is not presently operational,
>>                but it might become operational in the future (for example,
>>           something is not plugged in, or switched off).
>>           Refer to the device binding for details on what disabled means
>>           for a given device.
>>
>> So, "disabled" status is device dependent and mapping should be taken by
>> case-by-case approach with that respect. Although in general Xen should map
>> IRQs and memory ranges as the disabled devices might become operational
>
> Right, this change effectively prevent dom0 to use such device if it becomes operational in the future.
Or doesn't, see below
> So this sounds like a big regression.
>
> How do you plan to handle it?
It depends if this is really a regression or is ok and "by design"
>
>> it
>> makes it impossible for the other devices, which are not operational in
>> any case, to skip the mappings.
>
> You wrote "makes it impossible for the other devices", but it is not clear to me what's would go wrong when we map a disabled device (Dom0 should not touch it). Do you have an example?
Ok, here we go. In the SoC I am working with the PCIe controller may run in two modes:
Root or Endpoint. Not configurable at run-time, so you configure it in the device tree.
The are two device tree entries for that: for the root [1] and for the endpoint [2].
So, when you want the controller to be a Root Complex then you put status = "disabled"
for the Endpoint entry and vise versa.

If you take a look at the nodes they both define the same "reg" and "interrupts",
effectively making it impossible for me in the patch [3] to actually trap MMIOs:
we skip the mappings for [1] and then, because we assume "disabled" is
still valid for mappings, we add those for [2].

So, this is probably why device tree documentation says about the disabled status
to be device specific.

Hope this describes a very valid use-case. At the same time you may argue that
I just need to remove the offending entry from the device tree which may also be
valid.
>
> Cheers,
>
Thank you,
Oleksandr

[1] https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi#L843
[2] https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi#L896
[3] https://patchwork.kernel.org/project/xen-devel/patch/20211105063326.939843-5-andr2000@gmail.com/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-17 21:33       ` Julien Grall
@ 2021-11-18  7:13         ` Oleksandr Andrushchenko
  2021-11-22 15:29           ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-18  7:13 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 17.11.21 23:33, Julien Grall wrote:
> Hi Oleksandr,
>
> On 17/11/2021 06:56, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>>
>> On 16.11.21 20:48, Julien Grall wrote:
>>> Hi Oleksander,
>>>
>>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> If a PCI host bridge device is present in the device tree, but is
>>>> disabled, then its PCI host bridge driver was not instantiated.
>>>> This results in the failure of the pci_get_host_bridge_segment()
>>>> and the following panic during Xen start:
>>>>
>>>> (XEN) Device tree generation failed (-22).
>>>> (XEN)
>>>> (XEN) ****************************************
>>>> (XEN) Panic on CPU 0:
>>>> (XEN) Could not set up DOM0 guest OS
>>>> (XEN) ****************************************
>>>>
>>>> Fix this by adding "linux,pci-domain" property for all device tree nodes
>>>> which have "pci" device type, so we know which segments will be used by
>>>> the guest for which bridges.
>>>>
>>>> Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> ---
>>>> New in v6
>>>> ---
>>>>    xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
>>>>    xen/arch/arm/pci/pci-host-common.c |  2 +-
>>>>    xen/include/asm-arm/pci.h          |  8 ++++++++
>>>>    3 files changed, 23 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>> index 491f5e2c316e..f7fcb1400c19 100644
>>>> --- a/xen/arch/arm/domain_build.c
>>>> +++ b/xen/arch/arm/domain_build.c
>>>> @@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>>>>            {
>>>>                uint16_t segment;
>>>>    +            /*
>>>> +             * The node doesn't have "linux,pci-domain" property and it is
>>>> +             * possible that:
>>>> +             *  - Xen only has drivers for a part of the host bridges
>>>> +             *  - some host bridges are disabled
>>>> +             * Make sure we insert the correct "linux,pci-domain" property
>>>> +             * in any case, so we know which segments will be used
>>>> +             * by Linux for which bridges.
>>>
>>> The check above will check the node type is "pci". AFAICT, this would also cover PCI devices. I am not aware of any issue to add "linux,pci-domain" for them. However, this feels a bit odd.
>>>
>>>  From my understanding, a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci".
>>>
>> We may have "bridge -> bridge -> device" topology as well.
>
> Do you have an example of Device-Tree?
No, I don't have at hand, but I can imagine this can relatively easy be done with QEMU
Even if not, do you assume this topology can never happen?
>
>> So, I prefer to have the check as it is.
>
> I don't really like the idea to spuriously add "linux,pci-domain" to PCI DT node. But if there are no other solution, then this should at least be mentionned in the commit message and code.
I am fine with any solution here, I just want that to be defined and implemented.
Please let me know the final decision on this and how we proceed
>
> [...]
>
>> Yes, this sounds reasonable, I will add this change and print an error message so it is
>> easier to understand what Xen doesn't like (it took me a while to debug and understand
>> why I have "(XEN) Device tree generation failed (-22).")
>
>
> Sounds good to me. The current error is indeed confusing.
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-16 19:12   ` Julien Grall
@ 2021-11-18  7:27     ` Oleksandr Andrushchenko
  2021-11-18 10:46       ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-18  7:27 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 16.11.21 21:12, Julien Grall wrote:
> Hi Oleksandr,
>
> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> In order for vPCI to work it needs to maintain guest and hardware
>> domain's views of the configuration space. For example, BARs and
>> COMMAND registers require emulation for guests and the guest view
>> of the registers needs to be in sync with the real contents of the
>> relevant registers. For that ECAM address space needs to also be
>> trapped for the hardware domain, so we need to implement PCI host
>> bridge specific callbacks to properly setup MMIO handlers for those
>> ranges depending on particular host bridge implementation.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> ---
>> Since v5:
>> - add vpci_sbdf_from_gpa helper for gpa to SBDF translation
>> - take bridge's bus start into account while calculating SBDF
>> Since v4:
>> - unsigned int for functions working with count
>> - gate number of MMIO handlers needed for CONFIG_HAS_PCI_MSI
>>    and fix their number, e.g. single handler for PBA and
>>    MSI-X tables (Roger)
>> - re-work code for assigning MMIO handlers to be simpler
>>    and account on the fact that there could multiple host bridges
>>    exist for the hwdom
>> Since v3:
>> - fixed comment formatting
>> Since v2:
>> - removed unneeded assignment (count = 0)
>> - removed unneeded header inclusion
>> - update commit message
>> Since v1:
>>   - Dynamically calculate the number of MMIO handlers required for vPCI
>>     and update the total number accordingly
>>   - s/clb/cb
>>   - Do not introduce a new callback for MMIO handler setup
>> ---
>>   xen/arch/arm/domain.c              |  2 +
>>   xen/arch/arm/pci/pci-host-common.c | 27 ++++++++++++
>>   xen/arch/arm/vpci.c                | 66 ++++++++++++++++++++++++++----
>>   xen/arch/arm/vpci.h                |  6 +++
>>   xen/include/asm-arm/pci.h          |  5 +++
>>   5 files changed, 98 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 96e1b235501d..92a6c509e5c5 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -739,6 +739,8 @@ int arch_domain_create(struct domain *d,
>>       if ( (rc = domain_vgic_register(d, &count)) != 0 )
>>           goto fail;
>>   +    count += domain_vpci_get_num_mmio_handlers(d);
>> +
>>       if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>>           goto fail;
>>   diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index 47104b22b221..0d271a6e8881 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -289,6 +289,33 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>       return -EINVAL;
>>   }
>>   +int pci_host_iterate_bridges(struct domain *d,
>> +                             int (*cb)(struct domain *d,
>> +                                       struct pci_host_bridge *bridge))
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    int err;
>> +
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +    {
>> +        err = cb(d, bridge);
>> +        if ( err )
>> +            return err;
>> +    }
>> +    return 0;
>> +}
>> +
>> +unsigned int pci_host_get_num_bridges(void)
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    unsigned int count = 0;
>
> How about making this static and...
>
>> +
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +        count++;
>
> ... only call list_for_each_entry() when count is -1? So we would only go through the list once.
>
> This should be fine given hostbridge can only be added during boot (we would need to protect pci_host_bridges with a lock otherwise).
Ok, I can do that
>
>> +
>> +    return count;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> index 23f45386f4b3..5a6ebd8b9868 100644
>> --- a/xen/arch/arm/vpci.c
>> +++ b/xen/arch/arm/vpci.c
>> @@ -16,16 +16,31 @@
>>     #include <asm/mmio.h>
>>   +static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
>> +                                     paddr_t gpa)
>> +{
>> +    pci_sbdf_t sbdf;
>> +
>> +    if ( bridge )
>> +    {
>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
>> +        sbdf.seg = bridge->segment;
>> +        sbdf.bus += bridge->cfg->busn_start;
>> +    }
>> +    else
>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);
>> +
>> +    return sbdf;
>> +}
>> +
>>   static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>                             register_t *r, void *p)
>>   {
>> -    pci_sbdf_t sbdf;
>> +    struct pci_host_bridge *bridge = p;
>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>       /* data is needed to prevent a pointer cast on 32bit */
>>       unsigned long data;
>>   -    /* We ignore segment part and always handle segment 0 */
>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>> -
>>       if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>>                           1U << info->dabt.size, &data) )
>>       {
>> @@ -41,10 +56,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>   static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>                              register_t r, void *p)
>>   {
>> -    pci_sbdf_t sbdf;
>> -
>> -    /* We ignore segment part and always handle segment 0 */
>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>> +    struct pci_host_bridge *bridge = p;
>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>         return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>>                              1U << info->dabt.size, r);
>> @@ -55,17 +68,54 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>>       .write = vpci_mmio_write,
>>   };
>>   +static int vpci_setup_mmio_handler_cb(struct domain *d,
>> +                                      struct pci_host_bridge *bridge)
>> +{
>> +    struct pci_config_window *cfg = bridge->cfg;
>> +
>> +    register_mmio_handler(d, &vpci_mmio_handler,
>> +                          cfg->phys_addr, cfg->size, bridge);
>> +    return 0;
>> +}
>> +
>>   int domain_vpci_init(struct domain *d)
>>   {
>>       if ( !has_vpci(d) )
>>           return 0;
>>   +    if ( is_hardware_domain(d) )
>> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler_cb);
>> +
>> +    /* Guest domains use what is programmed in their device tree. */
>
> I would rather not make the assumption that the guest is using a Device-Tree. So how about:
>
> /*
>  * The hardware domain gets one virtual hostbridge by "real"
>  * hostbridges.
>  * Guests get the virtual platform layout (one virtual host bridge for
>  * now).
>  */
>
> The comment would have to be moved before if ( is_hardware_domain(d) ).
Sure, I can extend the comment
>
>>       register_mmio_handler(d, &vpci_mmio_handler,
>>                             GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>>         return 0;
>>   }
>>   +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>
> AFAICT, this function would also be called even if vPCI is not enabled for the domain. So we should add:
>
> if ( !has_vpci(d) )
>   return 0;
>
Good catch, will add
>> +{
>> +    unsigned int count;
>> +
>> +    if ( is_hardware_domain(d) )
>> +        /* For each PCI host bridge's configuration space. */
>> +        count = pci_host_get_num_bridges();
>
> This first part makes sense to me. But...
>
>> +    else
>
> ... I don't understand how the else is related to this commit. Can you clarify it?
>
>> +        /*
>> +         * There's a single MSI-X MMIO handler that deals with both PBA
>> +         * and MSI-X tables per each PCI device being passed through.
>> +         * Maximum number of supported devices is 32 as virtual bus
>> +         * topology emulates the devices as embedded endpoints.
>> +         * +1 for a single emulated host bridge's configuration space.
>> +         */
>> +        count = 1;
>> +#ifdef CONFIG_HAS_PCI_MSI
>> +        count += 32;
>
> Surely, this is a decision that is based on other factor in the vPCI code. So can use a define and avoid hardcoding the number?
Well, in the later series [1] this is defined via PCI_SLOT(~0) + 1 and there is no dedicated
constant for that. I can use the same here, e.g. s/32/PCI_SLOT(~0) + 1
>
>> +#endif
>
>
>> +
>> +    return count;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
>> index d8a7b0e3e802..3c713f3fcdb5 100644
>> --- a/xen/arch/arm/vpci.h
>> +++ b/xen/arch/arm/vpci.h
>> @@ -17,11 +17,17 @@
>>     #ifdef CONFIG_HAS_VPCI
>>   int domain_vpci_init(struct domain *d);
>> +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d);
>>   #else
>>   static inline int domain_vpci_init(struct domain *d)
>>   {
>>       return 0;
>>   }
>> +
>> +static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> +{
>> +    return 0;
>> +}
>>   #endif
>>     #endif /* __ARCH_ARM_VPCI_H__ */
>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>> index c20eba643d86..969333043431 100644
>> --- a/xen/include/asm-arm/pci.h
>> +++ b/xen/include/asm-arm/pci.h
>> @@ -110,6 +110,11 @@ void arch_pci_init_pdev(struct pci_dev *pdev);
>>     int pci_get_new_domain_nr(void);
>>   +int pci_host_iterate_bridges(struct domain *d,
>> +                             int (*clb)(struct domain *d,
>
> NIT: This is more common to call a callback 'cb'. In any case, I would prefer if the names matches the one used in the implementation.
Will change
>
>> + struct pci_host_bridge *bridge));
>> +unsigned int pci_host_get_num_bridges(void);
>> +
>>   #else   /*!CONFIG_HAS_PCI*/
>>     struct arch_pci_dev { };
>>
>
> Cheers,
>
Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/patch/20211105065629.940943-11-andr2000@gmail.com/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe
  2021-11-17 21:45   ` Julien Grall
@ 2021-11-18  7:34     ` Oleksandr Andrushchenko
  2021-11-22 17:48       ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-18  7:34 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 17.11.21 23:45, Julien Grall wrote:
> Hi Oleksandr,
>
> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> There is no reason to use void pointer while passing ECAM ops to the
>> pci_host_common_probe function as it is anyway casted to struct pci_ecam_ops
>> inside. For that reason remove the void pointer and pass struct pci_ecam_ops
>> pointer as is.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> I was going to ack and push the patch. But then I couldn't apply the patch...
>
>>
>> ---
>> New in v4
>> ---
>>   xen/arch/arm/pci/ecam.c            | 4 ++--
>>   xen/arch/arm/pci/pci-host-common.c | 6 ++----
>>   xen/include/asm-arm/pci.h          | 5 +++--
>>   3 files changed, 7 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>> index 4f71b11c3057..6aeea12a68bf 100644
>> --- a/xen/arch/arm/pci/ecam.c
>> +++ b/xen/arch/arm/pci/ecam.c
>> @@ -24,8 +24,8 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>                                  pci_sbdf_t sbdf, uint32_t where)
>>   {
>>       const struct pci_config_window *cfg = bridge->cfg;
>> -    struct pci_ecam_ops *ops =
>> -        container_of(bridge->ops, struct pci_ecam_ops, pci_ops);
>> +    const struct pci_ecam_ops *ops =
>> +        container_of(bridge->ops, const struct pci_ecam_ops, pci_ops);
>>       unsigned int devfn_shift = ops->bus_shift - 8;
>>       void __iomem *base;
>>   diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index 6af845ab9d6c..1aad664b213e 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -194,15 +194,13 @@ static int pci_bus_find_domain_nr(struct dt_device_node *dev)
>>       return domain;
>>   }
>>   -int pci_host_common_probe(struct dt_device_node *dev, const void *data)
>> +int pci_host_common_probe(struct dt_device_node *dev,
>> +                          const struct pci_ecam_ops *ops)
>>   {
>>       struct pci_host_bridge *bridge;
>>       struct pci_config_window *cfg;
>> -    struct pci_ecam_ops *ops;
>>       int err;
>
> ... in staging, the code has an two additional lines here:
>
>     if ( dt_device_for_passthrough(dev) )
>         return 0;
>
> Is this series relying on patch that are not yet upstreamed?
Yes, I mistakenly had a patch below that I didn't want to upstream with
this series, so this is why. Sorry about that.
Frankly, I didn't expect patches to be merged from this series now as
1) I expect v7
2) I thought we won't push until the release

That being said: do you mind if I put your Acked-by in this patch, so
it is ready for v7?
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-18  7:27     ` Oleksandr Andrushchenko
@ 2021-11-18 10:46       ` Oleksandr Andrushchenko
  2021-11-22 17:36         ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-18 10:46 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko



On 18.11.21 09:27, Oleksandr Andrushchenko wrote:
> Hi, Julien!
>
> On 16.11.21 21:12, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> In order for vPCI to work it needs to maintain guest and hardware
>>> domain's views of the configuration space. For example, BARs and
>>> COMMAND registers require emulation for guests and the guest view
>>> of the registers needs to be in sync with the real contents of the
>>> relevant registers. For that ECAM address space needs to also be
>>> trapped for the hardware domain, so we need to implement PCI host
>>> bridge specific callbacks to properly setup MMIO handlers for those
>>> ranges depending on particular host bridge implementation.
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>> ---
>>> Since v5:
>>> - add vpci_sbdf_from_gpa helper for gpa to SBDF translation
>>> - take bridge's bus start into account while calculating SBDF
>>> Since v4:
>>> - unsigned int for functions working with count
>>> - gate number of MMIO handlers needed for CONFIG_HAS_PCI_MSI
>>>     and fix their number, e.g. single handler for PBA and
>>>     MSI-X tables (Roger)
>>> - re-work code for assigning MMIO handlers to be simpler
>>>     and account on the fact that there could multiple host bridges
>>>     exist for the hwdom
>>> Since v3:
>>> - fixed comment formatting
>>> Since v2:
>>> - removed unneeded assignment (count = 0)
>>> - removed unneeded header inclusion
>>> - update commit message
>>> Since v1:
>>>    - Dynamically calculate the number of MMIO handlers required for vPCI
>>>      and update the total number accordingly
>>>    - s/clb/cb
>>>    - Do not introduce a new callback for MMIO handler setup
>>> ---
>>>    xen/arch/arm/domain.c              |  2 +
>>>    xen/arch/arm/pci/pci-host-common.c | 27 ++++++++++++
>>>    xen/arch/arm/vpci.c                | 66 ++++++++++++++++++++++++++----
>>>    xen/arch/arm/vpci.h                |  6 +++
>>>    xen/include/asm-arm/pci.h          |  5 +++
>>>    5 files changed, 98 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>> index 96e1b235501d..92a6c509e5c5 100644
>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -739,6 +739,8 @@ int arch_domain_create(struct domain *d,
>>>        if ( (rc = domain_vgic_register(d, &count)) != 0 )
>>>            goto fail;
>>>    +    count += domain_vpci_get_num_mmio_handlers(d);
>>> +
>>>        if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>>>            goto fail;
>>>    diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>> index 47104b22b221..0d271a6e8881 100644
>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>> @@ -289,6 +289,33 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>>        return -EINVAL;
>>>    }
>>>    +int pci_host_iterate_bridges(struct domain *d,
>>> +                             int (*cb)(struct domain *d,
>>> +                                       struct pci_host_bridge *bridge))
>>> +{
>>> +    struct pci_host_bridge *bridge;
>>> +    int err;
>>> +
>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>> +    {
>>> +        err = cb(d, bridge);
>>> +        if ( err )
>>> +            return err;
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +unsigned int pci_host_get_num_bridges(void)
>>> +{
>>> +    struct pci_host_bridge *bridge;
>>> +    unsigned int count = 0;
>> How about making this static and...
>>
>>> +
>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>> +        count++;
>> ... only call list_for_each_entry() when count is -1? So we would only go through the list once.
>>
>> This should be fine given hostbridge can only be added during boot (we would need to protect pci_host_bridges with a lock otherwise).
> Ok, I can do that
I have re-worked the patch so that more code can be re-used:

     if ( is_hardware_domain(d) )
     {
         int count;

         count = pci_host_iterate_bridges_and_count(d,
vpci_setup_mmio_handler_cb);
         if ( count < 0 )
             return count;

         return 0;
     }

     register_mmio_handler(d, &vpci_mmio_handler,
                           GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);

so pci_host_get_num_bridges goes away.
>>> +
>>> +    return count;
>>> +}
>>> +
>>>    /*
>>>     * Local variables:
>>>     * mode: C
>>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>>> index 23f45386f4b3..5a6ebd8b9868 100644
>>> --- a/xen/arch/arm/vpci.c
>>> +++ b/xen/arch/arm/vpci.c
>>> @@ -16,16 +16,31 @@
>>>      #include <asm/mmio.h>
>>>    +static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
>>> +                                     paddr_t gpa)
>>> +{
>>> +    pci_sbdf_t sbdf;
>>> +
>>> +    if ( bridge )
>>> +    {
>>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
>>> +        sbdf.seg = bridge->segment;
>>> +        sbdf.bus += bridge->cfg->busn_start;
>>> +    }
>>> +    else
>>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);
>>> +
>>> +    return sbdf;
>>> +}
>>> +
>>>    static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>                              register_t *r, void *p)
>>>    {
>>> -    pci_sbdf_t sbdf;
>>> +    struct pci_host_bridge *bridge = p;
>>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>>        /* data is needed to prevent a pointer cast on 32bit */
>>>        unsigned long data;
>>>    -    /* We ignore segment part and always handle segment 0 */
>>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>>> -
>>>        if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>>>                            1U << info->dabt.size, &data) )
>>>        {
>>> @@ -41,10 +56,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>    static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>                               register_t r, void *p)
>>>    {
>>> -    pci_sbdf_t sbdf;
>>> -
>>> -    /* We ignore segment part and always handle segment 0 */
>>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>>> +    struct pci_host_bridge *bridge = p;
>>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>>          return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>>>                               1U << info->dabt.size, r);
>>> @@ -55,17 +68,54 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>>>        .write = vpci_mmio_write,
>>>    };
>>>    +static int vpci_setup_mmio_handler_cb(struct domain *d,
>>> +                                      struct pci_host_bridge *bridge)
>>> +{
>>> +    struct pci_config_window *cfg = bridge->cfg;
>>> +
>>> +    register_mmio_handler(d, &vpci_mmio_handler,
>>> +                          cfg->phys_addr, cfg->size, bridge);
>>> +    return 0;
>>> +}
>>> +
>>>    int domain_vpci_init(struct domain *d)
>>>    {
>>>        if ( !has_vpci(d) )
>>>            return 0;
>>>    +    if ( is_hardware_domain(d) )
>>> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler_cb);
>>> +
>>> +    /* Guest domains use what is programmed in their device tree. */
>> I would rather not make the assumption that the guest is using a Device-Tree. So how about:
>>
>> /*
>>   * The hardware domain gets one virtual hostbridge by "real"
>>   * hostbridges.
>>   * Guests get the virtual platform layout (one virtual host bridge for
>>   * now).
>>   */
>>
>> The comment would have to be moved before if ( is_hardware_domain(d) ).
> Sure, I can extend the comment
     /*
      * The hardware domain gets as many MMIOs as required by the
      * physical host bridge.
      * Guests get the virtual platform layout: one virtual host bridge for now.
      */

>>>        register_mmio_handler(d, &vpci_mmio_handler,
>>>                              GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>>>          return 0;
>>>    }
>>>    +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> AFAICT, this function would also be called even if vPCI is not enabled for the domain. So we should add:
>>
>> if ( !has_vpci(d) )
>>    return 0;
>>
> Good catch, will add
Hm... but we have

static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    return 0;
+}
fir that case

>>> +{
>>> +    unsigned int count;
>>> +
>>> +    if ( is_hardware_domain(d) )
>>> +        /* For each PCI host bridge's configuration space. */
>>> +        count = pci_host_get_num_bridges();
>> This first part makes sense to me. But...
>>
>>> +    else
>> ... I don't understand how the else is related to this commit. Can you clarify it?
>>
>>> +        /*
>>> +         * There's a single MSI-X MMIO handler that deals with both PBA
>>> +         * and MSI-X tables per each PCI device being passed through.
>>> +         * Maximum number of supported devices is 32 as virtual bus
>>> +         * topology emulates the devices as embedded endpoints.
>>> +         * +1 for a single emulated host bridge's configuration space.
>>> +         */
>>> +        count = 1;
>>> +#ifdef CONFIG_HAS_PCI_MSI
>>> +        count += 32;
>> Surely, this is a decision that is based on other factor in the vPCI code. So can use a define and avoid hardcoding the number?
> Well, in the later series [1] this is defined via PCI_SLOT(~0) + 1 and there is no dedicated
> constant for that. I can use the same here, e.g. s/32/PCI_SLOT(~0) + 1
>>> +#endif
>>
>>> +
>>> +    return count;
>>> +}
>>> +
>>>    /*
>>>     * Local variables:
>>>     * mode: C
>>> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
>>> index d8a7b0e3e802..3c713f3fcdb5 100644
>>> --- a/xen/arch/arm/vpci.h
>>> +++ b/xen/arch/arm/vpci.h
>>> @@ -17,11 +17,17 @@
>>>      #ifdef CONFIG_HAS_VPCI
>>>    int domain_vpci_init(struct domain *d);
>>> +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d);
>>>    #else
>>>    static inline int domain_vpci_init(struct domain *d)
>>>    {
>>>        return 0;
>>>    }
>>> +
>>> +static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>> +{
>>> +    return 0;
>>> +}
>>>    #endif
>>>      #endif /* __ARCH_ARM_VPCI_H__ */
>>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>>> index c20eba643d86..969333043431 100644
>>> --- a/xen/include/asm-arm/pci.h
>>> +++ b/xen/include/asm-arm/pci.h
>>> @@ -110,6 +110,11 @@ void arch_pci_init_pdev(struct pci_dev *pdev);
>>>      int pci_get_new_domain_nr(void);
>>>    +int pci_host_iterate_bridges(struct domain *d,
>>> +                             int (*clb)(struct domain *d,
>> NIT: This is more common to call a callback 'cb'. In any case, I would prefer if the names matches the one used in the implementation.
> Will change
>>> + struct pci_host_bridge *bridge));
>>> +unsigned int pci_host_get_num_bridges(void);
>>> +
>>>    #else   /*!CONFIG_HAS_PCI*/
>>>      struct arch_pci_dev { };
>>>
>> Cheers,
>>
> Thank you,
> Oleksandr
>
> [1] https://patchwork.kernel.org/project/xen-devel/patch/20211105065629.940943-11-andr2000@gmail.com/

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-18  7:13         ` Oleksandr Andrushchenko
@ 2021-11-22 15:29           ` Julien Grall
  2021-11-22 16:23             ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-22 15:29 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh



On 18/11/2021 07:13, Oleksandr Andrushchenko wrote:
> Hi, Julien!

Hi Oleksandr,

> On 17.11.21 23:33, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 17/11/2021 06:56, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>>
>>> On 16.11.21 20:48, Julien Grall wrote:
>>>> Hi Oleksander,
>>>>
>>>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> If a PCI host bridge device is present in the device tree, but is
>>>>> disabled, then its PCI host bridge driver was not instantiated.
>>>>> This results in the failure of the pci_get_host_bridge_segment()
>>>>> and the following panic during Xen start:
>>>>>
>>>>> (XEN) Device tree generation failed (-22).
>>>>> (XEN)
>>>>> (XEN) ****************************************
>>>>> (XEN) Panic on CPU 0:
>>>>> (XEN) Could not set up DOM0 guest OS
>>>>> (XEN) ****************************************
>>>>>
>>>>> Fix this by adding "linux,pci-domain" property for all device tree nodes
>>>>> which have "pci" device type, so we know which segments will be used by
>>>>> the guest for which bridges.
>>>>>
>>>>> Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
>>>>>
>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> ---
>>>>> New in v6
>>>>> ---
>>>>>     xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
>>>>>     xen/arch/arm/pci/pci-host-common.c |  2 +-
>>>>>     xen/include/asm-arm/pci.h          |  8 ++++++++
>>>>>     3 files changed, 23 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>> index 491f5e2c316e..f7fcb1400c19 100644
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>>>>>             {
>>>>>                 uint16_t segment;
>>>>>     +            /*
>>>>> +             * The node doesn't have "linux,pci-domain" property and it is
>>>>> +             * possible that:
>>>>> +             *  - Xen only has drivers for a part of the host bridges
>>>>> +             *  - some host bridges are disabled
>>>>> +             * Make sure we insert the correct "linux,pci-domain" property
>>>>> +             * in any case, so we know which segments will be used
>>>>> +             * by Linux for which bridges.
>>>>
>>>> The check above will check the node type is "pci". AFAICT, this would also cover PCI devices. I am not aware of any issue to add "linux,pci-domain" for them. However, this feels a bit odd.
>>>>
>>>>   From my understanding, a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci".
>>>>
>>> We may have "bridge -> bridge -> device" topology as well.
>>
>> Do you have an example of Device-Tree?
> No, I don't have at hand, but I can imagine this can relatively easy be done with QEMU > Even if not, do you assume this topology can never happen?

I think it is not possible to describe them in the Device-Tree because 
you wouldn't be able to translate the "regs" of the nested hostbridge to 
an actual MMIO address.

In fact, looking at dt_device_get_address(), I think Xen would reject 
such DT.

>>
>>> So, I prefer to have the check as it is.
>>
>> I don't really like the idea to spuriously add "linux,pci-domain" to PCI DT node. But if there are no other solution, then this should at least be mentionned in the commit message and code.
> I am fine with any solution here, I just want that to be defined and implemented.
> Please let me know the final decision on this and how we proceed

I would prefer to go with my way. This can be refined in the future if 
we find Device-Tree that matches what you wrote.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-22 15:29           ` Julien Grall
@ 2021-11-22 16:23             ` Oleksandr Andrushchenko
  2021-11-22 17:17               ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-22 16:23 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko



On 22.11.21 17:29, Julien Grall wrote:
>
>
> On 18/11/2021 07:13, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>
> Hi Oleksandr,
>
>> On 17.11.21 23:33, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 17/11/2021 06:56, Oleksandr Andrushchenko wrote:
>>>> Hi, Julien!
>>>>
>>>> On 16.11.21 20:48, Julien Grall wrote:
>>>>> Hi Oleksander,
>>>>>
>>>>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>
>>>>>> If a PCI host bridge device is present in the device tree, but is
>>>>>> disabled, then its PCI host bridge driver was not instantiated.
>>>>>> This results in the failure of the pci_get_host_bridge_segment()
>>>>>> and the following panic during Xen start:
>>>>>>
>>>>>> (XEN) Device tree generation failed (-22).
>>>>>> (XEN)
>>>>>> (XEN) ****************************************
>>>>>> (XEN) Panic on CPU 0:
>>>>>> (XEN) Could not set up DOM0 guest OS
>>>>>> (XEN) ****************************************
>>>>>>
>>>>>> Fix this by adding "linux,pci-domain" property for all device tree nodes
>>>>>> which have "pci" device type, so we know which segments will be used by
>>>>>> the guest for which bridges.
>>>>>>
>>>>>> Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
>>>>>>
>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>
>>>>>> ---
>>>>>> New in v6
>>>>>> ---
>>>>>>     xen/arch/arm/domain_build.c        | 15 ++++++++++++++-
>>>>>>     xen/arch/arm/pci/pci-host-common.c |  2 +-
>>>>>>     xen/include/asm-arm/pci.h          |  8 ++++++++
>>>>>>     3 files changed, 23 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>> index 491f5e2c316e..f7fcb1400c19 100644
>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>> @@ -753,9 +753,22 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo,
>>>>>>             {
>>>>>>                 uint16_t segment;
>>>>>>     +            /*
>>>>>> +             * The node doesn't have "linux,pci-domain" property and it is
>>>>>> +             * possible that:
>>>>>> +             *  - Xen only has drivers for a part of the host bridges
>>>>>> +             *  - some host bridges are disabled
>>>>>> +             * Make sure we insert the correct "linux,pci-domain" property
>>>>>> +             * in any case, so we know which segments will be used
>>>>>> +             * by Linux for which bridges.
>>>>>
>>>>> The check above will check the node type is "pci". AFAICT, this would also cover PCI devices. I am not aware of any issue to add "linux,pci-domain" for them. However, this feels a bit odd.
>>>>>
>>>>>   From my understanding, a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci".
>>>>>
>>>> We may have "bridge -> bridge -> device" topology as well.
>>>
>>> Do you have an example of Device-Tree?
>> No, I don't have at hand, but I can imagine this can relatively easy be done with QEMU > Even if not, do you assume this topology can never happen?
>
> I think it is not possible to describe them in the Device-Tree because you wouldn't be able to translate the "regs" of the nested hostbridge to an actual MMIO address.
>
> In fact, looking at dt_device_get_address(), I think Xen would reject such DT.
>
>>>
>>>> So, I prefer to have the check as it is.
>>>
>>> I don't really like the idea to spuriously add "linux,pci-domain" to PCI DT node. But if there are no other solution, then this should at least be mentionned in the commit message and code.
>> I am fine with any solution here, I just want that to be defined and implemented.
>> Please let me know the final decision on this and how we proceed
>
> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
Ok, so just to make it clear:
 >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-22 16:23             ` Oleksandr Andrushchenko
@ 2021-11-22 17:17               ` Julien Grall
  2021-11-23  6:31                 ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-22 17:17 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi,

On 22/11/2021 16:23, Oleksandr Andrushchenko wrote:
> On 22.11.21 17:29, Julien Grall wrote:
>> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
> Ok, so just to make it clear:
>   >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"

That's correct. Thank you!

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-18 10:46       ` Oleksandr Andrushchenko
@ 2021-11-22 17:36         ` Julien Grall
  2021-11-23  6:58           ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-22 17:36 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi,

On 18/11/2021 10:46, Oleksandr Andrushchenko wrote:
> On 18.11.21 09:27, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>>
>> On 16.11.21 21:12, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> In order for vPCI to work it needs to maintain guest and hardware
>>>> domain's views of the configuration space. For example, BARs and
>>>> COMMAND registers require emulation for guests and the guest view
>>>> of the registers needs to be in sync with the real contents of the
>>>> relevant registers. For that ECAM address space needs to also be
>>>> trapped for the hardware domain, so we need to implement PCI host
>>>> bridge specific callbacks to properly setup MMIO handlers for those
>>>> ranges depending on particular host bridge implementation.
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>> ---
>>>> Since v5:
>>>> - add vpci_sbdf_from_gpa helper for gpa to SBDF translation
>>>> - take bridge's bus start into account while calculating SBDF
>>>> Since v4:
>>>> - unsigned int for functions working with count
>>>> - gate number of MMIO handlers needed for CONFIG_HAS_PCI_MSI
>>>>      and fix their number, e.g. single handler for PBA and
>>>>      MSI-X tables (Roger)
>>>> - re-work code for assigning MMIO handlers to be simpler
>>>>      and account on the fact that there could multiple host bridges
>>>>      exist for the hwdom
>>>> Since v3:
>>>> - fixed comment formatting
>>>> Since v2:
>>>> - removed unneeded assignment (count = 0)
>>>> - removed unneeded header inclusion
>>>> - update commit message
>>>> Since v1:
>>>>     - Dynamically calculate the number of MMIO handlers required for vPCI
>>>>       and update the total number accordingly
>>>>     - s/clb/cb
>>>>     - Do not introduce a new callback for MMIO handler setup
>>>> ---
>>>>     xen/arch/arm/domain.c              |  2 +
>>>>     xen/arch/arm/pci/pci-host-common.c | 27 ++++++++++++
>>>>     xen/arch/arm/vpci.c                | 66 ++++++++++++++++++++++++++----
>>>>     xen/arch/arm/vpci.h                |  6 +++
>>>>     xen/include/asm-arm/pci.h          |  5 +++
>>>>     5 files changed, 98 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>>> index 96e1b235501d..92a6c509e5c5 100644
>>>> --- a/xen/arch/arm/domain.c
>>>> +++ b/xen/arch/arm/domain.c
>>>> @@ -739,6 +739,8 @@ int arch_domain_create(struct domain *d,
>>>>         if ( (rc = domain_vgic_register(d, &count)) != 0 )
>>>>             goto fail;
>>>>     +    count += domain_vpci_get_num_mmio_handlers(d);
>>>> +
>>>>         if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>>>>             goto fail;
>>>>     diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>>> index 47104b22b221..0d271a6e8881 100644
>>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>>> @@ -289,6 +289,33 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>>>         return -EINVAL;
>>>>     }
>>>>     +int pci_host_iterate_bridges(struct domain *d,
>>>> +                             int (*cb)(struct domain *d,
>>>> +                                       struct pci_host_bridge *bridge))
>>>> +{
>>>> +    struct pci_host_bridge *bridge;
>>>> +    int err;
>>>> +
>>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>>> +    {
>>>> +        err = cb(d, bridge);
>>>> +        if ( err )
>>>> +            return err;
>>>> +    }
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +unsigned int pci_host_get_num_bridges(void)
>>>> +{
>>>> +    struct pci_host_bridge *bridge;
>>>> +    unsigned int count = 0;
>>> How about making this static and...
>>>
>>>> +
>>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>>> +        count++;
>>> ... only call list_for_each_entry() when count is -1? So we would only go through the list once.
>>>
>>> This should be fine given hostbridge can only be added during boot (we would need to protect pci_host_bridges with a lock otherwise).
>> Ok, I can do that
> I have re-worked the patch so that more code can be re-used:
> 
>       if ( is_hardware_domain(d) )
>       {
>           int count;
> 
>           count = pci_host_iterate_bridges_and_count(d,
> vpci_setup_mmio_handler_cb);
>           if ( count < 0 )
>               return count;
> 
>           return 0;
>       }
> 
>       register_mmio_handler(d, &vpci_mmio_handler,
>                             GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
> 
> so pci_host_get_num_bridges goes away.

If this will be the only caller that needs to know the number 
hostbridges, then I am happy with this appropach. Otherwise, I would 
prefer to keep the helper pci_host_get_num_bridges().

>>>> +
>>>> +    return count;
>>>> +}
>>>> +
>>>>     /*
>>>>      * Local variables:
>>>>      * mode: C
>>>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>>>> index 23f45386f4b3..5a6ebd8b9868 100644
>>>> --- a/xen/arch/arm/vpci.c
>>>> +++ b/xen/arch/arm/vpci.c
>>>> @@ -16,16 +16,31 @@
>>>>       #include <asm/mmio.h>
>>>>     +static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
>>>> +                                     paddr_t gpa)
>>>> +{
>>>> +    pci_sbdf_t sbdf;
>>>> +
>>>> +    if ( bridge )
>>>> +    {
>>>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
>>>> +        sbdf.seg = bridge->segment;
>>>> +        sbdf.bus += bridge->cfg->busn_start;
>>>> +    }
>>>> +    else
>>>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);
>>>> +
>>>> +    return sbdf;
>>>> +}
>>>> +
>>>>     static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>                               register_t *r, void *p)
>>>>     {
>>>> -    pci_sbdf_t sbdf;
>>>> +    struct pci_host_bridge *bridge = p;
>>>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>>>         /* data is needed to prevent a pointer cast on 32bit */
>>>>         unsigned long data;
>>>>     -    /* We ignore segment part and always handle segment 0 */
>>>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>>>> -
>>>>         if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>>>>                             1U << info->dabt.size, &data) )
>>>>         {
>>>> @@ -41,10 +56,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>     static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>>                                register_t r, void *p)
>>>>     {
>>>> -    pci_sbdf_t sbdf;
>>>> -
>>>> -    /* We ignore segment part and always handle segment 0 */
>>>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>>>> +    struct pci_host_bridge *bridge = p;
>>>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>>>           return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>>>>                                1U << info->dabt.size, r);
>>>> @@ -55,17 +68,54 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>>>>         .write = vpci_mmio_write,
>>>>     };
>>>>     +static int vpci_setup_mmio_handler_cb(struct domain *d,
>>>> +                                      struct pci_host_bridge *bridge)
>>>> +{
>>>> +    struct pci_config_window *cfg = bridge->cfg;
>>>> +
>>>> +    register_mmio_handler(d, &vpci_mmio_handler,
>>>> +                          cfg->phys_addr, cfg->size, bridge);
>>>> +    return 0;
>>>> +}
>>>> +
>>>>     int domain_vpci_init(struct domain *d)
>>>>     {
>>>>         if ( !has_vpci(d) )
>>>>             return 0;
>>>>     +    if ( is_hardware_domain(d) )
>>>> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler_cb);
>>>> +
>>>> +    /* Guest domains use what is programmed in their device tree. */
>>> I would rather not make the assumption that the guest is using a Device-Tree. So how about:
>>>
>>> /*
>>>    * The hardware domain gets one virtual hostbridge by "real"
>>>    * hostbridges.
>>>    * Guests get the virtual platform layout (one virtual host bridge for
>>>    * now).
>>>    */
>>>
>>> The comment would have to be moved before if ( is_hardware_domain(d) ).
>> Sure, I can extend the comment
>       /*
>        * The hardware domain gets as many MMIOs as required by the
>        * physical host bridge.
>        * Guests get the virtual platform layout: one virtual host bridge for now.
>        */

LGTM.

> 
>>>>         register_mmio_handler(d, &vpci_mmio_handler,
>>>>                               GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>>>>           return 0;
>>>>     }
>>>>     +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>> AFAICT, this function would also be called even if vPCI is not enabled for the domain. So we should add:
>>>
>>> if ( !has_vpci(d) )
>>>     return 0;
>>>
>> Good catch, will add
> Hm... but we have
> 
> static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
> +{
> +    return 0;
> +}
> fir that case

This would only cover the case where Xen was built without vPCI support. 
When Xen is built with vPCI support, we only want to increase the number 
of regions for domain with vPCI enabled.

> 
>>>> +{
>>>> +    unsigned int count;
>>>> +
>>>> +    if ( is_hardware_domain(d) )
>>>> +        /* For each PCI host bridge's configuration space. */
>>>> +        count = pci_host_get_num_bridges();
>>> This first part makes sense to me. But...
>>>
>>>> +    else
>>> ... I don't understand how the else is related to this commit. Can you clarify it?
>>>
>>>> +        /*
>>>> +         * There's a single MSI-X MMIO handler that deals with both PBA
>>>> +         * and MSI-X tables per each PCI device being passed through.
>>>> +         * Maximum number of supported devices is 32 as virtual bus
>>>> +         * topology emulates the devices as embedded endpoints.
>>>> +         * +1 for a single emulated host bridge's configuration space.
>>>> +         */
>>>> +        count = 1;
>>>> +#ifdef CONFIG_HAS_PCI_MSI
>>>> +        count += 32;
>>> Surely, this is a decision that is based on other factor in the vPCI code. So can use a define and avoid hardcoding the number?
>> Well, in the later series [1] this is defined via PCI_SLOT(~0) + 1 and there is no dedicated
>> constant for that. I can use the same here, e.g. s/32/PCI_SLOT(~0) + 1

I would prefer if we introduce a new constant for that. This makes 
easier to update the code if we decide to increase the number of virtual 
devices.

However, I am still not sure how the 'else' part is related to this 
commit. Can you please clarify it?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe
  2021-11-18  7:34     ` Oleksandr Andrushchenko
@ 2021-11-22 17:48       ` Julien Grall
  0 siblings, 0 replies; 43+ messages in thread
From: Julien Grall @ 2021-11-22 17:48 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh



On 18/11/2021 07:34, Oleksandr Andrushchenko wrote:
> Hi, Julien!

Hi,

> On 17.11.21 23:45, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> There is no reason to use void pointer while passing ECAM ops to the
>>> pci_host_common_probe function as it is anyway casted to struct pci_ecam_ops
>>> inside. For that reason remove the void pointer and pass struct pci_ecam_ops
>>> pointer as is.
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> I was going to ack and push the patch. But then I couldn't apply the patch...
>>
>>>
>>> ---
>>> New in v4
>>> ---
>>>    xen/arch/arm/pci/ecam.c            | 4 ++--
>>>    xen/arch/arm/pci/pci-host-common.c | 6 ++----
>>>    xen/include/asm-arm/pci.h          | 5 +++--
>>>    3 files changed, 7 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>> index 4f71b11c3057..6aeea12a68bf 100644
>>> --- a/xen/arch/arm/pci/ecam.c
>>> +++ b/xen/arch/arm/pci/ecam.c
>>> @@ -24,8 +24,8 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>                                   pci_sbdf_t sbdf, uint32_t where)
>>>    {
>>>        const struct pci_config_window *cfg = bridge->cfg;
>>> -    struct pci_ecam_ops *ops =
>>> -        container_of(bridge->ops, struct pci_ecam_ops, pci_ops);
>>> +    const struct pci_ecam_ops *ops =
>>> +        container_of(bridge->ops, const struct pci_ecam_ops, pci_ops);
>>>        unsigned int devfn_shift = ops->bus_shift - 8;
>>>        void __iomem *base;
>>>    diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>> index 6af845ab9d6c..1aad664b213e 100644
>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>> @@ -194,15 +194,13 @@ static int pci_bus_find_domain_nr(struct dt_device_node *dev)
>>>        return domain;
>>>    }
>>>    -int pci_host_common_probe(struct dt_device_node *dev, const void *data)
>>> +int pci_host_common_probe(struct dt_device_node *dev,
>>> +                          const struct pci_ecam_ops *ops)
>>>    {
>>>        struct pci_host_bridge *bridge;
>>>        struct pci_config_window *cfg;
>>> -    struct pci_ecam_ops *ops;
>>>        int err;
>>
>> ... in staging, the code has an two additional lines here:
>>
>>      if ( dt_device_for_passthrough(dev) )
>>          return 0;
>>
>> Is this series relying on patch that are not yet upstreamed?
> Yes, I mistakenly had a patch below that I didn't want to upstream with
> this series, so this is why. Sorry about that.
> Frankly, I didn't expect patches to be merged from this series now as
> 1) I expect v7

We tend to merge patches in a different order, if there are no 
dependencies and would make sense without the rest of the series. This 
help reducing the size of the series.

> 2) I thought we won't push until the release

For Arm, Stefano and I have been created for-next/XX.YY (for this 
release the branch is for-next/4.17) to queue patches until the tree is 
re-opened for several releases.

> 
> That being said: do you mind if I put your Acked-by in this patch, so
> it is ready for v7?

Sure. So long this is a simple rebase:

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,

[1] 
https://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=shortlog;h=refs/heads/for-next/4.17

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
  2021-11-18  6:59     ` Oleksandr Andrushchenko
@ 2021-11-22 19:31       ` Julien Grall
  2021-11-23  7:23         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-22 19:31 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi Oleksandr,

On 18/11/2021 06:59, Oleksandr Andrushchenko wrote:
> Hi, Julien!
> 
> On 16.11.21 21:22, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> Currently Xen maps all IRQs and memory ranges for all devices except
>>> those marked for passthrough, e.g. it doesn't pay attention to the
>>> "status" property of the node.
>>>
>>> According to the device tree specification [1]:
>>>    - "okay"     Indicates the device is operational.
>>>    - "disabled" Indicates that the device is not presently operational,
>>>                 but it might become operational in the future (for example,
>>>            something is not plugged in, or switched off).
>>>            Refer to the device binding for details on what disabled means
>>>            for a given device.
>>>
>>> So, "disabled" status is device dependent and mapping should be taken by
>>> case-by-case approach with that respect. Although in general Xen should map
>>> IRQs and memory ranges as the disabled devices might become operational
>>
>> Right, this change effectively prevent dom0 to use such device if it becomes operational in the future.
> Or doesn't, see below
>> So this sounds like a big regression.
>>
>> How do you plan to handle it?
> It depends if this is really a regression or is ok and "by design"

Quoting what you wrote in the commit message:

"Indicates that the device is not presently operational, but it might 
become operational in the future."

I read that as it might be possible to have a device turn on after boot.
In fact, looking at Linux, I can see some code allowing to change the 
state of a device after boot (see [4]). So I think we want to keep 
mapping the regions in dom0 at least for some devices.

Note, that other bits of the DT overlay would not work. This should be 
covered by the new series from Xilinx [5].

>>
>>> it
>>> makes it impossible for the other devices, which are not operational in
>>> any case, to skip the mappings.
>>
>> You wrote "makes it impossible for the other devices", but it is not clear to me what's would go wrong when we map a disabled device (Dom0 should not touch it). Do you have an example?
> Ok, here we go. In the SoC I am working with the PCIe controller may run in two modes:
> Root or Endpoint. Not configurable at run-time, so you configure it in the device tree.
> The are two device tree entries for that: for the root [1] and for the endpoint [2].
> So, when you want the controller to be a Root Complex then you put status = "disabled"
> for the Endpoint entry and vise versa.

Thank for the example. I think this is something that should be 
explained in the commit message.

> 
> If you take a look at the nodes they both define the same "reg" and "interrupts",
> effectively making it impossible for me in the patch [3] to actually trap MMIOs: > we skip the mappings for [1] and then, because we assume "disabled" is
> still valid for mappings, we add those for [2].

Technically, you would have the same issues if your device is sharing 
the interrupt or the MMIO page.

The former is fairly common, but IIUC you are not emulated the 
interrupts. Right?

For the latter, I have seen multiple UARTs in the same page (e.g. 
pine64) but not multiple PCI hostbridges. However, this is only with 4KB 
page granularity. We may have the issue arising with 16KB/64KB ones. So 
I would say we could ignore it for now.

> 
> So, this is probably why device tree documentation says about the disabled status
> to be device specific.

Correct. That said, until now, all the devices would have their 
MMIOsregions mapped to dom0. So the interpretation of "disabled" didn't 
matter too much.

> 
> Hope this describes a very valid use-case. At the same time you may argue that
> I just need to remove the offending entry from the device tree which may also be
> valid.

We need to be able to accept any *valid* Device-Tree without any 
modification. At the same time, we want to avoid break potential 
existing use-cases.

I think, you can handle the two gracefully by adding a else in 
pci_host_bridge_mappings() that would remove the P2M mapping if 
bridge->ops->need_p2m_hwdom_mapping() returns false.

This is not pretty but it should do the job for now. Long run, I think 
we should consider to create fake entries in the P2M for emulated regions.

The advantage is we could easily find clash and use them to ignore 
mapping when the region is emulated.

Cheers,

>>
>> Cheers,
>>
> Thank you,
> Oleksandr
> 
> [1] https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi#L843
> [2] https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi#L896
> [3] https://patchwork.kernel.org/project/xen-devel/patch/20211105063326.939843-5-andr2000@gmail.com/

[4] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/of/dynamic.c#n111
[5] 
https://lore.kernel.org/xen-devel/1636441347-133850-1-git-send-email-fnu.vikram@xilinx.com/


> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-22 17:17               ` Julien Grall
@ 2021-11-23  6:31                 ` Oleksandr Andrushchenko
  2021-11-23 16:05                   ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-23  6:31 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko



On 22.11.21 19:17, Julien Grall wrote:
> Hi,
>
> On 22/11/2021 16:23, Oleksandr Andrushchenko wrote:
>> On 22.11.21 17:29, Julien Grall wrote:
>>> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
>> Ok, so just to make it clear:
>>   >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"
>
> That's correct. Thank you!
Ok, so how about
     if ( is_pci_passthrough_enabled() && dt_device_type_is_equal(node, "pci") )
     {
         bool skip = false;

         /*
          * If the parent is also a "pci" device, then "linux,pci-domain"
          * should already be there, so nothing to do then.
          */
         if ( node->parent && dt_device_type_is_equal(node->parent, "pci") )
             skip = true;

         if ( !skip && !dt_find_property(node, "linux,pci-domain", NULL) )
         {
I played with a single if and it looks scary...
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-22 17:36         ` Julien Grall
@ 2021-11-23  6:58           ` Oleksandr Andrushchenko
  2021-11-23 16:12             ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-23  6:58 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 22.11.21 19:36, Julien Grall wrote:
> Hi,
>
> On 18/11/2021 10:46, Oleksandr Andrushchenko wrote:
>> On 18.11.21 09:27, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>>
>>> On 16.11.21 21:12, Julien Grall wrote:
>>>> Hi Oleksandr,
>>>>
>>>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> In order for vPCI to work it needs to maintain guest and hardware
>>>>> domain's views of the configuration space. For example, BARs and
>>>>> COMMAND registers require emulation for guests and the guest view
>>>>> of the registers needs to be in sync with the real contents of the
>>>>> relevant registers. For that ECAM address space needs to also be
>>>>> trapped for the hardware domain, so we need to implement PCI host
>>>>> bridge specific callbacks to properly setup MMIO handlers for those
>>>>> ranges depending on particular host bridge implementation.
>>>>>
>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>> ---
>>>>> Since v5:
>>>>> - add vpci_sbdf_from_gpa helper for gpa to SBDF translation
>>>>> - take bridge's bus start into account while calculating SBDF
>>>>> Since v4:
>>>>> - unsigned int for functions working with count
>>>>> - gate number of MMIO handlers needed for CONFIG_HAS_PCI_MSI
>>>>>      and fix their number, e.g. single handler for PBA and
>>>>>      MSI-X tables (Roger)
>>>>> - re-work code for assigning MMIO handlers to be simpler
>>>>>      and account on the fact that there could multiple host bridges
>>>>>      exist for the hwdom
>>>>> Since v3:
>>>>> - fixed comment formatting
>>>>> Since v2:
>>>>> - removed unneeded assignment (count = 0)
>>>>> - removed unneeded header inclusion
>>>>> - update commit message
>>>>> Since v1:
>>>>>     - Dynamically calculate the number of MMIO handlers required for vPCI
>>>>>       and update the total number accordingly
>>>>>     - s/clb/cb
>>>>>     - Do not introduce a new callback for MMIO handler setup
>>>>> ---
>>>>>     xen/arch/arm/domain.c              |  2 +
>>>>>     xen/arch/arm/pci/pci-host-common.c | 27 ++++++++++++
>>>>>     xen/arch/arm/vpci.c                | 66 ++++++++++++++++++++++++++----
>>>>>     xen/arch/arm/vpci.h                |  6 +++
>>>>>     xen/include/asm-arm/pci.h          |  5 +++
>>>>>     5 files changed, 98 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>>>> index 96e1b235501d..92a6c509e5c5 100644
>>>>> --- a/xen/arch/arm/domain.c
>>>>> +++ b/xen/arch/arm/domain.c
>>>>> @@ -739,6 +739,8 @@ int arch_domain_create(struct domain *d,
>>>>>         if ( (rc = domain_vgic_register(d, &count)) != 0 )
>>>>>             goto fail;
>>>>>     +    count += domain_vpci_get_num_mmio_handlers(d);
>>>>> +
>>>>>         if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>>>>>             goto fail;
>>>>>     diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>>>> index 47104b22b221..0d271a6e8881 100644
>>>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>>>> @@ -289,6 +289,33 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>>>>         return -EINVAL;
>>>>>     }
>>>>>     +int pci_host_iterate_bridges(struct domain *d,
>>>>> +                             int (*cb)(struct domain *d,
>>>>> +                                       struct pci_host_bridge *bridge))
>>>>> +{
>>>>> +    struct pci_host_bridge *bridge;
>>>>> +    int err;
>>>>> +
>>>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>>>> +    {
>>>>> +        err = cb(d, bridge);
>>>>> +        if ( err )
>>>>> +            return err;
>>>>> +    }
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +unsigned int pci_host_get_num_bridges(void)
>>>>> +{
>>>>> +    struct pci_host_bridge *bridge;
>>>>> +    unsigned int count = 0;
>>>> How about making this static and...
>>>>
>>>>> +
>>>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>>>> +        count++;
>>>> ... only call list_for_each_entry() when count is -1? So we would only go through the list once.
>>>>
>>>> This should be fine given hostbridge can only be added during boot (we would need to protect pci_host_bridges with a lock otherwise).
>>> Ok, I can do that
>> I have re-worked the patch so that more code can be re-used:
>>
>>       if ( is_hardware_domain(d) )
>>       {
>>           int count;
>>
>>           count = pci_host_iterate_bridges_and_count(d,
>> vpci_setup_mmio_handler_cb);
>>           if ( count < 0 )
>>               return count;
>>
>>           return 0;
>>       }
>>
>>       register_mmio_handler(d, &vpci_mmio_handler,
>>                             GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>>
>> so pci_host_get_num_bridges goes away.
>
> If this will be the only caller that needs to know the number hostbridges, then I am happy with this appropach. Otherwise, I would prefer to keep the helper pci_host_get_num_bridges().
pci_host_get_num_bridges won't be needed, so it is ok to not introduce pci_host_get_num_bridges
>
>>>>> +
>>>>> +    return count;
>>>>> +}
>>>>> +
>>>>>     /*
>>>>>      * Local variables:
>>>>>      * mode: C
>>>>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>>>>> index 23f45386f4b3..5a6ebd8b9868 100644
>>>>> --- a/xen/arch/arm/vpci.c
>>>>> +++ b/xen/arch/arm/vpci.c
>>>>> @@ -16,16 +16,31 @@
>>>>>       #include <asm/mmio.h>
>>>>>     +static pci_sbdf_t vpci_sbdf_from_gpa(const struct pci_host_bridge *bridge,
>>>>> +                                     paddr_t gpa)
>>>>> +{
>>>>> +    pci_sbdf_t sbdf;
>>>>> +
>>>>> +    if ( bridge )
>>>>> +    {
>>>>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - bridge->cfg->phys_addr);
>>>>> +        sbdf.seg = bridge->segment;
>>>>> +        sbdf.bus += bridge->cfg->busn_start;
>>>>> +    }
>>>>> +    else
>>>>> +        sbdf.sbdf = VPCI_ECAM_BDF(gpa - GUEST_VPCI_ECAM_BASE);
>>>>> +
>>>>> +    return sbdf;
>>>>> +}
>>>>> +
>>>>>     static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>>                               register_t *r, void *p)
>>>>>     {
>>>>> -    pci_sbdf_t sbdf;
>>>>> +    struct pci_host_bridge *bridge = p;
>>>>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>>>>         /* data is needed to prevent a pointer cast on 32bit */
>>>>>         unsigned long data;
>>>>>     -    /* We ignore segment part and always handle segment 0 */
>>>>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>>>>> -
>>>>>         if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
>>>>>                             1U << info->dabt.size, &data) )
>>>>>         {
>>>>> @@ -41,10 +56,8 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>>     static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>>>                                register_t r, void *p)
>>>>>     {
>>>>> -    pci_sbdf_t sbdf;
>>>>> -
>>>>> -    /* We ignore segment part and always handle segment 0 */
>>>>> -    sbdf.sbdf = VPCI_ECAM_BDF(info->gpa - GUEST_VPCI_ECAM_BASE);
>>>>> +    struct pci_host_bridge *bridge = p;
>>>>> +    pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
>>>>>           return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
>>>>>                                1U << info->dabt.size, r);
>>>>> @@ -55,17 +68,54 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>>>>>         .write = vpci_mmio_write,
>>>>>     };
>>>>>     +static int vpci_setup_mmio_handler_cb(struct domain *d,
>>>>> +                                      struct pci_host_bridge *bridge)
>>>>> +{
>>>>> +    struct pci_config_window *cfg = bridge->cfg;
>>>>> +
>>>>> +    register_mmio_handler(d, &vpci_mmio_handler,
>>>>> +                          cfg->phys_addr, cfg->size, bridge);
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>>     int domain_vpci_init(struct domain *d)
>>>>>     {
>>>>>         if ( !has_vpci(d) )
>>>>>             return 0;
>>>>>     +    if ( is_hardware_domain(d) )
>>>>> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler_cb);
>>>>> +
>>>>> +    /* Guest domains use what is programmed in their device tree. */
>>>> I would rather not make the assumption that the guest is using a Device-Tree. So how about:
>>>>
>>>> /*
>>>>    * The hardware domain gets one virtual hostbridge by "real"
>>>>    * hostbridges.
>>>>    * Guests get the virtual platform layout (one virtual host bridge for
>>>>    * now).
>>>>    */
>>>>
>>>> The comment would have to be moved before if ( is_hardware_domain(d) ).
>>> Sure, I can extend the comment
>>       /*
>>        * The hardware domain gets as many MMIOs as required by the
>>        * physical host bridge.
>>        * Guests get the virtual platform layout: one virtual host bridge for now.
>>        */
>
> LGTM.
>
>>
>>>>>         register_mmio_handler(d, &vpci_mmio_handler,
>>>>>                               GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>>>>>           return 0;
>>>>>     }
>>>>>     +unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>>> AFAICT, this function would also be called even if vPCI is not enabled for the domain. So we should add:
>>>>
>>>> if ( !has_vpci(d) )
>>>>     return 0;
>>>>
>>> Good catch, will add
>> Hm... but we have
>>
>> static inline unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> +{
>> +    return 0;
>> +}
>> fir that case
>
> This would only cover the case where Xen was built without vPCI support. When Xen is built with vPCI support, we only want to increase the number of regions for domain with vPCI enabled.
Yes, you are right, I will add the check
>
>>
>>>>> +{
>>>>> +    unsigned int count;
>>>>> +
>>>>> +    if ( is_hardware_domain(d) )
>>>>> +        /* For each PCI host bridge's configuration space. */
>>>>> +        count = pci_host_get_num_bridges();
>>>> This first part makes sense to me. But...
>>>>
>>>>> +    else
>>>> ... I don't understand how the else is related to this commit. Can you clarify it?
>>>>
>>>>> +        /*
>>>>> +         * There's a single MSI-X MMIO handler that deals with both PBA
>>>>> +         * and MSI-X tables per each PCI device being passed through.
>>>>> +         * Maximum number of supported devices is 32 as virtual bus
>>>>> +         * topology emulates the devices as embedded endpoints.
>>>>> +         * +1 for a single emulated host bridge's configuration space.
>>>>> +         */
>>>>> +        count = 1;
>>>>> +#ifdef CONFIG_HAS_PCI_MSI
>>>>> +        count += 32;
>>>> Surely, this is a decision that is based on other factor in the vPCI code. So can use a define and avoid hardcoding the number?
>>> Well, in the later series [1] this is defined via PCI_SLOT(~0) + 1 and there is no dedicated
>>> constant for that. I can use the same here, e.g. s/32/PCI_SLOT(~0) + 1
>
> I would prefer if we introduce a new constant for that. This makes easier to update the code if we decide to increase the number of virtual devices.
>
> However, I am still not sure how the 'else' part is related to this commit. Can you please clarify it?
Well, yes, this is too early for this patch to introduce some future knowledge, so I'll instead have:

unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
{
     if ( !has_vpci(d) )
         return 0;

     if ( is_hardware_domain(d) )
     {
         int ret = pci_host_iterate_bridges_and_count(d, vpci_get_num_handlers_cb);

         return ret < 0 ? 0 : ret;
     }

     /*
      * This is a guest domain:
      *
      * 1 for a single emulated host bridge's configuration space.
      */
     return 1;
}

>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
  2021-11-22 19:31       ` Julien Grall
@ 2021-11-23  7:23         ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-23  7:23 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 22.11.21 21:31, Julien Grall wrote:
> Hi Oleksandr,
>
> On 18/11/2021 06:59, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>>
>> On 16.11.21 21:22, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> Currently Xen maps all IRQs and memory ranges for all devices except
>>>> those marked for passthrough, e.g. it doesn't pay attention to the
>>>> "status" property of the node.
>>>>
>>>> According to the device tree specification [1]:
>>>>    - "okay"     Indicates the device is operational.
>>>>    - "disabled" Indicates that the device is not presently operational,
>>>>                 but it might become operational in the future (for example,
>>>>            something is not plugged in, or switched off).
>>>>            Refer to the device binding for details on what disabled means
>>>>            for a given device.
>>>>
>>>> So, "disabled" status is device dependent and mapping should be taken by
>>>> case-by-case approach with that respect. Although in general Xen should map
>>>> IRQs and memory ranges as the disabled devices might become operational
>>>
>>> Right, this change effectively prevent dom0 to use such device if it becomes operational in the future.
>> Or doesn't, see below
>>> So this sounds like a big regression.
>>>
>>> How do you plan to handle it?
>> It depends if this is really a regression or is ok and "by design"
>
> Quoting what you wrote in the commit message:
>
> "Indicates that the device is not presently operational, but it might become operational in the future."
>
> I read that as it might be possible to have a device turn on after boot.
> In fact, looking at Linux, I can see some code allowing to change the state of a device after boot (see [4]). So I think we want to keep mapping the regions in dom0 at least for some devices.
>
> Note, that other bits of the DT overlay would not work. This should be covered by the new series from Xilinx [5].
>
>>>
>>>> it
>>>> makes it impossible for the other devices, which are not operational in
>>>> any case, to skip the mappings.
>>>
>>> You wrote "makes it impossible for the other devices", but it is not clear to me what's would go wrong when we map a disabled device (Dom0 should not touch it). Do you have an example?
>> Ok, here we go. In the SoC I am working with the PCIe controller may run in two modes:
>> Root or Endpoint. Not configurable at run-time, so you configure it in the device tree.
>> The are two device tree entries for that: for the root [1] and for the endpoint [2].
>> So, when you want the controller to be a Root Complex then you put status = "disabled"
>> for the Endpoint entry and vise versa.
>
> Thank for the example. I think this is something that should be explained in the commit message.
>
>>
>> If you take a look at the nodes they both define the same "reg" and "interrupts",
>> effectively making it impossible for me in the patch [3] to actually trap MMIOs: > we skip the mappings for [1] and then, because we assume "disabled" is
>> still valid for mappings, we add those for [2].
>
> Technically, you would have the same issues if your device is sharing the interrupt or the MMIO page.
>
> The former is fairly common, but IIUC you are not emulated the interrupts. Right?
>
> For the latter, I have seen multiple UARTs in the same page (e.g. pine64) but not multiple PCI hostbridges. However, this is only with 4KB page granularity. We may have the issue arising with 16KB/64KB ones. So I would say we could ignore it for now.
>
>>
>> So, this is probably why device tree documentation says about the disabled status
>> to be device specific.
>
> Correct. That said, until now, all the devices would have their MMIOsregions mapped to dom0. So the interpretation of "disabled" didn't matter too much.
>
>>
>> Hope this describes a very valid use-case. At the same time you may argue that
>> I just need to remove the offending entry from the device tree which may also be
>> valid.
>
> We need to be able to accept any *valid* Device-Tree without any modification. At the same time, we want to avoid break potential existing use-cases.
>
> I think, you can handle the two gracefully by adding a else in pci_host_bridge_mappings() that would remove the P2M mapping if bridge->ops->need_p2m_hwdom_mapping() returns false.
>
> This is not pretty but it should do the job for now. Long run, I think we should consider to create fake entries in the P2M for emulated regions.
>
> The advantage is we could easily find clash and use them to ignore mapping when the region is emulated.
I think I will drop this patch from the series for now: although it is good to have the device tree
unmodified I still see it way easier to do so. I will delete the node from the device tree as in my
case it is anyways decided at compile time which function the PCI host bridge is about to be.
If need be I'll re-introduce this patch in some other form.

Thank you,
Oleksandr
>
> Cheers,
>
>>>
>>> Cheers,
>>>
>> Thank you,
>> Oleksandr
>>
>> [1] https://urldefense.com/v3/__https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi*L843__;Iw!!GF_29dbcQIUBPA!ledLSDAXi6ggPloos-Wz0fWRM3IlSkJVX-H4QrTBSM9rlWvCItQ4D8hzLKHe7fBtoUOnJDmMoQ$ [github[.]com]
>> [2] https://urldefense.com/v3/__https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi*L896__;Iw!!GF_29dbcQIUBPA!ledLSDAXi6ggPloos-Wz0fWRM3IlSkJVX-H4QrTBSM9rlWvCItQ4D8hzLKHe7fBtoUPG8u2ngQ$ [github[.]com]
>> [3] https://urldefense.com/v3/__https://patchwork.kernel.org/project/xen-devel/patch/20211105063326.939843-5-andr2000@gmail.com/__;!!GF_29dbcQIUBPA!ledLSDAXi6ggPloos-Wz0fWRM3IlSkJVX-H4QrTBSM9rlWvCItQ4D8hzLKHe7fBtoUPU79EtaA$ [patchwork[.]kernel[.]org]
>
> [4] https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/of/dynamic.c*n111__;Iw!!GF_29dbcQIUBPA!ledLSDAXi6ggPloos-Wz0fWRM3IlSkJVX-H4QrTBSM9rlWvCItQ4D8hzLKHe7fBtoUPRg1zUUw$ [git[.]kernel[.]org]
> [5] https://urldefense.com/v3/__https://lore.kernel.org/xen-devel/1636441347-133850-1-git-send-email-fnu.vikram@xilinx.com/__;!!GF_29dbcQIUBPA!ledLSDAXi6ggPloos-Wz0fWRM3IlSkJVX-H4QrTBSM9rlWvCItQ4D8hzLKHe7fBtoUMfgs0sdA$ [lore[.]kernel[.]org]
>
>
>>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-23  6:31                 ` Oleksandr Andrushchenko
@ 2021-11-23 16:05                   ` Julien Grall
  2021-11-23 16:44                     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-23 16:05 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi Oleksandr,

On 23/11/2021 06:31, Oleksandr Andrushchenko wrote:
> 
> 
> On 22.11.21 19:17, Julien Grall wrote:
>> Hi,
>>
>> On 22/11/2021 16:23, Oleksandr Andrushchenko wrote:
>>> On 22.11.21 17:29, Julien Grall wrote:
>>>> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
>>> Ok, so just to make it clear:
>>>    >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"
>>
>> That's correct. Thank you!
> Ok, so how about
>       if ( is_pci_passthrough_enabled() && dt_device_type_is_equal(node, "pci") )
>       {
>           bool skip = false;
> 
>           /*
>            * If the parent is also a "pci" device, then "linux,pci-domain"
>            * should already be there, so nothing to do then.
>            */

This comment is a bit confusing. I think what matter if the parent is a 
"pci" device, then the current node must not be a hostbridge. So we can 
skip it. However...

>           if ( node->parent && dt_device_type_is_equal(node->parent, "pci") )
>               skip = true;
> 
>           if ( !skip && !dt_find_property(node, "linux,pci-domain", NULL) )
>           {
> I played with a single if and it looks scary...

... how about introducing an helper that will return true if this node 
is likely an hostbridge?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-23  6:58           ` Oleksandr Andrushchenko
@ 2021-11-23 16:12             ` Julien Grall
  2021-11-23 16:41               ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-23 16:12 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh



On 23/11/2021 06:58, Oleksandr Andrushchenko wrote:
> Hi, Julien!

Hi Oleksandr,

> On 22.11.21 19:36, Julien Grall wrote:
>> On 18/11/2021 10:46, Oleksandr Andrushchenko wrote:
>>> On 18.11.21 09:27, Oleksandr Andrushchenko wrote:
>>>>>> +    unsigned int count;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(d) )
>>>>>> +        /* For each PCI host bridge's configuration space. */
>>>>>> +        count = pci_host_get_num_bridges();
>>>>> This first part makes sense to me. But...
>>>>>
>>>>>> +    else
>>>>> ... I don't understand how the else is related to this commit. Can you clarify it?
>>>>>
>>>>>> +        /*
>>>>>> +         * There's a single MSI-X MMIO handler that deals with both PBA
>>>>>> +         * and MSI-X tables per each PCI device being passed through.
>>>>>> +         * Maximum number of supported devices is 32 as virtual bus
>>>>>> +         * topology emulates the devices as embedded endpoints.
>>>>>> +         * +1 for a single emulated host bridge's configuration space.
>>>>>> +         */
>>>>>> +        count = 1;
>>>>>> +#ifdef CONFIG_HAS_PCI_MSI
>>>>>> +        count += 32;
>>>>> Surely, this is a decision that is based on other factor in the vPCI code. So can use a define and avoid hardcoding the number?
>>>> Well, in the later series [1] this is defined via PCI_SLOT(~0) + 1 and there is no dedicated
>>>> constant for that. I can use the same here, e.g. s/32/PCI_SLOT(~0) + 1
>>
>> I would prefer if we introduce a new constant for that. This makes easier to update the code if we decide to increase the number of virtual devices.
>>
>> However, I am still not sure how the 'else' part is related to this commit. Can you please clarify it?
> Well, yes, this is too early for this patch to introduce some future knowledge, so I'll instead have:
> 
> unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
> {
>       if ( !has_vpci(d) )
>           return 0;
> 
>       if ( is_hardware_domain(d) )
>       {
>           int ret = pci_host_iterate_bridges_and_count(d, vpci_get_num_handlers_cb);
> 
>           return ret < 0 ? 0 : ret;
>       }
> 
>       /*
>        * This is a guest domain:
>        *
>        * 1 for a single emulated host bridge's configuration space.
>        */
>       return 1;

I am afraid that my question stands even with this approach. This patch 
is only meant to handle the hardware domain, therefore the change seems 
to be out of context.

I would prefer if this change is done separately.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-23 16:12             ` Julien Grall
@ 2021-11-23 16:41               ` Oleksandr Andrushchenko
  2021-11-23 16:58                 ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-23 16:41 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 23.11.21 18:12, Julien Grall wrote:
>
>
> On 23/11/2021 06:58, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>
> Hi Oleksandr,
>
>> On 22.11.21 19:36, Julien Grall wrote:
>>> On 18/11/2021 10:46, Oleksandr Andrushchenko wrote:
>>>> On 18.11.21 09:27, Oleksandr Andrushchenko wrote:
>>>>>>> +    unsigned int count;
>>>>>>> +
>>>>>>> +    if ( is_hardware_domain(d) )
>>>>>>> +        /* For each PCI host bridge's configuration space. */
>>>>>>> +        count = pci_host_get_num_bridges();
>>>>>> This first part makes sense to me. But...
>>>>>>
>>>>>>> +    else
>>>>>> ... I don't understand how the else is related to this commit. Can you clarify it?
>>>>>>
>>>>>>> +        /*
>>>>>>> +         * There's a single MSI-X MMIO handler that deals with both PBA
>>>>>>> +         * and MSI-X tables per each PCI device being passed through.
>>>>>>> +         * Maximum number of supported devices is 32 as virtual bus
>>>>>>> +         * topology emulates the devices as embedded endpoints.
>>>>>>> +         * +1 for a single emulated host bridge's configuration space.
>>>>>>> +         */
>>>>>>> +        count = 1;
>>>>>>> +#ifdef CONFIG_HAS_PCI_MSI
>>>>>>> +        count += 32;
>>>>>> Surely, this is a decision that is based on other factor in the vPCI code. So can use a define and avoid hardcoding the number?
>>>>> Well, in the later series [1] this is defined via PCI_SLOT(~0) + 1 and there is no dedicated
>>>>> constant for that. I can use the same here, e.g. s/32/PCI_SLOT(~0) + 1
>>>
>>> I would prefer if we introduce a new constant for that. This makes easier to update the code if we decide to increase the number of virtual devices.
>>>
>>> However, I am still not sure how the 'else' part is related to this commit. Can you please clarify it?
>> Well, yes, this is too early for this patch to introduce some future knowledge, so I'll instead have:
>>
>> unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> {
>>       if ( !has_vpci(d) )
>>           return 0;
>>
>>       if ( is_hardware_domain(d) )
>>       {
>>           int ret = pci_host_iterate_bridges_and_count(d, vpci_get_num_handlers_cb);
>>
>>           return ret < 0 ? 0 : ret;
>>       }
>>
>>       /*
>>        * This is a guest domain:
>>        *
>>        * 1 for a single emulated host bridge's configuration space.
>>        */
>>       return 1;
>
> I am afraid that my question stands even with this approach. This patch is only meant to handle the hardware domain, therefore the change seems to be out of context.
>
> I would prefer if this change is done separately.
While I do agree that MSI part and virtual bus topology are not belonging to this
patch I can't agree with the rest: we already have MMIO handlers for guest domains
and we introduce domain_vpci_get_num_mmio_handlers which must also account
on guests and stay consistent.
So, despite the patch has "hardware domain" in its name it doesn't mean we should
break guests here.
Thus I do think the above is still correct wrt this patch.
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-11-05  6:33 ` [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-11-23 16:42   ` Julien Grall
  2021-11-24  7:42     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-23 16:42 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> PCI host bridges are special devices in terms of implementing PCI
> passthrough. According to [1] the current implementation depends on
> Domain-0 to perform the initialization of the relevant PCI host
> bridge hardware and perform PCI device enumeration. In order to
> achieve that one of the required changes is to not map all the memory
> ranges in map_range_to_domain as we traverse the device tree on startup
> and perform some additional checks if the range needs to be mapped to
> Domain-0.
> 
> The generic PCI host controller device tree binding says [2]:
> - ranges: As described in IEEE Std 1275-1994, but must provide
>            at least a definition of non-prefetchable memory. One
>            or both of prefetchable Memory and IO Space may also
>            be provided.
> 
> - reg   : The Configuration Space base address and size, as accessed
>            from the parent bus.  The base address corresponds to
>            the first bus in the "bus-range" property.  If no
>            "bus-range" is specified, this will be bus 0 (the default).
> 
>  From the above none of the memory ranges from the "ranges" property
> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
> be handled dynamically by vPCI as we assign PCI devices, e.g. each
> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
> as needed by Xen.
> 
> The "reg" property covers not only ECAM space, but may also have other
> then the configuration memory ranges described, for example [3]:
> - reg: Should contain rc_dbi, config registers location and length.
> - reg-names: Must include the following entries:
>     "rc_dbi": controller configuration registers;
>     "config": PCIe configuration space registers.
> 
> This patch makes it possible to not map all the ranges from the
> "ranges" property and also ECAM from the "reg". All the rest from the
> "reg" property still needs to be mapped to Domain-0, so the PCI
> host bridge remains functional in Domain-0.

The commit message is explaining the problematic quite well (thanks for 
that). I think it also wants to explain the approach taken (the fact 
that we are deferring the mapping for hostbridges).

> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
> [2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
> [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> Since v5:
> - remove some need_mapping local variables
> - use own_device in handle_device
> - add __init for pci_ecam_need_p2m_hwdom_mapping
> - make pci_host_bridge_mappings use p2m_mmio_direct_dev directly
> Since v4:
> - update skip_mapping comment
> - add comment why we need to map interrupts to Dom0
> Since v3:
>   - pass struct map_range_data to map_dt_irq_to_domain
>   - remove redundant check from map_range_to_domain
>   - fix handle_device's .skip_mapping
> Since v2:
>   - removed check in map_range_to_domain for PCI_DEV
>     and moved it to handle_device, so the code is
>     simpler
>   - s/map_pci_bridge/skip_mapping
>   - extended comment in pci_host_bridge_mappings
>   - minor code restructure in construct_dom0
>   - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
>     callbacks
>   - unsigned int i; in pci_host_bridge_mappings
> Since v1:
>   - Added better description of why and what needs to be mapped into
>     Domain-0's p2m and what doesn't
>   - Do not do any mappings for PCI devices while traversing the DT
>   - Walk all the bridges and make required mappings in one go
> ---
>   xen/arch/arm/domain_build.c        | 67 +++++++++++++++++-------------
>   xen/arch/arm/pci/ecam.c            | 14 +++++++
>   xen/arch/arm/pci/pci-host-common.c | 50 ++++++++++++++++++++++
>   xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>   xen/include/asm-arm/pci.h          | 10 +++++
>   xen/include/asm-arm/setup.h        | 13 ++++++
>   6 files changed, 126 insertions(+), 29 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f7fcb1400c19..c7d992456ca7 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -10,7 +10,6 @@
>   #include <asm/regs.h>
>   #include <xen/errno.h>
>   #include <xen/err.h>
> -#include <xen/device_tree.h>

You are still using helpers defined by this header. So I would keep the 
include even if it may have been included by another one.

>   #include <xen/libfdt/libfdt.h>
>   #include <xen/guest_access.h>
>   #include <xen/iocap.h>
> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
>   }
>   custom_param("dom0_mem", parse_dom0_mem);
>   
> -struct map_range_data
> -{
> -    struct domain *d;
> -    p2m_type_t p2mt;
> -};
> -
>   /* Override macros from asm/page.h to make them work with mfn_t */
>   #undef virt_to_mfn
>   #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> @@ -1676,10 +1669,10 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>                                          const struct dt_irq *dt_irq,
>                                          void *data)
>   {
> -    struct domain *d = data;
> +    struct map_range_data *mr_data = data;
> +    struct domain *d = mr_data->d;
>       unsigned int irq = dt_irq->irq;
>       int res;
> -    bool need_mapping = !dt_device_for_passthrough(dev);
>   
>       if ( irq < NR_LOCAL_IRQS )
>       {
> @@ -1698,18 +1691,16 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>           return res;
>       }
>   
> -    res = map_irq_to_domain(d, irq, need_mapping, dt_node_name(dev));
> +    res = map_irq_to_domain(d, irq, !mr_data->skip_mapping, dt_node_name(dev));
>   
>       return 0;
>   }
>   
> -static int __init map_range_to_domain(const struct dt_device_node *dev,
> -                                      u64 addr, u64 len,
> -                                      void *data)
> +int __init map_range_to_domain(const struct dt_device_node *dev,
> +                               u64 addr, u64 len, void *data)
>   {
>       struct map_range_data *mr_data = data;
>       struct domain *d = mr_data->d;
> -    bool need_mapping = !dt_device_for_passthrough(dev);
>       int res;
>   
>       res = iomem_permit_access(d, paddr_to_pfn(addr),
> @@ -1723,7 +1714,7 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>           return res;
>       }
>   
> -    if ( need_mapping )
> +    if ( !mr_data->skip_mapping )
>       {
>           res = map_regions_p2mt(d,
>                                  gaddr_to_gfn(addr),
> @@ -1752,23 +1743,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>    * then we may need to perform additional mappings in order to make
>    * the child resources available to domain 0.
>    */
> -static int __init map_device_children(struct domain *d,
> -                                      const struct dt_device_node *dev,
> -                                      p2m_type_t p2mt)
> +static int __init map_device_children(const struct dt_device_node *dev,
> +                                      struct map_range_data *mr_data)
>   {
> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> -    int ret;
> -
>       if ( dt_device_type_is_equal(dev, "pci") )
>       {
> +        int ret;
> +
>           dt_dprintk("Mapping children of %s to guest\n",
>                      dt_node_full_name(dev));
>   
> -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
>           if ( ret < 0 )
>               return ret;
>   
> -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
>           if ( ret < 0 )
>               return ret;
>       }
> @@ -1848,14 +1837,28 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>       unsigned int i;
>       int res;
>       u64 addr, size;
> -    bool need_mapping = !dt_device_for_passthrough(dev);
> +    bool own_device = !dt_device_for_passthrough(dev);
> +    /*
> +     * For PCI passthrough we only need to remap to Dom0 the interrupts
> +     * and memory ranges from "reg" property which cover controller's
> +     * configuration registers and such. PCIe configuration space registers
> +     * of the PCIe Root Complex and PCIe aperture should not be mapped
> +     * automatically to Dom0.
> +     */

I would clarify in this comment (and the commit message) that this only 
cover hostbridge that are been shared between dom0 and Xen.

But I find the comment confusing because I would expect to explain this...

> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2mt,
> +        .skip_mapping = !own_device ||
> +                        (is_pci_passthrough_enabled() &&
> +                        (device_get_class(dev) == DEVICE_PCI_HOSTBRIDGE))

... line. Instead, it explains the global behavior. I would rework the 
comment to something like:

/*
  * We want to avoid mappings the MMIO in dom0 for the following cases:
  *   - The device is owned by dom0 (i.e. it has been flagged for
  *     passthrough).
  *   - PCI hostbridges with driver in Xen. They will later on by
  *     pci_host_bridge_mappings().
  */

> +    };
>   
>       naddr = dt_number_of_address(dev);
>   
>       dt_dprintk("%s passthrough = %d naddr = %u\n",
> -               dt_node_full_name(dev), need_mapping, naddr);
> +               dt_node_full_name(dev), own_device, naddr);
>   
> -    if ( need_mapping )
> +    if ( own_device )
>       {
>           dt_dprintk("Check if %s is behind the IOMMU and add it\n",
>                      dt_node_full_name(dev));
> @@ -1881,14 +1884,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>           }
>       }
>   
> -    res = handle_device_interrupts(d, dev, need_mapping);
> +    res = handle_device_interrupts(d, dev, own_device);
>       if ( res < 0 )
>           return res;
>   
>       /* Give permission and map MMIOs */
>       for ( i = 0; i < naddr; i++ )
>       {
> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>           res = dt_device_get_address(dev, i, &addr, &size);
>           if ( res )
>           {
> @@ -1902,7 +1904,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>               return res;
>       }
>   
> -    res = map_device_children(d, dev, p2mt);
> +    res = map_device_children(dev, &mr_data);
>       if ( res )
>           return res;
>   
> @@ -3060,7 +3062,14 @@ static int __init construct_dom0(struct domain *d)
>           return rc;
>   
>       if ( acpi_disabled )
> +    {
>           rc = prepare_dtb_hwdom(d, &kinfo);
> +        if ( rc < 0 )
> +            return rc;
> +#ifdef CONFIG_HAS_PCI
> +        rc = pci_host_bridge_mappings(d);
> +#endif
> +    }
>       else
>           rc = prepare_acpi(d, &kinfo);
>   
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 602d00799c8d..4f71b11c3057 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>       return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
>   }
>   
> +bool __init pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> +                                            struct pci_host_bridge *bridge,
> +                                            uint64_t addr)
> +{
> +    struct pci_config_window *cfg = bridge->cfg;
> +
> +    /*
> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
> +     * so we can trap access to it.
> +     */
> +    return cfg->phys_addr != addr;
> +}
> +
>   /* ECAM ops */
>   const struct pci_ecam_ops pci_generic_ecam_ops = {
>       .bus_shift  = 20,
> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>           .map_bus                = pci_ecam_map_bus,
>           .read                   = pci_generic_config_read,
>           .write                  = pci_generic_config_write,
> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>       }
>   };
>   
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 0d271a6e8881..6af845ab9d6c 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -22,6 +22,8 @@
>   #include <xen/sched.h>
>   #include <xen/vmap.h>
>   
> +#include <asm/setup.h>
> +
>   /*
>    * List for all the pci host bridges.
>    */
> @@ -316,6 +318,54 @@ unsigned int pci_host_get_num_bridges(void)
>       return count;
>   }
>   
> +int __init pci_host_bridge_mappings(struct domain *d)
> +{
> +    struct pci_host_bridge *bridge;
> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2m_mmio_direct_dev,
> +        .skip_mapping = false
> +    };
> +
> +    /*
> +     * For each PCI host bridge we need to only map those ranges
> +     * which are used by Domain-0 to properly initialize the bridge,
> +     * e.g. we do not want to map ECAM configuration space which lives in
> +     * "reg" device tree property, but we want to map other regions of
> +     * the host bridge. The PCI aperture defined by the "ranges" device
> +     * tree property should also be skipped.
> +     */

This seems to explain the purpose of the function. So I would move the 
commment on top of pci_host_bridge_mappings().

> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        const struct dt_device_node *dev = bridge->dt_node;
> +        unsigned int i;
> +
> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
> +        {
> +            uint64_t addr, size;
> +            int err;
> +
> +            err = dt_device_get_address(dev, i, &addr, &size);
> +            if ( err )
> +            {
> +                printk(XENLOG_ERR
> +                       "Unable to retrieve address range index=%u for %s\n",
> +                       i, dt_node_full_name(dev));
> +                return err;
> +            }
> +
> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
> +            {
> +                err = map_range_to_domain(dev, addr, size, &mr_data);
> +                if ( err )
> +                    return err;
> +            }
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>   /*
>    * Local variables:
>    * mode: C
> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
> index 516982bca833..101edb8593c1 100644
> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>           .map_bus                = pci_ecam_map_bus,
>           .read                   = pci_generic_config_read,
>           .write                  = pci_generic_config_write,
> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>       }
>   };
>   
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 969333043431..3d706fdd1d88 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -17,6 +17,8 @@
>   
>   #ifdef CONFIG_HAS_PCI
>   
> +#include <asm/p2m.h>
> +
>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>   
>   extern bool pci_passthrough_enabled;
> @@ -73,6 +75,9 @@ struct pci_ops {
>                   uint32_t reg, uint32_t len, uint32_t *value);
>       int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>                    uint32_t reg, uint32_t len, uint32_t value);
> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
> +                                   struct pci_host_bridge *bridge,
> +                                   uint64_t addr);
>   };
>   
>   /*
> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>                                uint32_t reg, uint32_t len, uint32_t value);
>   void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>                                  pci_sbdf_t sbdf, uint32_t where);
> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> +                                     struct pci_host_bridge *bridge,
> +                                     uint64_t addr);
>   struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>   struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
>   int pci_get_host_bridge_segment(const struct dt_device_node *node,
> @@ -115,6 +123,8 @@ int pci_host_iterate_bridges(struct domain *d,
>                                           struct pci_host_bridge *bridge));
>   unsigned int pci_host_get_num_bridges(void);
>   
> +int pci_host_bridge_mappings(struct domain *d);
> +
>   #else   /*!CONFIG_HAS_PCI*/
>   
>   struct arch_pci_dev { };
> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> index 95da0b7ab9cd..88d9673db817 100644
> --- a/xen/include/asm-arm/setup.h
> +++ b/xen/include/asm-arm/setup.h
> @@ -2,6 +2,8 @@
>   #define __ARM_SETUP_H_
>   
>   #include <public/version.h>
> +#include <asm/p2m.h>
> +#include <xen/device_tree.h>
>   
>   #define MIN_FDT_ALIGN 8
>   #define MAX_FDT_SIZE SZ_2M
> @@ -77,6 +79,14 @@ struct bootinfo {
>   #endif
>   };
>   
> +struct map_range_data
> +{
> +    struct domain *d;
> +    p2m_type_t p2mt;
> +    /* Set if mapping of the memory ranges must be skipped. */
> +    bool skip_mapping;
> +};
> +
>   extern struct bootinfo bootinfo;
>   
>   extern domid_t max_init_domid;
> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>   u32 device_tree_get_u32(const void *fdt, int node,
>                           const char *prop_name, u32 dflt);
>   
> +int map_range_to_domain(const struct dt_device_node *dev,
> +                        u64 addr, u64 len, void *data);
> +
>   #endif
>   /*
>    * Local variables:
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-23 16:05                   ` Julien Grall
@ 2021-11-23 16:44                     ` Oleksandr Andrushchenko
  2021-11-23 17:15                       ` Julien Grall
  0 siblings, 1 reply; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-23 16:44 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

Hi, Julien!

On 23.11.21 18:05, Julien Grall wrote:
> Hi Oleksandr,
>
> On 23/11/2021 06:31, Oleksandr Andrushchenko wrote:
>>
>>
>> On 22.11.21 19:17, Julien Grall wrote:
>>> Hi,
>>>
>>> On 22/11/2021 16:23, Oleksandr Andrushchenko wrote:
>>>> On 22.11.21 17:29, Julien Grall wrote:
>>>>> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
>>>> Ok, so just to make it clear:
>>>>    >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"
>>>
>>> That's correct. Thank you!
>> Ok, so how about
>>       if ( is_pci_passthrough_enabled() && dt_device_type_is_equal(node, "pci") )
>>       {
>>           bool skip = false;
>>
>>           /*
>>            * If the parent is also a "pci" device, then "linux,pci-domain"
>>            * should already be there, so nothing to do then.
>>            */
>
> This comment is a bit confusing.
Do you have something on your mind?
> I think what matter if the parent is a "pci" device, then the current node must not be a hostbridge. So we can skip it.
By skipping you only mean we do not need to add/assign "linux,pci-domain", right?
> However...
>
>>           if ( node->parent && dt_device_type_is_equal(node->parent, "pci") )
>>               skip = true;
>>
>>           if ( !skip && !dt_find_property(node, "linux,pci-domain", NULL) )
>>           {
>> I played with a single if and it looks scary...
>
> ... how about introducing an helper that will return true if this node is likely an hostbridge?
This is yet a single use of such a check: why would we need a helper and what would that
helper do?
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-23 16:41               ` Oleksandr Andrushchenko
@ 2021-11-23 16:58                 ` Julien Grall
  2021-11-24  7:22                   ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-23 16:58 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi,

On 23/11/2021 16:41, Oleksandr Andrushchenko wrote:
> On 23.11.21 18:12, Julien Grall wrote:
>> On 23/11/2021 06:58, Oleksandr Andrushchenko wrote:
>>> unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>> {
>>>        if ( !has_vpci(d) )
>>>            return 0;
>>>
>>>        if ( is_hardware_domain(d) )
>>>        {
>>>            int ret = pci_host_iterate_bridges_and_count(d, vpci_get_num_handlers_cb);
>>>
>>>            return ret < 0 ? 0 : ret;
>>>        }
>>>
>>>        /*
>>>         * This is a guest domain:
>>>         *
>>>         * 1 for a single emulated host bridge's configuration space.
>>>         */
>>>        return 1;
>>
>> I am afraid that my question stands even with this approach. This patch is only meant to handle the hardware domain, therefore the change seems to be out of context.
>>
>> I would prefer if this change is done separately.
> While I do agree that MSI part and virtual bus topology are not belonging to this
> patch I can't agree with the rest: we already have MMIO handlers for guest domains
> and we introduce domain_vpci_get_num_mmio_handlers which must also account
> on guests and stay consistent.
> So, despite the patch has "hardware domain" in its name it doesn't mean we should
> break guests here.

We were already registering the handler for guest domain before your 
patch. So this is nothing new.

At the moment, we always allocate an extra 16 slot for IO handlers (see 
MAX_IO_HANDLER). So we are not breaking anything. Instead, this is 
simply a latent bug.

> Thus I do think the above is still correct wrt this patch.

The idea of splitting patch is to separate bug fix from new code. This 
helps backporting and review.

In this case, we don't care about backport (PCI passthrough is no 
supported) and the fix a simple enough. So I am not going to insist on 
splitting to a separate patch.

However, this change *must* be explained in the commit message.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-23 16:44                     ` Oleksandr Andrushchenko
@ 2021-11-23 17:15                       ` Julien Grall
  2021-11-24  6:54                         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 43+ messages in thread
From: Julien Grall @ 2021-11-23 17:15 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi,

On 23/11/2021 16:44, Oleksandr Andrushchenko wrote:
> On 23.11.21 18:05, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 23/11/2021 06:31, Oleksandr Andrushchenko wrote:
>>>
>>>
>>> On 22.11.21 19:17, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 22/11/2021 16:23, Oleksandr Andrushchenko wrote:
>>>>> On 22.11.21 17:29, Julien Grall wrote:
>>>>>> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
>>>>> Ok, so just to make it clear:
>>>>>     >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"
>>>>
>>>> That's correct. Thank you!
>>> Ok, so how about
>>>        if ( is_pci_passthrough_enabled() && dt_device_type_is_equal(node, "pci") )
>>>        {
>>>            bool skip = false;
>>>
>>>            /*
>>>             * If the parent is also a "pci" device, then "linux,pci-domain"
>>>             * should already be there, so nothing to do then.
>>>             */
>>
>> This comment is a bit confusing.
> Do you have something on your mind?

Yes, explain that we only want to cover hostbridge (see my reply on the 
previous answer).

>> I think what matter if the parent is a "pci" device, then the current node must not be a hostbridge. So we can skip it.
> By skipping you only mean we do not need to add/assign "linux,pci-domain", right?

Yes.

>> However...
>>
>>>            if ( node->parent && dt_device_type_is_equal(node->parent, "pci") )
>>>                skip = true;
>>>
>>>            if ( !skip && !dt_find_property(node, "linux,pci-domain", NULL) )
>>>            {
>>> I played with a single if and it looks scary...
>>
>> ... how about introducing an helper that will return true if this node is likely an hostbridge?
> This is yet a single use of such a check: why would we need a helper and what would that
> helper do?

I like splitting code in multiple functions even if they are only called 
once because this:
   1) keeps the functions line count small
   2) easier to understand what is the purpose of the check

In fact, I would actually move the handling for "linux,pci-domain" in a 
separate helper. Something like:

/*
  * When PCI passthrough is available we want to keep the
  * "linux,pci-domain" in sync for every hostbridge.
  *
  * Xen may not have a driver for all the hostbridges. So we have
  * to write an heuristic to detect whether a device node describes
  * an hostbridge.
  *
  * The current heuristic assumes that a device is an hostbridge
  * if the type is "pci" and then parent type is not "pci".
  */
static int handle_linux_pci_domain(struct dt_device *node)
{
         if ( !is_pci_passthrough_enabled() )
           return 0;

	if ( !dt_device_type_is_equal(node, "pci") )
           return 0;

         if ( node->parent && dt_device_type_is_equal(node->parent) )
           return 0;

         if ( dt_find_property(node, "linux,pci-domain", NULL )
           return 0;

         /* Allocate and create the linux,pci-domain */
}

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices
  2021-11-23 17:15                       ` Julien Grall
@ 2021-11-24  6:54                         ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-24  6:54 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi, Julien!

On 23.11.21 19:15, Julien Grall wrote:
> Hi,
>
> On 23/11/2021 16:44, Oleksandr Andrushchenko wrote:
>> On 23.11.21 18:05, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 23/11/2021 06:31, Oleksandr Andrushchenko wrote:
>>>>
>>>>
>>>> On 22.11.21 19:17, Julien Grall wrote:
>>>>> Hi,
>>>>>
>>>>> On 22/11/2021 16:23, Oleksandr Andrushchenko wrote:
>>>>>> On 22.11.21 17:29, Julien Grall wrote:
>>>>>>> I would prefer to go with my way. This can be refined in the future if we find Device-Tree that matches what you wrote.
>>>>>> Ok, so just to make it clear:
>>>>>>     >a PCI device would always be described as a child of the hostbridges. So I would rework the 'if' to also check if the parent type is not "pci"
>>>>>
>>>>> That's correct. Thank you!
>>>> Ok, so how about
>>>>        if ( is_pci_passthrough_enabled() && dt_device_type_is_equal(node, "pci") )
>>>>        {
>>>>            bool skip = false;
>>>>
>>>>            /*
>>>>             * If the parent is also a "pci" device, then "linux,pci-domain"
>>>>             * should already be there, so nothing to do then.
>>>>             */
>>>
>>> This comment is a bit confusing.
>> Do you have something on your mind?
>
> Yes, explain that we only want to cover hostbridge (see my reply on the previous answer).
I guess this will be explained by the comment to handle_linux_pci_domain below
>
>>> I think what matter if the parent is a "pci" device, then the current node must not be a hostbridge. So we can skip it.
>> By skipping you only mean we do not need to add/assign "linux,pci-domain", right?
>
> Yes.
>
>>> However...
>>>
>>>>            if ( node->parent && dt_device_type_is_equal(node->parent, "pci") )
>>>>                skip = true;
>>>>
>>>>            if ( !skip && !dt_find_property(node, "linux,pci-domain", NULL) )
>>>>            {
>>>> I played with a single if and it looks scary...
>>>
>>> ... how about introducing an helper that will return true if this node is likely an hostbridge?
>> This is yet a single use of such a check: why would we need a helper and what would that
>> helper do?
>
> I like splitting code in multiple functions even if they are only called once because this:
>   1) keeps the functions line count small
>   2) easier to understand what is the purpose of the check
>
> In fact, I would actually move the handling for "linux,pci-domain" in a separate helper. Something like:
>
> /*
>  * When PCI passthrough is available we want to keep the
>  * "linux,pci-domain" in sync for every hostbridge.
>  *
>  * Xen may not have a driver for all the hostbridges. So we have
>  * to write an heuristic to detect whether a device node describes
>  * an hostbridge.
>  *
>  * The current heuristic assumes that a device is an hostbridge
>  * if the type is "pci" and then parent type is not "pci".
>  */
> static int handle_linux_pci_domain(struct dt_device *node)
> {
>         if ( !is_pci_passthrough_enabled() )
>           return 0;
>
>     if ( !dt_device_type_is_equal(node, "pci") )
>           return 0;
>
>         if ( node->parent && dt_device_type_is_equal(node->parent) )
>           return 0;
>
>         if ( dt_find_property(node, "linux,pci-domain", NULL )
>           return 0;
>
>         /* Allocate and create the linux,pci-domain */
> }
>
I'm fine with this, will move, thanks
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain
  2021-11-23 16:58                 ` Julien Grall
@ 2021-11-24  7:22                   ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-24  7:22 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi, Julien!

On 23.11.21 18:58, Julien Grall wrote:
> Hi,
>
> On 23/11/2021 16:41, Oleksandr Andrushchenko wrote:
>> On 23.11.21 18:12, Julien Grall wrote:
>>> On 23/11/2021 06:58, Oleksandr Andrushchenko wrote:
>>>> unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
>>>> {
>>>>        if ( !has_vpci(d) )
>>>>            return 0;
>>>>
>>>>        if ( is_hardware_domain(d) )
>>>>        {
>>>>            int ret = pci_host_iterate_bridges_and_count(d, vpci_get_num_handlers_cb);
>>>>
>>>>            return ret < 0 ? 0 : ret;
>>>>        }
>>>>
>>>>        /*
>>>>         * This is a guest domain:
>>>>         *
>>>>         * 1 for a single emulated host bridge's configuration space.
>>>>         */
>>>>        return 1;
>>>
>>> I am afraid that my question stands even with this approach. This patch is only meant to handle the hardware domain, therefore the change seems to be out of context.
>>>
>>> I would prefer if this change is done separately.
>> While I do agree that MSI part and virtual bus topology are not belonging to this
>> patch I can't agree with the rest: we already have MMIO handlers for guest domains
>> and we introduce domain_vpci_get_num_mmio_handlers which must also account
>> on guests and stay consistent.
>> So, despite the patch has "hardware domain" in its name it doesn't mean we should
>> break guests here.
>
> We were already registering the handler for guest domain before your patch. So this is nothing new.
>
> At the moment, we always allocate an extra 16 slot for IO handlers (see MAX_IO_HANDLER). So we are not breaking anything. Instead, this is simply a latent bug.
Agree
>
>> Thus I do think the above is still correct wrt this patch.
>
> The idea of splitting patch is to separate bug fix from new code. This helps backporting and review.
>
> In this case, we don't care about backport (PCI passthrough is no supported) and the fix a simple enough. So I am not going to insist on splitting to a separate patch.
>
> However, this change *must* be explained in the commit message.
I will add a dedicated patch to fix that
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-11-23 16:42   ` Julien Grall
@ 2021-11-24  7:42     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 43+ messages in thread
From: Oleksandr Andrushchenko @ 2021-11-24  7:42 UTC (permalink / raw)
  To: Julien Grall, Oleksandr Andrushchenko, xen-devel
  Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh

Hi, Julien!

On 23.11.21 18:42, Julien Grall wrote:
> Hi Oleksandr,
>
> On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> PCI host bridges are special devices in terms of implementing PCI
>> passthrough. According to [1] the current implementation depends on
>> Domain-0 to perform the initialization of the relevant PCI host
>> bridge hardware and perform PCI device enumeration. In order to
>> achieve that one of the required changes is to not map all the memory
>> ranges in map_range_to_domain as we traverse the device tree on startup
>> and perform some additional checks if the range needs to be mapped to
>> Domain-0.
>>
>> The generic PCI host controller device tree binding says [2]:
>> - ranges: As described in IEEE Std 1275-1994, but must provide
>>            at least a definition of non-prefetchable memory. One
>>            or both of prefetchable Memory and IO Space may also
>>            be provided.
>>
>> - reg   : The Configuration Space base address and size, as accessed
>>            from the parent bus.  The base address corresponds to
>>            the first bus in the "bus-range" property.  If no
>>            "bus-range" is specified, this will be bus 0 (the default).
>>
>>  From the above none of the memory ranges from the "ranges" property
>> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
>> be handled dynamically by vPCI as we assign PCI devices, e.g. each
>> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
>> as needed by Xen.
>>
>> The "reg" property covers not only ECAM space, but may also have other
>> then the configuration memory ranges described, for example [3]:
>> - reg: Should contain rc_dbi, config registers location and length.
>> - reg-names: Must include the following entries:
>>     "rc_dbi": controller configuration registers;
>>     "config": PCIe configuration space registers.
>>
>> This patch makes it possible to not map all the ranges from the
>> "ranges" property and also ECAM from the "reg". All the rest from the
>> "reg" property still needs to be mapped to Domain-0, so the PCI
>> host bridge remains functional in Domain-0.
>
> The commit message is explaining the problematic quite well (thanks for that). I think it also wants to explain the approach taken (the fact that we are deferring the mapping for hostbridges).
Will add
>
>>
>> [1] https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html__;!!GF_29dbcQIUBPA!mdKldRDDMc62Zlr3zHPdIXeXjHNI2H6pGE7ZHNvbK3xizCNe0mhV7jMF5WLGZ7bA5B47tK4zFA$ [lists[.]xenproject[.]org]
>> [2] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt__;!!GF_29dbcQIUBPA!mdKldRDDMc62Zlr3zHPdIXeXjHNI2H6pGE7ZHNvbK3xizCNe0mhV7jMF5WLGZ7bA5B7UwSKSww$ [kernel[.]org]
>> [3] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt__;!!GF_29dbcQIUBPA!mdKldRDDMc62Zlr3zHPdIXeXjHNI2H6pGE7ZHNvbK3xizCNe0mhV7jMF5WLGZ7bA5B6NizsNXQ$ [kernel[.]org]
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> ---
>> Since v5:
>> - remove some need_mapping local variables
>> - use own_device in handle_device
>> - add __init for pci_ecam_need_p2m_hwdom_mapping
>> - make pci_host_bridge_mappings use p2m_mmio_direct_dev directly
>> Since v4:
>> - update skip_mapping comment
>> - add comment why we need to map interrupts to Dom0
>> Since v3:
>>   - pass struct map_range_data to map_dt_irq_to_domain
>>   - remove redundant check from map_range_to_domain
>>   - fix handle_device's .skip_mapping
>> Since v2:
>>   - removed check in map_range_to_domain for PCI_DEV
>>     and moved it to handle_device, so the code is
>>     simpler
>>   - s/map_pci_bridge/skip_mapping
>>   - extended comment in pci_host_bridge_mappings
>>   - minor code restructure in construct_dom0
>>   - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
>>     callbacks
>>   - unsigned int i; in pci_host_bridge_mappings
>> Since v1:
>>   - Added better description of why and what needs to be mapped into
>>     Domain-0's p2m and what doesn't
>>   - Do not do any mappings for PCI devices while traversing the DT
>>   - Walk all the bridges and make required mappings in one go
>> ---
>>   xen/arch/arm/domain_build.c        | 67 +++++++++++++++++-------------
>>   xen/arch/arm/pci/ecam.c            | 14 +++++++
>>   xen/arch/arm/pci/pci-host-common.c | 50 ++++++++++++++++++++++
>>   xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>>   xen/include/asm-arm/pci.h          | 10 +++++
>>   xen/include/asm-arm/setup.h        | 13 ++++++
>>   6 files changed, 126 insertions(+), 29 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index f7fcb1400c19..c7d992456ca7 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -10,7 +10,6 @@
>>   #include <asm/regs.h>
>>   #include <xen/errno.h>
>>   #include <xen/err.h>
>> -#include <xen/device_tree.h>
>
> You are still using helpers defined by this header. So I would keep the include even if it may have been included by another one.
Sure, will remove this change
>
>>   #include <xen/libfdt/libfdt.h>
>>   #include <xen/guest_access.h>
>>   #include <xen/iocap.h>
>> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
>>   }
>>   custom_param("dom0_mem", parse_dom0_mem);
>>   -struct map_range_data
>> -{
>> -    struct domain *d;
>> -    p2m_type_t p2mt;
>> -};
>> -
>>   /* Override macros from asm/page.h to make them work with mfn_t */
>>   #undef virt_to_mfn
>>   #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
>> @@ -1676,10 +1669,10 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>                                          const struct dt_irq *dt_irq,
>>                                          void *data)
>>   {
>> -    struct domain *d = data;
>> +    struct map_range_data *mr_data = data;
>> +    struct domain *d = mr_data->d;
>>       unsigned int irq = dt_irq->irq;
>>       int res;
>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>>         if ( irq < NR_LOCAL_IRQS )
>>       {
>> @@ -1698,18 +1691,16 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>           return res;
>>       }
>>   -    res = map_irq_to_domain(d, irq, need_mapping, dt_node_name(dev));
>> +    res = map_irq_to_domain(d, irq, !mr_data->skip_mapping, dt_node_name(dev));
>>         return 0;
>>   }
>>   -static int __init map_range_to_domain(const struct dt_device_node *dev,
>> -                                      u64 addr, u64 len,
>> -                                      void *data)
>> +int __init map_range_to_domain(const struct dt_device_node *dev,
>> +                               u64 addr, u64 len, void *data)
>>   {
>>       struct map_range_data *mr_data = data;
>>       struct domain *d = mr_data->d;
>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>>       int res;
>>         res = iomem_permit_access(d, paddr_to_pfn(addr),
>> @@ -1723,7 +1714,7 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>           return res;
>>       }
>>   -    if ( need_mapping )
>> +    if ( !mr_data->skip_mapping )
>>       {
>>           res = map_regions_p2mt(d,
>>                                  gaddr_to_gfn(addr),
>> @@ -1752,23 +1743,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>    * then we may need to perform additional mappings in order to make
>>    * the child resources available to domain 0.
>>    */
>> -static int __init map_device_children(struct domain *d,
>> -                                      const struct dt_device_node *dev,
>> -                                      p2m_type_t p2mt)
>> +static int __init map_device_children(const struct dt_device_node *dev,
>> +                                      struct map_range_data *mr_data)
>>   {
>> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>> -    int ret;
>> -
>>       if ( dt_device_type_is_equal(dev, "pci") )
>>       {
>> +        int ret;
>> +
>>           dt_dprintk("Mapping children of %s to guest\n",
>>                      dt_node_full_name(dev));
>>   -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
>> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
>>           if ( ret < 0 )
>>               return ret;
>>   -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
>> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
>>           if ( ret < 0 )
>>               return ret;
>>       }
>> @@ -1848,14 +1837,28 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>       unsigned int i;
>>       int res;
>>       u64 addr, size;
>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>> +    bool own_device = !dt_device_for_passthrough(dev);
>> +    /*
>> +     * For PCI passthrough we only need to remap to Dom0 the interrupts
>> +     * and memory ranges from "reg" property which cover controller's
>> +     * configuration registers and such. PCIe configuration space registers
>> +     * of the PCIe Root Complex and PCIe aperture should not be mapped
>> +     * automatically to Dom0.
>> +     */
>
> I would clarify in this comment (and the commit message) that this only cover hostbridge that are been shared between dom0 and Xen.
>
> But I find the comment confusing because I would expect to explain this...
>
>> +    struct map_range_data mr_data = {
>> +        .d = d,
>> +        .p2mt = p2mt,
>> +        .skip_mapping = !own_device ||
>> +                        (is_pci_passthrough_enabled() &&
>> +                        (device_get_class(dev) == DEVICE_PCI_HOSTBRIDGE))
>
> ... line. Instead, it explains the global behavior. I would rework the comment to something like:
>
> /*
>  * We want to avoid mappings the MMIO in dom0 for the following cases:
>  *   - The device is owned by dom0 (i.e. it has been flagged for
>  *     passthrough).
>  *   - PCI hostbridges with driver in Xen. They will later on by
>  *     pci_host_bridge_mappings().
>  */
Ok, sounds good
>
>> +    };
>>         naddr = dt_number_of_address(dev);
>>         dt_dprintk("%s passthrough = %d naddr = %u\n",
>> -               dt_node_full_name(dev), need_mapping, naddr);
>> +               dt_node_full_name(dev), own_device, naddr);
>>   -    if ( need_mapping )
>> +    if ( own_device )
>>       {
>>           dt_dprintk("Check if %s is behind the IOMMU and add it\n",
>>                      dt_node_full_name(dev));
>> @@ -1881,14 +1884,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>           }
>>       }
>>   -    res = handle_device_interrupts(d, dev, need_mapping);
>> +    res = handle_device_interrupts(d, dev, own_device);
>>       if ( res < 0 )
>>           return res;
>>         /* Give permission and map MMIOs */
>>       for ( i = 0; i < naddr; i++ )
>>       {
>> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>>           res = dt_device_get_address(dev, i, &addr, &size);
>>           if ( res )
>>           {
>> @@ -1902,7 +1904,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>               return res;
>>       }
>>   -    res = map_device_children(d, dev, p2mt);
>> +    res = map_device_children(dev, &mr_data);
>>       if ( res )
>>           return res;
>>   @@ -3060,7 +3062,14 @@ static int __init construct_dom0(struct domain *d)
>>           return rc;
>>         if ( acpi_disabled )
>> +    {
>>           rc = prepare_dtb_hwdom(d, &kinfo);
>> +        if ( rc < 0 )
>> +            return rc;
>> +#ifdef CONFIG_HAS_PCI
>> +        rc = pci_host_bridge_mappings(d);
>> +#endif
>> +    }
>>       else
>>           rc = prepare_acpi(d, &kinfo);
>>   diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>> index 602d00799c8d..4f71b11c3057 100644
>> --- a/xen/arch/arm/pci/ecam.c
>> +++ b/xen/arch/arm/pci/ecam.c
>> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>       return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
>>   }
>>   +bool __init pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
>> +                                            struct pci_host_bridge *bridge,
>> +                                            uint64_t addr)
>> +{
>> +    struct pci_config_window *cfg = bridge->cfg;
>> +
>> +    /*
>> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
>> +     * so we can trap access to it.
>> +     */
>> +    return cfg->phys_addr != addr;
>> +}
>> +
>>   /* ECAM ops */
>>   const struct pci_ecam_ops pci_generic_ecam_ops = {
>>       .bus_shift  = 20,
>> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>           .map_bus                = pci_ecam_map_bus,
>>           .read                   = pci_generic_config_read,
>>           .write                  = pci_generic_config_write,
>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>>       }
>>   };
>>   diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index 0d271a6e8881..6af845ab9d6c 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -22,6 +22,8 @@
>>   #include <xen/sched.h>
>>   #include <xen/vmap.h>
>>   +#include <asm/setup.h>
>> +
>>   /*
>>    * List for all the pci host bridges.
>>    */
>> @@ -316,6 +318,54 @@ unsigned int pci_host_get_num_bridges(void)
>>       return count;
>>   }
>>   +int __init pci_host_bridge_mappings(struct domain *d)
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    struct map_range_data mr_data = {
>> +        .d = d,
>> +        .p2mt = p2m_mmio_direct_dev,
>> +        .skip_mapping = false
>> +    };
>> +
>> +    /*
>> +     * For each PCI host bridge we need to only map those ranges
>> +     * which are used by Domain-0 to properly initialize the bridge,
>> +     * e.g. we do not want to map ECAM configuration space which lives in
>> +     * "reg" device tree property, but we want to map other regions of
>> +     * the host bridge. The PCI aperture defined by the "ranges" device
>> +     * tree property should also be skipped.
>> +     */
>
> This seems to explain the purpose of the function. So I would move the commment on top of pci_host_bridge_mappings().
Makes sense
>
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +    {
>> +        const struct dt_device_node *dev = bridge->dt_node;
>> +        unsigned int i;
>> +
>> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
>> +        {
>> +            uint64_t addr, size;
>> +            int err;
>> +
>> +            err = dt_device_get_address(dev, i, &addr, &size);
>> +            if ( err )
>> +            {
>> +                printk(XENLOG_ERR
>> +                       "Unable to retrieve address range index=%u for %s\n",
>> +                       i, dt_node_full_name(dev));
>> +                return err;
>> +            }
>> +
>> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
>> +            {
>> +                err = map_range_to_domain(dev, addr, size, &mr_data);
>> +                if ( err )
>> +                    return err;
>> +            }
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
>> index 516982bca833..101edb8593c1 100644
>> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
>> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
>> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>>           .map_bus                = pci_ecam_map_bus,
>>           .read                   = pci_generic_config_read,
>>           .write                  = pci_generic_config_write,
>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>>       }
>>   };
>>   diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>> index 969333043431..3d706fdd1d88 100644
>> --- a/xen/include/asm-arm/pci.h
>> +++ b/xen/include/asm-arm/pci.h
>> @@ -17,6 +17,8 @@
>>     #ifdef CONFIG_HAS_PCI
>>   +#include <asm/p2m.h>
>> +
>>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>>     extern bool pci_passthrough_enabled;
>> @@ -73,6 +75,9 @@ struct pci_ops {
>>                   uint32_t reg, uint32_t len, uint32_t *value);
>>       int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>>                    uint32_t reg, uint32_t len, uint32_t value);
>> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
>> +                                   struct pci_host_bridge *bridge,
>> +                                   uint64_t addr);
>>   };
>>     /*
>> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>>                                uint32_t reg, uint32_t len, uint32_t value);
>>   void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>                                  pci_sbdf_t sbdf, uint32_t where);
>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
>> +                                     struct pci_host_bridge *bridge,
>> +                                     uint64_t addr);
>>   struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>>   struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
>>   int pci_get_host_bridge_segment(const struct dt_device_node *node,
>> @@ -115,6 +123,8 @@ int pci_host_iterate_bridges(struct domain *d,
>>                                           struct pci_host_bridge *bridge));
>>   unsigned int pci_host_get_num_bridges(void);
>>   +int pci_host_bridge_mappings(struct domain *d);
>> +
>>   #else   /*!CONFIG_HAS_PCI*/
>>     struct arch_pci_dev { };
>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>> index 95da0b7ab9cd..88d9673db817 100644
>> --- a/xen/include/asm-arm/setup.h
>> +++ b/xen/include/asm-arm/setup.h
>> @@ -2,6 +2,8 @@
>>   #define __ARM_SETUP_H_
>>     #include <public/version.h>
>> +#include <asm/p2m.h>
>> +#include <xen/device_tree.h>
>>     #define MIN_FDT_ALIGN 8
>>   #define MAX_FDT_SIZE SZ_2M
>> @@ -77,6 +79,14 @@ struct bootinfo {
>>   #endif
>>   };
>>   +struct map_range_data
>> +{
>> +    struct domain *d;
>> +    p2m_type_t p2mt;
>> +    /* Set if mapping of the memory ranges must be skipped. */
>> +    bool skip_mapping;
>> +};
>> +
>>   extern struct bootinfo bootinfo;
>>     extern domid_t max_init_domid;
>> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>>   u32 device_tree_get_u32(const void *fdt, int node,
>>                           const char *prop_name, u32 dflt);
>>   +int map_range_to_domain(const struct dt_device_node *dev,
>> +                        u64 addr, u64 len, void *data);
>> +
>>   #endif
>>   /*
>>    * Local variables:
>>
>
> Cheers,
>
Thank you,
Oleksandr

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2021-11-24  7:42 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-05  6:33 [PATCH v6 0/7] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
2021-11-05  6:33 ` [PATCH v6 1/7] xen/arm: rename DEVICE_PCI to DEVICE_PCI_HOSTBRIDGE Oleksandr Andrushchenko
2021-11-16 18:26   ` Julien Grall
2021-11-05  6:33 ` [PATCH v6 2/7] xen/arm: add pci-domain for disabled devices Oleksandr Andrushchenko
2021-11-16 18:48   ` Julien Grall
2021-11-17  6:56     ` Oleksandr Andrushchenko
2021-11-17 21:33       ` Julien Grall
2021-11-18  7:13         ` Oleksandr Andrushchenko
2021-11-22 15:29           ` Julien Grall
2021-11-22 16:23             ` Oleksandr Andrushchenko
2021-11-22 17:17               ` Julien Grall
2021-11-23  6:31                 ` Oleksandr Andrushchenko
2021-11-23 16:05                   ` Julien Grall
2021-11-23 16:44                     ` Oleksandr Andrushchenko
2021-11-23 17:15                       ` Julien Grall
2021-11-24  6:54                         ` Oleksandr Andrushchenko
2021-11-05  6:33 ` [PATCH v6 3/7] xen/arm: setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
2021-11-09  9:20   ` Oleksandr Andrushchenko
2021-11-16 19:12   ` Julien Grall
2021-11-18  7:27     ` Oleksandr Andrushchenko
2021-11-18 10:46       ` Oleksandr Andrushchenko
2021-11-22 17:36         ` Julien Grall
2021-11-23  6:58           ` Oleksandr Andrushchenko
2021-11-23 16:12             ` Julien Grall
2021-11-23 16:41               ` Oleksandr Andrushchenko
2021-11-23 16:58                 ` Julien Grall
2021-11-24  7:22                   ` Oleksandr Andrushchenko
2021-11-05  6:33 ` [PATCH v6 4/7] xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
2021-11-23 16:42   ` Julien Grall
2021-11-24  7:42     ` Oleksandr Andrushchenko
2021-11-05  6:33 ` [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices Oleksandr Andrushchenko
2021-11-16 19:22   ` Julien Grall
2021-11-18  6:59     ` Oleksandr Andrushchenko
2021-11-22 19:31       ` Julien Grall
2021-11-23  7:23         ` Oleksandr Andrushchenko
2021-11-05  6:33 ` [PATCH v6 6/7] xen/arm: process pending vPCI map/unmap operations Oleksandr Andrushchenko
2021-11-05  7:40   ` Jan Beulich
2021-11-17 21:26   ` Julien Grall
2021-11-05  6:33 ` [PATCH v6 7/7] xen/arm: do not use void pointer in pci_host_common_probe Oleksandr Andrushchenko
2021-11-17 11:12   ` Rahul Singh
2021-11-17 21:45   ` Julien Grall
2021-11-18  7:34     ` Oleksandr Andrushchenko
2021-11-22 17:48       ` Julien Grall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.