All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/11] PCI devices passthrough on Arm, part 2
@ 2021-09-23 12:54 Oleksandr Andrushchenko
  2021-09-23 12:54 ` [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
                   ` (10 more replies)
  0 siblings, 11 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Hi, all!

This is an assorted series of patches which aim is to make some further
basis for PCI passthrough on Arm support. The series continues the work
published earlier by Arm [1] and adds new helpers and clears the way for
vPCI changes which will follow.

RFC is at [2], [3]. Design presentation can be found at [4].

Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/list/?series=550933
[2] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01184.html
[3] https://lists.xenproject.org/archives/html/xen-devel/2020-07/threads.html#01184
[4] https://static.sched.com/hosted_files/xen2021/e4/PCI_Device_Passthrough_On_Arm.pdf


Oleksandr Andrushchenko (10):
  xen/arm: Fix dev_is_dt macro definition
  xen/arm: Add new device type for PCI
  xen/arm: Introduce pci_find_host_bridge_node helper
  xen/device-tree: Make dt_find_node_by_phandle global
  xen/arm: Mark device as PCI while creating one
  libxl: Allow removing PCI devices for all types of domains
  libxl: Only map legacy PCI IRQs if they are supported
  xen/arm: Setup MMIO range trap handlers for hardware domain
  xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  xen/arm: Process pending vPCI map/unmap operations

Oleksandr Tyshchenko (1):
  xen/domain: Call pci_release_devices() when releasing domain resources

 tools/libs/light/Makefile          |  4 ++
 tools/libs/light/libxl_pci.c       | 15 ++++-
 xen/arch/arm/domain.c              | 11 +++-
 xen/arch/arm/domain_build.c        | 38 ++++++++----
 xen/arch/arm/pci/ecam.c            | 14 +++++
 xen/arch/arm/pci/pci-host-common.c | 93 ++++++++++++++++++++++++++++++
 xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
 xen/arch/arm/pci/pci.c             | 10 ++++
 xen/arch/arm/traps.c               | 13 +++++
 xen/arch/arm/vpci.c                | 33 +++++++++++
 xen/arch/arm/vpci.h                |  6 ++
 xen/arch/x86/hvm/hvm.c             |  6 ++
 xen/common/device_tree.c           |  2 +-
 xen/common/ioreq.c                 |  9 ---
 xen/drivers/passthrough/pci.c      |  3 +
 xen/include/asm-arm/device.h       |  6 +-
 xen/include/asm-arm/pci.h          | 24 ++++++++
 xen/include/asm-arm/setup.h        | 13 +++++
 xen/include/xen/device_tree.h      |  2 +
 19 files changed, 275 insertions(+), 28 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-24 23:39   ` Stefano Stabellini
  2021-09-28  7:31   ` Michal Orzel
  2021-09-23 12:54 ` [PATCH v2 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

This macro is not currently used, but still has an error in it:
a missing parenthesis. Fix this, so the macro is properly defined.

Fixes: 6c5d3075d97e ("xen/arm: Introduce a generic way to describe device")

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
New in v2
---
 xen/include/asm-arm/device.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 582119c31ee0..64aaa2641b7f 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -29,7 +29,7 @@ typedef struct device device_t;
 
 /* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
 #define dev_is_pci(dev) ((void)(dev), 0)
-#define dev_is_dt(dev)  ((dev->type == DEV_DT)
+#define dev_is_dt(dev)  ((dev)->type == DEV_DT)
 
 enum device_class
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 02/11] xen/arm: Add new device type for PCI
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
  2021-09-23 12:54 ` [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-24 23:45   ` Stefano Stabellini
  2021-09-27  7:41   ` Jan Beulich
  2021-09-23 12:54 ` [PATCH v2 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Add new device type (DEV_PCI) to distinguish PCI devices from platform
DT devices, so some drivers, like IOMMU, can handle PCI devices
differently.

Also add a helper which is when given a struct device returns the
corresponding struct pci_dev which this device is a part of.

Because of the header cross-dependencies, e.g. we need both
struct pci_dev and struct arch_pci_dev at the same time, this cannot be
done with an inline. Macro can be implemented, but looks scary:

 #define dev_to_pci_dev(dev) container_of((container_of((dev), \
                        struct arch_pci_dev, dev), struct pci_dev, arch)

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
Since v1:
 - Folded new device type (DEV_PCI) into this patch.
---
 xen/arch/arm/pci/pci.c       | 10 ++++++++++
 xen/include/asm-arm/device.h |  4 ++--
 xen/include/asm-arm/pci.h    |  7 +++++++
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index bb15edbccc90..e0420d0d86c1 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -27,6 +27,16 @@ int arch_pci_clean_pirqs(struct domain *d)
     return 0;
 }
 
+struct pci_dev *dev_to_pci(struct device *dev)
+{
+    struct arch_pci_dev *arch_dev;
+
+    ASSERT(dev->type == DEV_PCI);
+
+    arch_dev = container_of((dev), struct arch_pci_dev, dev);
+    return container_of(arch_dev, struct pci_dev, arch);
+}
+
 static int __init dt_pci_init(void)
 {
     struct dt_device_node *np;
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 64aaa2641b7f..12de217b36b9 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -4,6 +4,7 @@
 enum device_type
 {
     DEV_DT,
+    DEV_PCI,
 };
 
 struct dev_archdata {
@@ -27,8 +28,7 @@ typedef struct device device_t;
 
 #include <xen/device_tree.h>
 
-/* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
-#define dev_is_pci(dev) ((void)(dev), 0)
+#define dev_is_pci(dev) ((dev)->type == DEV_PCI)
 #define dev_is_dt(dev)  ((dev)->type == DEV_DT)
 
 enum device_class
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index d2728a098a11..9e366ae67e83 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -27,6 +27,13 @@ struct arch_pci_dev {
     struct device dev;
 };
 
+/*
+ * Because of the header cross-dependencies, e.g. we need both
+ * struct pci_dev and struct arch_pci_dev at the same time, this cannot be
+ * done with an inline here. Macro can be implemented, but looks scary.
+ */
+struct pci_dev *dev_to_pci(struct device *dev);
+
 /* Arch-specific MSI data for vPCI. */
 struct vpci_arch_msi {
 };
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 03/11] xen/arm: Introduce pci_find_host_bridge_node helper
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
  2021-09-23 12:54 ` [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
  2021-09-23 12:54 ` [PATCH v2 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-24 23:48   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Get host bridge node given a PCI device attached to it.

This helper will be re-used for adding PCI devices by the subsequent
patches.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
 xen/arch/arm/pci/pci-host-common.c | 17 +++++++++++++++++
 xen/include/asm-arm/pci.h          |  1 +
 2 files changed, 18 insertions(+)

diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index a88f20175ea9..1567b6e2956c 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -283,6 +283,23 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
     return -EINVAL;
 }
 
+/*
+ * Get host bridge node given a device attached to it.
+ */
+struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
+{
+    struct pci_host_bridge *bridge;
+    struct pci_dev *pdev = dev_to_pci(dev);
+
+    bridge = pci_find_host_bridge(pdev->seg, pdev->bus);
+    if ( unlikely(!bridge) )
+    {
+        printk(XENLOG_ERR "Unable to find PCI bridge for "PRI_pci"\n",
+               pdev->seg, pdev->bus, pdev->sbdf.dev, pdev->sbdf.fn);
+        return NULL;
+    }
+    return bridge->dt_node;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 9e366ae67e83..5b100556225e 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -103,6 +103,7 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 int pci_get_host_bridge_segment(const struct dt_device_node *node,
                                 uint16_t *segment);
+struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
 
 static always_inline bool is_pci_passthrough_enabled(void)
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 04/11] xen/device-tree: Make dt_find_node_by_phandle global
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (2 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-24 23:51   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Make dt_find_node_by_phandle globally visible, so it can be re-used by
other frameworks.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
 xen/common/device_tree.c      | 2 +-
 xen/include/xen/device_tree.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index ea93da1725f6..4aae281e89bf 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1047,7 +1047,7 @@ int dt_for_each_range(const struct dt_device_node *dev,
  *
  * Returns a node pointer.
  */
-static struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
+struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
 {
     struct dt_device_node *np;
 
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 9069040ef7f7..3334048d3bb5 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -850,6 +850,8 @@ int dt_count_phandle_with_args(const struct dt_device_node *np,
  */
 int dt_get_pci_domain_nr(struct dt_device_node *node);
 
+struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle);
+
 #ifdef CONFIG_DEVICE_TREE_DEBUG
 #define dt_dprintk(fmt, args...)  \
     printk(XENLOG_DEBUG fmt, ## args)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (3 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-24 23:54   ` Stefano Stabellini
  2021-09-27  7:45   ` Jan Beulich
  2021-09-23 12:54 ` [PATCH v2 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

While adding a PCI device mark it as such, so other frameworks
can distinguish it form DT devices.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
Since v1:
 - Moved the assignment from iommu_add_device to alloc_pdev
---
 xen/drivers/passthrough/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 633e89ac1311..fc3469bc12dc 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
     *((u8*) &pdev->bus) = bus;
     *((u8*) &pdev->devfn) = devfn;
     pdev->domain = NULL;
+#ifdef CONFIG_ARM
+    pci_to_dev(pdev)->type = DEV_PCI;
+#endif
 
     rc = pdev_msi_init(pdev);
     if ( rc )
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 06/11] xen/domain: Call pci_release_devices() when releasing domain resources
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (4 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-24 23:57   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is the very same that we already do for DT devices. Moreover, x86
already calls pci_release_devices().

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

---
Since v1:
 - re-wording in the commit message
---
 xen/arch/arm/domain.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index f7ed130023d5..854e8fed0393 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -985,7 +985,8 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list)
  * function which may return -ERESTART.
  */
 enum {
-    PROG_tee = 1,
+    PROG_pci = 1,
+    PROG_tee,
     PROG_xen,
     PROG_page,
     PROG_mapping,
@@ -1022,6 +1023,12 @@ int domain_relinquish_resources(struct domain *d)
 #ifdef CONFIG_IOREQ_SERVER
         ioreq_server_destroy_all(d);
 #endif
+#ifdef CONFIG_HAS_PCI
+    PROGRESS(pci):
+        ret = pci_release_devices(d);
+        if ( ret )
+            return ret;
+#endif
 
     PROGRESS(tee):
         ret = tee_relinquish_resources(d);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (5 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-25  0:00   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, Ian Jackson, Juergen Gross

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

The PCI device remove path may now be used by PVH on ARM, so the
assert is no longer valid.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Juergen Gross <jgross@suse.com>
---
 tools/libs/light/libxl_pci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 1a1c2630803b..59f3686fc85e 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1947,8 +1947,6 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
             goto out_fail;
         }
     } else {
-        assert(type == LIBXL_DOMAIN_TYPE_PV);
-
         char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
                                      pci->bus, pci->dev, pci->func);
         FILE *f = fopen(sysfs_path, "r");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (6 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-25  0:06   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, Ian Jackson, Juergen Gross

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Arm's PCI passthrough implementation doesn't support legacy interrupts,
but MSI/MSI-X. This can be the case for other platforms too.
For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
it to the CFLAGS and compile the relevant code in the toolstack only if
applicable.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Juergen Gross <jgross@suse.com>

---
Since v1:
 - Minimized #idefery by introducing pci_supp_legacy_irq function
   for relevant checks
---
 tools/libs/light/Makefile    |  4 ++++
 tools/libs/light/libxl_pci.c | 13 +++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 7d8c51d49242..bd3f6be2a183 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
 	-Wno-declaration-after-statement -Wformat-nonliteral
 CFLAGS += -I.
 
+ifeq ($(CONFIG_X86),y)
+CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
+endif
+
 SRCS-$(CONFIG_X86) += libxl_cpuid.c
 SRCS-$(CONFIG_X86) += libxl_x86.c
 SRCS-$(CONFIG_X86) += libxl_psr.c
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 59f3686fc85e..4c2d7aeefbb2 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1364,6 +1364,15 @@ static void pci_add_timeout(libxl__egc *egc, libxl__ev_time *ev,
     pci_add_dm_done(egc, pas, rc);
 }
 
+static bool pci_supp_legacy_irq(void)
+{
+#ifdef CONFIG_PCI_SUPP_LEGACY_IRQ
+    return true;
+#else
+    return false;
+#endif
+}
+
 static void pci_add_dm_done(libxl__egc *egc,
                             pci_add_state *pas,
                             int rc)
@@ -1434,6 +1443,8 @@ static void pci_add_dm_done(libxl__egc *egc,
         }
     }
     fclose(f);
+    if (!pci_supp_legacy_irq())
+        goto out_no_irq;
     sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
                                 pci->bus, pci->dev, pci->func);
     f = fopen(sysfs_path, "r");
@@ -1983,6 +1994,8 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
         }
         fclose(f);
 skip1:
+        if (!pci_supp_legacy_irq())
+            goto skip_irq;
         sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
                                pci->bus, pci->dev, pci->func);
         f = fopen(sysfs_path, "r");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (7 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-25  0:18   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
  2021-09-23 12:54 ` [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

In order for vPCI to work it needs to maintain guest and hardware
domain's views on the configuration space. For example, BARs and
COMMAND registers require emulation for guests and the guest view
on the registers needs to be in sync with the real contents of the
relevant registers. For that ECAM address space needs to also be
trapped for the hardware domain, so we need to implement PCI host
bridge specific callbacks to properly setup MMIO handlers for those
ranges depending on particular host bridge implementation.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
Since v1:
 - Dynamically calculate the number of MMIO handlers required for vPCI
   and update the total number accordingly
 - s/clb/cb
 - Do not introduce a new callback for MMIO handler setup
---
 xen/arch/arm/domain.c              |  2 ++
 xen/arch/arm/pci/pci-host-common.c | 28 +++++++++++++++++++++++++
 xen/arch/arm/vpci.c                | 33 ++++++++++++++++++++++++++++++
 xen/arch/arm/vpci.h                |  6 ++++++
 xen/include/asm-arm/pci.h          |  7 +++++++
 5 files changed, 76 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 854e8fed0393..c7b25bc70439 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -733,6 +733,8 @@ int arch_domain_create(struct domain *d,
     if ( (rc = domain_vgic_register(d, &count)) != 0 )
         goto fail;
 
+    count += domain_vpci_get_num_mmio_handlers(d);
+
     if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
         goto fail;
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 1567b6e2956c..155f2a2743af 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -300,6 +300,34 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
     }
     return bridge->dt_node;
 }
+
+int pci_host_iterate_bridges(struct domain *d,
+                             int (*cb)(struct domain *d,
+                                       struct pci_host_bridge *bridge))
+{
+    struct pci_host_bridge *bridge;
+    int err;
+
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+    {
+        err = cb(d, bridge);
+        if ( err )
+            return err;
+    }
+    return 0;
+}
+
+int pci_host_get_num_bridges(void)
+{
+    struct pci_host_bridge *bridge;
+    int count = 0;
+
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+        count++;
+
+    return count;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index 76c12b92814f..14947e975d69 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -80,17 +80,50 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
     .write = vpci_mmio_write,
 };
 
+static int vpci_setup_mmio_handler(struct domain *d,
+                                   struct pci_host_bridge *bridge)
+{
+    struct pci_config_window *cfg = bridge->cfg;
+
+    register_mmio_handler(d, &vpci_mmio_handler,
+                          cfg->phys_addr, cfg->size, NULL);
+    return 0;
+}
+
 int domain_vpci_init(struct domain *d)
 {
     if ( !has_vpci(d) )
         return 0;
 
+    if ( is_hardware_domain(d) )
+        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
+
+    /* Guest domains use what is programmed in their device tree. */
     register_mmio_handler(d, &vpci_mmio_handler,
                           GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
 
     return 0;
 }
 
+int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    int count = 0;
+
+    if ( is_hardware_domain(d) )
+        /* For each PCI host bridge's configuration space. */
+        count += pci_host_get_num_bridges();
+    else
+        /*
+         * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
+         * being passed through. Maximum number of supported devices
+         * is 32 as virtual bus topology emulates the devices as embedded
+         * endpoints.
+         * +1 for a single emulated host bridge's configuration space. */
+        count = VPCI_MSIX_MEM_NUM * 32 + 1;
+
+    return count;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
index d8a7b0e3e802..27a2b069abd2 100644
--- a/xen/arch/arm/vpci.h
+++ b/xen/arch/arm/vpci.h
@@ -17,11 +17,17 @@
 
 #ifdef CONFIG_HAS_VPCI
 int domain_vpci_init(struct domain *d);
+int domain_vpci_get_num_mmio_handlers(struct domain *d);
 #else
 static inline int domain_vpci_init(struct domain *d)
 {
     return 0;
 }
+
+static inline int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    return 0;
+}
 #endif
 
 #endif /* __ARCH_ARM_VPCI_H__ */
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 5b100556225e..7618f0b6725b 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -15,6 +15,8 @@
 #ifndef __ARM_PCI_H__
 #define __ARM_PCI_H__
 
+#include <asm/mmio.h>
+
 #ifdef CONFIG_HAS_PCI
 
 #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
@@ -109,6 +111,11 @@ static always_inline bool is_pci_passthrough_enabled(void)
 {
     return !!pci_passthrough_enabled;
 }
+
+int pci_host_iterate_bridges(struct domain *d,
+                             int (*clb)(struct domain *d,
+                                        struct pci_host_bridge *bridge));
+int pci_host_get_num_bridges(void);
 #else   /*!CONFIG_HAS_PCI*/
 
 #define pci_passthrough_enabled (false)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (8 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-25  0:44   ` Stefano Stabellini
  2021-09-23 12:54 ` [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
  10 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

PCI host bridges are special devices in terms of implementing PCI
passthrough. According to [1] the current implementation depends on
Domain-0 to perform the initialization of the relevant PCI host
bridge hardware and perform PCI device enumeration. In order to
achieve that one of the required changes is to not map all the memory
ranges in map_range_to_domain as we traverse the device tree on startup
and perform some additional checks if the range needs to be mapped to
Domain-0.

The generic PCI host controller device tree binding says [2]:
- ranges: As described in IEEE Std 1275-1994, but must provide
          at least a definition of non-prefetchable memory. One
          or both of prefetchable Memory and IO Space may also
          be provided.

- reg   : The Configuration Space base address and size, as accessed
          from the parent bus.  The base address corresponds to
          the first bus in the "bus-range" property.  If no
          "bus-range" is specified, this will be bus 0 (the default).

From the above none of the memory ranges from the "ranges" property
needs to be mapped to Domain-0 at startup as MMIO mapping is going to
be handled dynamically by vPCI as we assign PCI devices, e.g. each
device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
as needed by Xen.

The "reg" property covers not only ECAM space, but may also have other
then the configuration memory ranges described, for example [3]:
- reg: Should contain rc_dbi, config registers location and length.
- reg-names: Must include the following entries:
   "rc_dbi": controller configuration registers;
   "config": PCIe configuration space registers.

This patch makes it possible to not map all the ranges from the
"ranges" property and also ECAM from the "reg". All the rest from the
"reg" property still needs to be mapped to Domain-0, so the PCI
host bridge remains functional in Domain-0.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
[2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
Since v1:
 - Added better description of why and what needs to be mapped into
   Domain-0's p2m and what doesn't
 - Do not do any mappings for PCI devices while traversing the DT
 - Walk all the bridges and make required mappings in one go
---
 xen/arch/arm/domain_build.c        | 38 +++++++++++++++--------
 xen/arch/arm/pci/ecam.c            | 14 +++++++++
 xen/arch/arm/pci/pci-host-common.c | 48 ++++++++++++++++++++++++++++++
 xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
 xen/include/asm-arm/pci.h          |  9 ++++++
 xen/include/asm-arm/setup.h        | 13 ++++++++
 6 files changed, 111 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 83ab0d52cce9..e72c1b881cae 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -10,7 +10,6 @@
 #include <asm/regs.h>
 #include <xen/errno.h>
 #include <xen/err.h>
-#include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/guest_access.h>
 #include <xen/iocap.h>
@@ -47,12 +46,6 @@ static int __init parse_dom0_mem(const char *s)
 }
 custom_param("dom0_mem", parse_dom0_mem);
 
-struct map_range_data
-{
-    struct domain *d;
-    p2m_type_t p2mt;
-};
-
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef virt_to_mfn
 #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
@@ -1388,9 +1381,8 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
     return 0;
 }
 
-static int __init map_range_to_domain(const struct dt_device_node *dev,
-                                      u64 addr, u64 len,
-                                      void *data)
+int __init map_range_to_domain(const struct dt_device_node *dev,
+                               u64 addr, u64 len, void *data)
 {
     struct map_range_data *mr_data = data;
     struct domain *d = mr_data->d;
@@ -1417,6 +1409,13 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
         }
     }
 
+#ifdef CONFIG_HAS_PCI
+    if ( is_pci_passthrough_enabled() &&
+         (device_get_class(dev) == DEVICE_PCI) &&
+         !mr_data->map_pci_bridge )
+        need_mapping = false;
+#endif
+
     if ( need_mapping )
     {
         res = map_regions_p2mt(d,
@@ -1450,7 +1449,11 @@ static int __init map_device_children(struct domain *d,
                                       const struct dt_device_node *dev,
                                       p2m_type_t p2mt)
 {
-    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .map_pci_bridge = false
+    };
     int ret;
 
     if ( dt_device_type_is_equal(dev, "pci") )
@@ -1582,7 +1585,11 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     /* Give permission and map MMIOs */
     for ( i = 0; i < naddr; i++ )
     {
-        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
+        struct map_range_data mr_data = {
+            .d = d,
+            .p2mt = p2mt,
+            .map_pci_bridge = false
+        };
         res = dt_device_get_address(dev, i, &addr, &size);
         if ( res )
         {
@@ -2754,7 +2761,14 @@ static int __init construct_dom0(struct domain *d)
         return rc;
 
     if ( acpi_disabled )
+    {
         rc = prepare_dtb_hwdom(d, &kinfo);
+#ifdef CONFIG_HAS_PCI
+        if ( rc < 0 )
+            return rc;
+        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
+#endif
+    }
     else
         rc = prepare_acpi(d, &kinfo);
 
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index 9b88b1cedaa2..eae177f2cbc2 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -39,6 +39,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
     return base + (PCI_DEVFN2(sbdf) << devfn_shift) + where;
 }
 
+bool pci_ecam_need_p2m_mapping(struct domain *d,
+                               struct pci_host_bridge *bridge,
+                               uint64_t addr)
+{
+    struct pci_config_window *cfg = bridge->cfg;
+
+    /*
+     * We do not want ECAM address space to be mapped in Domain-0's p2m,
+     * so we can trap access to it.
+     */
+    return cfg->phys_addr != addr;
+}
+
 /* ECAM ops */
 const struct pci_ecam_ops pci_generic_ecam_ops = {
     .bus_shift  = 20,
@@ -46,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
+        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
     }
 };
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 155f2a2743af..f350826ea26b 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -18,6 +18,7 @@
 
 #include <xen/init.h>
 #include <xen/pci.h>
+#include <asm/setup.h>
 #include <xen/rwlock.h>
 #include <xen/sched.h>
 #include <xen/vmap.h>
@@ -328,6 +329,53 @@ int pci_host_get_num_bridges(void)
     return count;
 }
 
+int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
+{
+    struct pci_host_bridge *bridge;
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .map_pci_bridge = true
+    };
+
+    /*
+     * For each PCI host bridge we need to only map those ranges
+     * which are used by Domain-0 to properly initialize the bridge,
+     * e.g. we do not want to map ECAM configuration space which lives in
+     * "reg" or "assigned-addresses" device tree property.
+     * Neither we want to map any of the MMIO ranges found in the "ranges"
+     * device tree property.
+     */
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+    {
+        const struct dt_device_node *dev = bridge->dt_node;
+        int i;
+
+        for ( i = 0; i < dt_number_of_address(dev); i++ )
+        {
+            uint64_t addr, size;
+            int err;
+
+            err = dt_device_get_address(dev, i, &addr, &size);
+            if ( err )
+            {
+                printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
+                       i, dt_node_full_name(dev));
+                return err;
+            }
+
+            if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )
+            {
+                err = map_range_to_domain(dev, addr, size, &mr_data);
+                if ( err )
+                    return err;
+            }
+        }
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
index c27b4ea9f02f..adbe3627871f 100644
--- a/xen/arch/arm/pci/pci-host-zynqmp.c
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -33,6 +33,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
+        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
     }
 };
 
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 7618f0b6725b..b81f66e813ef 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -19,6 +19,8 @@
 
 #ifdef CONFIG_HAS_PCI
 
+#include <asm/p2m.h>
+
 #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
 #define PRI_pci "%04x:%02x:%02x.%u"
 
@@ -79,6 +81,9 @@ struct pci_ops {
                 uint32_t reg, uint32_t len, uint32_t *value);
     int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
                  uint32_t reg, uint32_t len, uint32_t value);
+    bool (*need_p2m_mapping)(struct domain *d,
+                             struct pci_host_bridge *bridge,
+                             uint64_t addr);
 };
 
 /*
@@ -102,6 +107,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, uint32_t sbdf,
                             uint32_t reg, uint32_t len, uint32_t value);
 void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
                                uint32_t sbdf, uint32_t where);
+bool pci_ecam_need_p2m_mapping(struct domain *d,
+                               struct pci_host_bridge *bridge,
+                               uint64_t addr);
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 int pci_get_host_bridge_segment(const struct dt_device_node *node,
                                 uint16_t *segment);
@@ -116,6 +124,7 @@ int pci_host_iterate_bridges(struct domain *d,
                              int (*clb)(struct domain *d,
                                         struct pci_host_bridge *bridge));
 int pci_host_get_num_bridges(void);
+int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
 #else   /*!CONFIG_HAS_PCI*/
 
 #define pci_passthrough_enabled (false)
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index 95da0b7ab9cd..21863dd2bc58 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -2,6 +2,8 @@
 #define __ARM_SETUP_H_
 
 #include <public/version.h>
+#include <asm/p2m.h>
+#include <xen/device_tree.h>
 
 #define MIN_FDT_ALIGN 8
 #define MAX_FDT_SIZE SZ_2M
@@ -77,6 +79,14 @@ struct bootinfo {
 #endif
 };
 
+struct map_range_data
+{
+    struct domain *d;
+    p2m_type_t p2mt;
+    /* Set if mappings for PCI host bridges must not be skipped. */
+    bool map_pci_bridge;
+};
+
 extern struct bootinfo bootinfo;
 
 extern domid_t max_init_domid;
@@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
 u32 device_tree_get_u32(const void *fdt, int node,
                         const char *prop_name, u32 dflt);
 
+int map_range_to_domain(const struct dt_device_node *dev,
+                        u64 addr, u64 len, void *data);
+
 #endif
 /*
  * Local variables:
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations
  2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (9 preceding siblings ...)
  2021-09-23 12:54 ` [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-09-23 12:54 ` Oleksandr Andrushchenko
  2021-09-25  1:20   ` Stefano Stabellini
  2021-09-27  8:06   ` Jan Beulich
  10 siblings, 2 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-23 12:54 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

vPCI may map and unmap PCI device memory (BARs) being passed through which
may take a lot of time. For this those operations may be deferred to be
performed later, so that they can be safely preempted.
Run the corresponding vPCI code while switching a vCPU.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

---
Since v1:
 - Moved the check for pending vpci work from the common IOREQ code
   to hvm_do_resume on x86
 - Re-worked the code for Arm to ensure we don't miss pending vPCI work
---
 xen/arch/arm/traps.c   | 13 +++++++++++++
 xen/arch/x86/hvm/hvm.c |  6 ++++++
 xen/common/ioreq.c     |  9 ---------
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 219ab3c3fbde..b246f51086e3 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -34,6 +34,7 @@
 #include <xen/symbols.h>
 #include <xen/version.h>
 #include <xen/virtual_region.h>
+#include <xen/vpci.h>
 
 #include <public/sched.h>
 #include <public/xen.h>
@@ -2304,6 +2305,18 @@ static bool check_for_vcpu_work(void)
     }
 #endif
 
+    if ( has_vpci(v->domain) )
+    {
+        bool pending;
+
+        local_irq_enable();
+        pending = vpci_process_pending(v);
+        local_irq_disable();
+
+        if ( pending )
+            return true;
+    }
+
     if ( likely(!v->arch.need_flush_to_ram) )
         return false;
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7b48a1b925bb..d32f5d572941 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -549,6 +549,12 @@ void hvm_do_resume(struct vcpu *v)
     if ( !vcpu_ioreq_handle_completion(v) )
         return;
 
+    if ( has_vpci(v->domain) && vpci_process_pending(v) )
+    {
+        raise_softirq(SCHEDULE_SOFTIRQ);
+        return;
+    }
+
     if ( unlikely(v->arch.vm_event) )
         hvm_vm_event_do_resume(v);
 
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index d732dc045df9..689d256544c8 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -25,9 +25,7 @@
 #include <xen/lib.h>
 #include <xen/paging.h>
 #include <xen/sched.h>
-#include <xen/softirq.h>
 #include <xen/trace.h>
-#include <xen/vpci.h>
 
 #include <asm/guest_atomics.h>
 #include <asm/ioreq.h>
@@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
 
 bool vcpu_ioreq_handle_completion(struct vcpu *v)
 {
-    struct domain *d = v->domain;
     struct vcpu_io *vio = &v->io;
     struct ioreq_server *s;
     struct ioreq_vcpu *sv;
     enum vio_completion completion;
     bool res = true;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-    {
-        raise_softirq(SCHEDULE_SOFTIRQ);
-        return false;
-    }
-
     while ( (sv = get_pending_vcpu(v, &s)) != NULL )
         if ( !wait_for_io(sv, get_ioreq(s, v)) )
             return false;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition
  2021-09-23 12:54 ` [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
@ 2021-09-24 23:39   ` Stefano Stabellini
  2021-09-28  7:31   ` Michal Orzel
  1 sibling, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-24 23:39 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> This macro is not currently used, but still has an error in it:
> a missing parenthesis. Fix this, so the macro is properly defined.
> 
> Fixes: 6c5d3075d97e ("xen/arm: Introduce a generic way to describe device")
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> New in v2
> ---
>  xen/include/asm-arm/device.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 582119c31ee0..64aaa2641b7f 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -29,7 +29,7 @@ typedef struct device device_t;
>  
>  /* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
>  #define dev_is_pci(dev) ((void)(dev), 0)
> -#define dev_is_dt(dev)  ((dev->type == DEV_DT)
> +#define dev_is_dt(dev)  ((dev)->type == DEV_DT)
>  
>  enum device_class
>  {
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 02/11] xen/arm: Add new device type for PCI
  2021-09-23 12:54 ` [PATCH v2 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
@ 2021-09-24 23:45   ` Stefano Stabellini
  2021-09-27  7:41   ` Jan Beulich
  1 sibling, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-24 23:45 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Add new device type (DEV_PCI) to distinguish PCI devices from platform
> DT devices, so some drivers, like IOMMU, can handle PCI devices
> differently.
> 
> Also add a helper which is when given a struct device returns the
> corresponding struct pci_dev which this device is a part of.
> 
> Because of the header cross-dependencies, e.g. we need both
> struct pci_dev and struct arch_pci_dev at the same time, this cannot be
> done with an inline. Macro can be implemented, but looks scary:
> 
>  #define dev_to_pci_dev(dev) container_of((container_of((dev), \
>                         struct arch_pci_dev, dev), struct pci_dev, arch)
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Since v1:
>  - Folded new device type (DEV_PCI) into this patch.
> ---
>  xen/arch/arm/pci/pci.c       | 10 ++++++++++
>  xen/include/asm-arm/device.h |  4 ++--
>  xen/include/asm-arm/pci.h    |  7 +++++++
>  3 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
> index bb15edbccc90..e0420d0d86c1 100644
> --- a/xen/arch/arm/pci/pci.c
> +++ b/xen/arch/arm/pci/pci.c
> @@ -27,6 +27,16 @@ int arch_pci_clean_pirqs(struct domain *d)
>      return 0;
>  }
>  
> +struct pci_dev *dev_to_pci(struct device *dev)
> +{
> +    struct arch_pci_dev *arch_dev;
> +
> +    ASSERT(dev->type == DEV_PCI);
> +
> +    arch_dev = container_of((dev), struct arch_pci_dev, dev);
> +    return container_of(arch_dev, struct pci_dev, arch);
> +}
> +
>  static int __init dt_pci_init(void)
>  {
>      struct dt_device_node *np;
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 64aaa2641b7f..12de217b36b9 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -4,6 +4,7 @@
>  enum device_type
>  {
>      DEV_DT,
> +    DEV_PCI,
>  };
>  
>  struct dev_archdata {
> @@ -27,8 +28,7 @@ typedef struct device device_t;
>  
>  #include <xen/device_tree.h>
>  
> -/* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
> -#define dev_is_pci(dev) ((void)(dev), 0)
> +#define dev_is_pci(dev) ((dev)->type == DEV_PCI)
>  #define dev_is_dt(dev)  ((dev)->type == DEV_DT)
>  
>  enum device_class
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index d2728a098a11..9e366ae67e83 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -27,6 +27,13 @@ struct arch_pci_dev {
>      struct device dev;
>  };
>  
> +/*
> + * Because of the header cross-dependencies, e.g. we need both
> + * struct pci_dev and struct arch_pci_dev at the same time, this cannot be
> + * done with an inline here. Macro can be implemented, but looks scary.
> + */
> +struct pci_dev *dev_to_pci(struct device *dev);
> +
>  /* Arch-specific MSI data for vPCI. */
>  struct vpci_arch_msi {
>  };
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 03/11] xen/arm: Introduce pci_find_host_bridge_node helper
  2021-09-23 12:54 ` [PATCH v2 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
@ 2021-09-24 23:48   ` Stefano Stabellini
  0 siblings, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-24 23:48 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Get host bridge node given a PCI device attached to it.
> 
> This helper will be re-used for adding PCI devices by the subsequent
> patches.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  xen/arch/arm/pci/pci-host-common.c | 17 +++++++++++++++++
>  xen/include/asm-arm/pci.h          |  1 +
>  2 files changed, 18 insertions(+)
> 
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index a88f20175ea9..1567b6e2956c 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -283,6 +283,23 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>      return -EINVAL;
>  }
>  
> +/*
> + * Get host bridge node given a device attached to it.
> + */
> +struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
> +{
> +    struct pci_host_bridge *bridge;
> +    struct pci_dev *pdev = dev_to_pci(dev);
> +
> +    bridge = pci_find_host_bridge(pdev->seg, pdev->bus);
> +    if ( unlikely(!bridge) )
> +    {
> +        printk(XENLOG_ERR "Unable to find PCI bridge for "PRI_pci"\n",
> +               pdev->seg, pdev->bus, pdev->sbdf.dev, pdev->sbdf.fn);
> +        return NULL;
> +    }
> +    return bridge->dt_node;
> +}
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 9e366ae67e83..5b100556225e 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -103,6 +103,7 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>  struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>  int pci_get_host_bridge_segment(const struct dt_device_node *node,
>                                  uint16_t *segment);
> +struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
>  
>  static always_inline bool is_pci_passthrough_enabled(void)
>  {
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 04/11] xen/device-tree: Make dt_find_node_by_phandle global
  2021-09-23 12:54 ` [PATCH v2 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
@ 2021-09-24 23:51   ` Stefano Stabellini
  0 siblings, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-24 23:51 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Make dt_find_node_by_phandle globally visible, so it can be re-used by
> other frameworks.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  xen/common/device_tree.c      | 2 +-
>  xen/include/xen/device_tree.h | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index ea93da1725f6..4aae281e89bf 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -1047,7 +1047,7 @@ int dt_for_each_range(const struct dt_device_node *dev,
>   *
>   * Returns a node pointer.
>   */
> -static struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
> +struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
>  {
>      struct dt_device_node *np;
>  
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 9069040ef7f7..3334048d3bb5 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -850,6 +850,8 @@ int dt_count_phandle_with_args(const struct dt_device_node *np,
>   */
>  int dt_get_pci_domain_nr(struct dt_device_node *node);
>  
> +struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle);
> +
>  #ifdef CONFIG_DEVICE_TREE_DEBUG
>  #define dt_dprintk(fmt, args...)  \
>      printk(XENLOG_DEBUG fmt, ## args)
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-23 12:54 ` [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
@ 2021-09-24 23:54   ` Stefano Stabellini
  2021-09-27  7:13     ` Oleksandr Andrushchenko
  2021-09-27  7:45   ` Jan Beulich
  1 sibling, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-24 23:54 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> While adding a PCI device mark it as such, so other frameworks
> can distinguish it form DT devices.
                     ^ from

> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Since v1:
>  - Moved the assignment from iommu_add_device to alloc_pdev
> ---
>  xen/drivers/passthrough/pci.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 633e89ac1311..fc3469bc12dc 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>      *((u8*) &pdev->bus) = bus;
>      *((u8*) &pdev->devfn) = devfn;
>      pdev->domain = NULL;
> +#ifdef CONFIG_ARM
> +    pci_to_dev(pdev)->type = DEV_PCI;
> +#endif
>  
>      rc = pdev_msi_init(pdev);
>      if ( rc )
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 06/11] xen/domain: Call pci_release_devices() when releasing domain resources
  2021-09-23 12:54 ` [PATCH v2 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
@ 2021-09-24 23:57   ` Stefano Stabellini
  0 siblings, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-24 23:57 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This is the very same that we already do for DT devices. Moreover, x86
> already calls pci_release_devices().
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Since v1:
>  - re-wording in the commit message
> ---
>  xen/arch/arm/domain.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index f7ed130023d5..854e8fed0393 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -985,7 +985,8 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list)
>   * function which may return -ERESTART.
>   */
>  enum {
> -    PROG_tee = 1,
> +    PROG_pci = 1,
> +    PROG_tee,
>      PROG_xen,
>      PROG_page,
>      PROG_mapping,
> @@ -1022,6 +1023,12 @@ int domain_relinquish_resources(struct domain *d)
>  #ifdef CONFIG_IOREQ_SERVER
>          ioreq_server_destroy_all(d);
>  #endif
> +#ifdef CONFIG_HAS_PCI
> +    PROGRESS(pci):
> +        ret = pci_release_devices(d);
> +        if ( ret )
> +            return ret;
> +#endif
>  
>      PROGRESS(tee):
>          ret = tee_relinquish_resources(d);
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains
  2021-09-23 12:54 ` [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
@ 2021-09-25  0:00   ` Stefano Stabellini
  2021-09-27  7:17     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-25  0:00 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko, Ian Jackson, Juergen Gross,
	jbeulich, roger.pau, andrew.cooper3

+x86 maintainers

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> The PCI device remove path may now be used by PVH on ARM, so the
> assert is no longer valid.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Juergen Gross <jgross@suse.com>
>
> ---
>  tools/libs/light/libxl_pci.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 1a1c2630803b..59f3686fc85e 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1947,8 +1947,6 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>              goto out_fail;
>          }
>      } else {
> -        assert(type == LIBXL_DOMAIN_TYPE_PV);

This is fine for ARM, but is it OK from an x86 point of view considering
the PVH implications?


>          char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
>                                       pci->bus, pci->dev, pci->func);
>          FILE *f = fopen(sysfs_path, "r");
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported
  2021-09-23 12:54 ` [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
@ 2021-09-25  0:06   ` Stefano Stabellini
  2021-09-27 13:02     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-25  0:06 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko, Ian Jackson, Juergen Gross,
	jbeulich, roger.pau, andrew.cooper3

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Arm's PCI passthrough implementation doesn't support legacy interrupts,
> but MSI/MSI-X. This can be the case for other platforms too.
> For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
> it to the CFLAGS and compile the relevant code in the toolstack only if
> applicable.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Juergen Gross <jgross@suse.com>
>
> ---
> Since v1:
>  - Minimized #idefery by introducing pci_supp_legacy_irq function
>    for relevant checks
> ---
>  tools/libs/light/Makefile    |  4 ++++
>  tools/libs/light/libxl_pci.c | 13 +++++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
> index 7d8c51d49242..bd3f6be2a183 100644
> --- a/tools/libs/light/Makefile
> +++ b/tools/libs/light/Makefile
> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
>  	-Wno-declaration-after-statement -Wformat-nonliteral
>  CFLAGS += -I.
>  
> +ifeq ($(CONFIG_X86),y)
> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
> +endif

This patch is a lot better than the previous version, thanks!

I think the usage of pci_supp_legacy_irq below is good and we can't do
better than that.

I wonder if there is a better way than the above to export
CONFIG_PCI_SUPP_LEGACY_IRQ. Suggestions?


>  SRCS-$(CONFIG_X86) += libxl_cpuid.c
>  SRCS-$(CONFIG_X86) += libxl_x86.c
>  SRCS-$(CONFIG_X86) += libxl_psr.c
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 59f3686fc85e..4c2d7aeefbb2 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1364,6 +1364,15 @@ static void pci_add_timeout(libxl__egc *egc, libxl__ev_time *ev,
>      pci_add_dm_done(egc, pas, rc);
>  }
>  
> +static bool pci_supp_legacy_irq(void)
> +{
> +#ifdef CONFIG_PCI_SUPP_LEGACY_IRQ
> +    return true;
> +#else
> +    return false;
> +#endif
> +}
> +
>  static void pci_add_dm_done(libxl__egc *egc,
>                              pci_add_state *pas,
>                              int rc)
> @@ -1434,6 +1443,8 @@ static void pci_add_dm_done(libxl__egc *egc,
>          }
>      }
>      fclose(f);
> +    if (!pci_supp_legacy_irq())
> +        goto out_no_irq;
>      sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>                                  pci->bus, pci->dev, pci->func);
>      f = fopen(sysfs_path, "r");
> @@ -1983,6 +1994,8 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>          }
>          fclose(f);
>  skip1:
> +        if (!pci_supp_legacy_irq())
> +            goto skip_irq;
>          sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>                                 pci->bus, pci->dev, pci->func);
>          f = fopen(sysfs_path, "r");
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
  2021-09-23 12:54 ` [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-09-25  0:18   ` Stefano Stabellini
  2021-09-27  8:47     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-25  0:18 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> In order for vPCI to work it needs to maintain guest and hardware
> domain's views on the configuration space. For example, BARs and
                 ^ of

> COMMAND registers require emulation for guests and the guest view
> on the registers needs to be in sync with the real contents of the
  ^ of

> relevant registers. For that ECAM address space needs to also be
> trapped for the hardware domain, so we need to implement PCI host
> bridge specific callbacks to properly setup MMIO handlers for those
> ranges depending on particular host bridge implementation.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

The patch looks pretty good, only a couple of minor comments below


> ---
> Since v1:
>  - Dynamically calculate the number of MMIO handlers required for vPCI
>    and update the total number accordingly
>  - s/clb/cb
>  - Do not introduce a new callback for MMIO handler setup
> ---
>  xen/arch/arm/domain.c              |  2 ++
>  xen/arch/arm/pci/pci-host-common.c | 28 +++++++++++++++++++++++++
>  xen/arch/arm/vpci.c                | 33 ++++++++++++++++++++++++++++++
>  xen/arch/arm/vpci.h                |  6 ++++++
>  xen/include/asm-arm/pci.h          |  7 +++++++
>  5 files changed, 76 insertions(+)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 854e8fed0393..c7b25bc70439 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -733,6 +733,8 @@ int arch_domain_create(struct domain *d,
>      if ( (rc = domain_vgic_register(d, &count)) != 0 )
>          goto fail;
>  
> +    count += domain_vpci_get_num_mmio_handlers(d);
> +
>      if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>          goto fail;
>  
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 1567b6e2956c..155f2a2743af 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -300,6 +300,34 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
>      }
>      return bridge->dt_node;
>  }
> +
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*cb)(struct domain *d,
> +                                       struct pci_host_bridge *bridge))
> +{
> +    struct pci_host_bridge *bridge;
> +    int err;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        err = cb(d, bridge);
> +        if ( err )
> +            return err;
> +    }
> +    return 0;
> +}
> +
> +int pci_host_get_num_bridges(void)
> +{
> +    struct pci_host_bridge *bridge;
> +    int count = 0;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +        count++;
> +
> +    return count;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> index 76c12b92814f..14947e975d69 100644
> --- a/xen/arch/arm/vpci.c
> +++ b/xen/arch/arm/vpci.c
> @@ -80,17 +80,50 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>      .write = vpci_mmio_write,
>  };
>  
> +static int vpci_setup_mmio_handler(struct domain *d,
> +                                   struct pci_host_bridge *bridge)
> +{
> +    struct pci_config_window *cfg = bridge->cfg;
> +
> +    register_mmio_handler(d, &vpci_mmio_handler,
> +                          cfg->phys_addr, cfg->size, NULL);
> +    return 0;
> +}
> +
>  int domain_vpci_init(struct domain *d)
>  {
>      if ( !has_vpci(d) )
>          return 0;
>  
> +    if ( is_hardware_domain(d) )
> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
> +
> +    /* Guest domains use what is programmed in their device tree. */
>      register_mmio_handler(d, &vpci_mmio_handler,
>                            GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>  
>      return 0;
>  }
>  
> +int domain_vpci_get_num_mmio_handlers(struct domain *d)
> +{
> +    int count = 0;
> +
> +    if ( is_hardware_domain(d) )
> +        /* For each PCI host bridge's configuration space. */
> +        count += pci_host_get_num_bridges();

NIT: why += instead of = ?


> +    else
> +        /*
> +         * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
> +         * being passed through. Maximum number of supported devices
> +         * is 32 as virtual bus topology emulates the devices as embedded
> +         * endpoints.
> +         * +1 for a single emulated host bridge's configuration space. */
> +        count = VPCI_MSIX_MEM_NUM * 32 + 1;
> +
> +    return count;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
> index d8a7b0e3e802..27a2b069abd2 100644
> --- a/xen/arch/arm/vpci.h
> +++ b/xen/arch/arm/vpci.h
> @@ -17,11 +17,17 @@
>  
>  #ifdef CONFIG_HAS_VPCI
>  int domain_vpci_init(struct domain *d);
> +int domain_vpci_get_num_mmio_handlers(struct domain *d);
>  #else
>  static inline int domain_vpci_init(struct domain *d)
>  {
>      return 0;
>  }
> +
> +static inline int domain_vpci_get_num_mmio_handlers(struct domain *d)
> +{
> +    return 0;
> +}
>  #endif
>  
>  #endif /* __ARCH_ARM_VPCI_H__ */
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 5b100556225e..7618f0b6725b 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -15,6 +15,8 @@
>  #ifndef __ARM_PCI_H__
>  #define __ARM_PCI_H__
>  
> +#include <asm/mmio.h>

This should be no longer required.


>  #ifdef CONFIG_HAS_PCI
>  
>  #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
> @@ -109,6 +111,11 @@ static always_inline bool is_pci_passthrough_enabled(void)
>  {
>      return !!pci_passthrough_enabled;
>  }
> +
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*clb)(struct domain *d,
> +                                        struct pci_host_bridge *bridge));
> +int pci_host_get_num_bridges(void);
>  #else   /*!CONFIG_HAS_PCI*/
>  
>  #define pci_passthrough_enabled (false)



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-23 12:54 ` [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-09-25  0:44   ` Stefano Stabellini
  2021-09-27 12:44     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-25  0:44 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> PCI host bridges are special devices in terms of implementing PCI
> passthrough. According to [1] the current implementation depends on
> Domain-0 to perform the initialization of the relevant PCI host
> bridge hardware and perform PCI device enumeration. In order to
> achieve that one of the required changes is to not map all the memory
> ranges in map_range_to_domain as we traverse the device tree on startup
> and perform some additional checks if the range needs to be mapped to
> Domain-0.
> 
> The generic PCI host controller device tree binding says [2]:
> - ranges: As described in IEEE Std 1275-1994, but must provide
>           at least a definition of non-prefetchable memory. One
>           or both of prefetchable Memory and IO Space may also
>           be provided.
> 
> - reg   : The Configuration Space base address and size, as accessed
>           from the parent bus.  The base address corresponds to
>           the first bus in the "bus-range" property.  If no
>           "bus-range" is specified, this will be bus 0 (the default).
> 
> >From the above none of the memory ranges from the "ranges" property
> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
> be handled dynamically by vPCI as we assign PCI devices, e.g. each
> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
> as needed by Xen.
> 
> The "reg" property covers not only ECAM space, but may also have other
> then the configuration memory ranges described, for example [3]:
> - reg: Should contain rc_dbi, config registers location and length.
> - reg-names: Must include the following entries:
>    "rc_dbi": controller configuration registers;
>    "config": PCIe configuration space registers.
> 
> This patch makes it possible to not map all the ranges from the
> "ranges" property and also ECAM from the "reg". All the rest from the
> "reg" property still needs to be mapped to Domain-0, so the PCI
> host bridge remains functional in Domain-0.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
> [2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
> [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> ---
> Since v1:
>  - Added better description of why and what needs to be mapped into
>    Domain-0's p2m and what doesn't
>  - Do not do any mappings for PCI devices while traversing the DT
>  - Walk all the bridges and make required mappings in one go
> ---
>  xen/arch/arm/domain_build.c        | 38 +++++++++++++++--------
>  xen/arch/arm/pci/ecam.c            | 14 +++++++++
>  xen/arch/arm/pci/pci-host-common.c | 48 ++++++++++++++++++++++++++++++
>  xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>  xen/include/asm-arm/pci.h          |  9 ++++++
>  xen/include/asm-arm/setup.h        | 13 ++++++++
>  6 files changed, 111 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 83ab0d52cce9..e72c1b881cae 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -10,7 +10,6 @@
>  #include <asm/regs.h>
>  #include <xen/errno.h>
>  #include <xen/err.h>
> -#include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/guest_access.h>
>  #include <xen/iocap.h>
> @@ -47,12 +46,6 @@ static int __init parse_dom0_mem(const char *s)
>  }
>  custom_param("dom0_mem", parse_dom0_mem);
>  
> -struct map_range_data
> -{
> -    struct domain *d;
> -    p2m_type_t p2mt;
> -};
> -
>  /* Override macros from asm/page.h to make them work with mfn_t */
>  #undef virt_to_mfn
>  #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> @@ -1388,9 +1381,8 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>      return 0;
>  }
>  
> -static int __init map_range_to_domain(const struct dt_device_node *dev,
> -                                      u64 addr, u64 len,
> -                                      void *data)
> +int __init map_range_to_domain(const struct dt_device_node *dev,
> +                               u64 addr, u64 len, void *data)
>  {
>      struct map_range_data *mr_data = data;
>      struct domain *d = mr_data->d;
> @@ -1417,6 +1409,13 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>          }
>      }
>  
> +#ifdef CONFIG_HAS_PCI
> +    if ( is_pci_passthrough_enabled() &&
> +         (device_get_class(dev) == DEVICE_PCI) &&
> +         !mr_data->map_pci_bridge )
> +        need_mapping = false;
> +#endif

With the change I suggested below turning map_pci_bridge into
skip_mapping, then this check could go away if we just set need_mapping
as follows:

bool need_mapping = !dt_device_for_passthrough(dev) &&
                    !mr_data->skip_mapping;


>      if ( need_mapping )
>      {
>          res = map_regions_p2mt(d,
> @@ -1450,7 +1449,11 @@ static int __init map_device_children(struct domain *d,
>                                        const struct dt_device_node *dev,
>                                        p2m_type_t p2mt)
>  {
> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2mt,
> +        .map_pci_bridge = false
> +    };
>      int ret;
>  
>      if ( dt_device_type_is_equal(dev, "pci") )
> @@ -1582,7 +1585,11 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>      /* Give permission and map MMIOs */
>      for ( i = 0; i < naddr; i++ )
>      {
> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> +        struct map_range_data mr_data = {
> +            .d = d,
> +            .p2mt = p2mt,
> +            .map_pci_bridge = false
> +        };
>          res = dt_device_get_address(dev, i, &addr, &size);
>          if ( res )
>          {
> @@ -2754,7 +2761,14 @@ static int __init construct_dom0(struct domain *d)
>          return rc;
>  
>      if ( acpi_disabled )
> +    {
>          rc = prepare_dtb_hwdom(d, &kinfo);
> +#ifdef CONFIG_HAS_PCI
> +        if ( rc < 0 )
> +            return rc;

This doesn't look great :-)

I would move the call to pci_host_bridge_mappings() below just before
construct_domain.


> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
> +#endif
> +    }
>      else
>          rc = prepare_acpi(d, &kinfo);
>  
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 9b88b1cedaa2..eae177f2cbc2 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -39,6 +39,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>      return base + (PCI_DEVFN2(sbdf) << devfn_shift) + where;
>  }
>  
> +bool pci_ecam_need_p2m_mapping(struct domain *d,
> +                               struct pci_host_bridge *bridge,
> +                               uint64_t addr)
> +{
> +    struct pci_config_window *cfg = bridge->cfg;
> +
> +    /*
> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
> +     * so we can trap access to it.
> +     */
> +    return cfg->phys_addr != addr;
> +}
> +
>  /* ECAM ops */
>  const struct pci_ecam_ops pci_generic_ecam_ops = {
>      .bus_shift  = 20,
> @@ -46,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>          .map_bus                = pci_ecam_map_bus,
>          .read                   = pci_generic_config_read,
>          .write                  = pci_generic_config_write,
> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
>      }
>  };
>  
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 155f2a2743af..f350826ea26b 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -18,6 +18,7 @@
>  
>  #include <xen/init.h>
>  #include <xen/pci.h>
> +#include <asm/setup.h>
>  #include <xen/rwlock.h>
>  #include <xen/sched.h>
>  #include <xen/vmap.h>
> @@ -328,6 +329,53 @@ int pci_host_get_num_bridges(void)
>      return count;
>  }
>  
> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
> +{
> +    struct pci_host_bridge *bridge;
> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2mt,
> +        .map_pci_bridge = true
> +    };
> +
> +    /*
> +     * For each PCI host bridge we need to only map those ranges
> +     * which are used by Domain-0 to properly initialize the bridge,
> +     * e.g. we do not want to map ECAM configuration space which lives in
> +     * "reg" or "assigned-addresses" device tree property.
> +     * Neither we want to map any of the MMIO ranges found in the "ranges"
> +     * device tree property.
> +     */
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        const struct dt_device_node *dev = bridge->dt_node;
> +        int i;

i should be unsigned int


> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
> +        {
> +            uint64_t addr, size;
> +            int err;
> +
> +            err = dt_device_get_address(dev, i, &addr, &size);
> +            if ( err )
> +            {
> +                printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",

Maybe rephrase it to:

    Unable to retrieve address range index=%u for %s


> +                       i, dt_node_full_name(dev));
> +                return err;
> +            }
> +
> +            if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )

The current implementation of need_p2m_mapping filters out ECAM and
nothing else. Just double-checking here: is there anything else we
should filter out? Looking at the device tree pci node for ZynqMP:

			reg = <0x0 0xfd0e0000 0x0 0x1000 0x0 0xfd480000 0x0 0x1000 0x80 0x0 0x0 0x1000000>;
			reg-names = "breg", "pcireg", "cfg";

We are filtering out cfg, but do we need both "breg" and "pcireg" here?

If not, do we need another function like .cfg_reg_index to know what we
actually need to map?


> +            {
> +                err = map_range_to_domain(dev, addr, size, &mr_data);
> +                if ( err )
> +                    return err;
> +            }
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
> index c27b4ea9f02f..adbe3627871f 100644
> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
> @@ -33,6 +33,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>          .map_bus                = pci_ecam_map_bus,
>          .read                   = pci_generic_config_read,
>          .write                  = pci_generic_config_write,
> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
>      }
>  };
>  
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 7618f0b6725b..b81f66e813ef 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -19,6 +19,8 @@
>  
>  #ifdef CONFIG_HAS_PCI
>  
> +#include <asm/p2m.h>
> +
>  #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>  #define PRI_pci "%04x:%02x:%02x.%u"
>  
> @@ -79,6 +81,9 @@ struct pci_ops {
>                  uint32_t reg, uint32_t len, uint32_t *value);
>      int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
>                   uint32_t reg, uint32_t len, uint32_t value);
> +    bool (*need_p2m_mapping)(struct domain *d,
> +                             struct pci_host_bridge *bridge,
> +                             uint64_t addr);

I would call this function: need_p2m_hwdom_mapping


>  };
>  
>  /*
> @@ -102,6 +107,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, uint32_t sbdf,
>                              uint32_t reg, uint32_t len, uint32_t value);
>  void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>                                 uint32_t sbdf, uint32_t where);
> +bool pci_ecam_need_p2m_mapping(struct domain *d,
> +                               struct pci_host_bridge *bridge,
> +                               uint64_t addr);
>  struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>  int pci_get_host_bridge_segment(const struct dt_device_node *node,
>                                  uint16_t *segment);
> @@ -116,6 +124,7 @@ int pci_host_iterate_bridges(struct domain *d,
>                               int (*clb)(struct domain *d,
>                                          struct pci_host_bridge *bridge));
>  int pci_host_get_num_bridges(void);
> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
>  #else   /*!CONFIG_HAS_PCI*/
>  
>  #define pci_passthrough_enabled (false)
> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> index 95da0b7ab9cd..21863dd2bc58 100644
> --- a/xen/include/asm-arm/setup.h
> +++ b/xen/include/asm-arm/setup.h
> @@ -2,6 +2,8 @@
>  #define __ARM_SETUP_H_
>  
>  #include <public/version.h>
> +#include <asm/p2m.h>
> +#include <xen/device_tree.h>
>  
>  #define MIN_FDT_ALIGN 8
>  #define MAX_FDT_SIZE SZ_2M
> @@ -77,6 +79,14 @@ struct bootinfo {
>  #endif
>  };
>  
> +struct map_range_data
> +{
> +    struct domain *d;
> +    p2m_type_t p2mt;
> +    /* Set if mappings for PCI host bridges must not be skipped. */
> +    bool map_pci_bridge;

To make this more generally applicable, I would call the new property:

  bool skip_mapping;

and it could apply to any class of devices. All current users would set
it to false except for pci_host_bridge_mappings.


> +};
>
>  extern struct bootinfo bootinfo;
>  
>  extern domid_t max_init_domid;
> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>  u32 device_tree_get_u32(const void *fdt, int node,
>                          const char *prop_name, u32 dflt);
>  
> +int map_range_to_domain(const struct dt_device_node *dev,
> +                        u64 addr, u64 len, void *data);
> +
>  #endif
>  /*
>   * Local variables:
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations
  2021-09-23 12:54 ` [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
@ 2021-09-25  1:20   ` Stefano Stabellini
  2021-09-27  8:06   ` Jan Beulich
  1 sibling, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-25  1:20 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko, jbeulich, andrew.cooper3

+ x86 maintainers

On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> vPCI may map and unmap PCI device memory (BARs) being passed through which
> may take a lot of time. For this those operations may be deferred to be
> performed later, so that they can be safely preempted.
> Run the corresponding vPCI code while switching a vCPU.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

From an ARM point of view, I think the code change is OK.

Only one note: it would be good to add to the commit message a short
list of relevant TODOs which in particular affect this patch.

Something like:

Please be aware that there are a few outstanding TODOs affecting this
code path, see xen/drivers/vpci/header.c:map_range and
xen/drivers/vpci/header.c:vpci_process_pending.


> ---
> Since v1:
>  - Moved the check for pending vpci work from the common IOREQ code
>    to hvm_do_resume on x86
>  - Re-worked the code for Arm to ensure we don't miss pending vPCI work
> ---
>  xen/arch/arm/traps.c   | 13 +++++++++++++
>  xen/arch/x86/hvm/hvm.c |  6 ++++++
>  xen/common/ioreq.c     |  9 ---------
>  3 files changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 219ab3c3fbde..b246f51086e3 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -34,6 +34,7 @@
>  #include <xen/symbols.h>
>  #include <xen/version.h>
>  #include <xen/virtual_region.h>
> +#include <xen/vpci.h>
>  
>  #include <public/sched.h>
>  #include <public/xen.h>
> @@ -2304,6 +2305,18 @@ static bool check_for_vcpu_work(void)
>      }
>  #endif
>  
> +    if ( has_vpci(v->domain) )
> +    {
> +        bool pending;
> +
> +        local_irq_enable();
> +        pending = vpci_process_pending(v);
> +        local_irq_disable();
> +
> +        if ( pending )
> +            return true;
> +    }
> +
>      if ( likely(!v->arch.need_flush_to_ram) )
>          return false;
>  
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 7b48a1b925bb..d32f5d572941 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -549,6 +549,12 @@ void hvm_do_resume(struct vcpu *v)
>      if ( !vcpu_ioreq_handle_completion(v) )
>          return;
>  
> +    if ( has_vpci(v->domain) && vpci_process_pending(v) )
> +    {
> +        raise_softirq(SCHEDULE_SOFTIRQ);
> +        return;
> +    }
> +
>      if ( unlikely(v->arch.vm_event) )
>          hvm_vm_event_do_resume(v);
>  
> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index d732dc045df9..689d256544c8 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -25,9 +25,7 @@
>  #include <xen/lib.h>
>  #include <xen/paging.h>
>  #include <xen/sched.h>
> -#include <xen/softirq.h>
>  #include <xen/trace.h>
> -#include <xen/vpci.h>
>  
>  #include <asm/guest_atomics.h>
>  #include <asm/ioreq.h>
> @@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
>  
>  bool vcpu_ioreq_handle_completion(struct vcpu *v)
>  {
> -    struct domain *d = v->domain;
>      struct vcpu_io *vio = &v->io;
>      struct ioreq_server *s;
>      struct ioreq_vcpu *sv;
>      enum vio_completion completion;
>      bool res = true;
>  
> -    if ( has_vpci(d) && vpci_process_pending(v) )
> -    {
> -        raise_softirq(SCHEDULE_SOFTIRQ);
> -        return false;
> -    }
> -
>      while ( (sv = get_pending_vcpu(v, &s)) != NULL )
>          if ( !wait_for_io(sv, get_ioreq(s, v)) )
>              return false;
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-24 23:54   ` Stefano Stabellini
@ 2021-09-27  7:13     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  7:13 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko


On 25.09.21 02:54, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> While adding a PCI device mark it as such, so other frameworks
>> can distinguish it form DT devices.
>                       ^ from
Will fix
>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>
>
>> ---
>> Since v1:
>>   - Moved the assignment from iommu_add_device to alloc_pdev
>> ---
>>   xen/drivers/passthrough/pci.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>> index 633e89ac1311..fc3469bc12dc 100644
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>       *((u8*) &pdev->bus) = bus;
>>       *((u8*) &pdev->devfn) = devfn;
>>       pdev->domain = NULL;
>> +#ifdef CONFIG_ARM
>> +    pci_to_dev(pdev)->type = DEV_PCI;
>> +#endif
>>   
>>       rc = pdev_msi_init(pdev);
>>       if ( rc )
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains
  2021-09-25  0:00   ` Stefano Stabellini
@ 2021-09-27  7:17     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  7:17 UTC (permalink / raw)
  To: Stefano Stabellini, roger.pau
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko, Ian Jackson, Juergen Gross, jbeulich,
	andrew.cooper3


On 25.09.21 03:00, Stefano Stabellini wrote:
> +x86 maintainers
>
> On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> The PCI device remove path may now be used by PVH on ARM, so the
>> assert is no longer valid.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> Cc: Ian Jackson <iwj@xenproject.org>
>> Cc: Juergen Gross <jgross@suse.com>
>>
>> ---
>>   tools/libs/light/libxl_pci.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 1a1c2630803b..59f3686fc85e 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1947,8 +1947,6 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>>               goto out_fail;
>>           }
>>       } else {
>> -        assert(type == LIBXL_DOMAIN_TYPE_PV);
> This is fine for ARM, but is it OK from an x86 point of view considering
> the PVH implications?

Long time ago I was asking Roger about that. At first glance the change

seemed to be ok, but Roger could you please confirm this?

Thank you in advance,

Oleksandr

>
>
>>           char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
>>                                        pci->bus, pci->dev, pci->func);
>>           FILE *f = fopen(sysfs_path, "r");
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 02/11] xen/arm: Add new device type for PCI
  2021-09-23 12:54 ` [PATCH v2 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
  2021-09-24 23:45   ` Stefano Stabellini
@ 2021-09-27  7:41   ` Jan Beulich
  2021-09-27  8:18     ` Oleksandr Andrushchenko
  1 sibling, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-27  7:41 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, xen-devel

On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
> --- a/xen/arch/arm/pci/pci.c
> +++ b/xen/arch/arm/pci/pci.c
> @@ -27,6 +27,16 @@ int arch_pci_clean_pirqs(struct domain *d)
>      return 0;
>  }
>  
> +struct pci_dev *dev_to_pci(struct device *dev)
> +{
> +    struct arch_pci_dev *arch_dev;
> +
> +    ASSERT(dev->type == DEV_PCI);
> +
> +    arch_dev = container_of((dev), struct arch_pci_dev, dev);

Nit: This not being a macro, the inner parentheses aren't needed.

> +    return container_of(arch_dev, struct pci_dev, arch);

Two successive container_of() on the same pointer look odd. Is
there a reason a single one wouldn't do?

    return container_of(dev, struct pci_dev, arch.dev);

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-23 12:54 ` [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
  2021-09-24 23:54   ` Stefano Stabellini
@ 2021-09-27  7:45   ` Jan Beulich
  2021-09-27  8:45     ` Oleksandr Andrushchenko
  1 sibling, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-27  7:45 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, xen-devel

On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>      *((u8*) &pdev->bus) = bus;
>      *((u8*) &pdev->devfn) = devfn;
>      pdev->domain = NULL;
> +#ifdef CONFIG_ARM
> +    pci_to_dev(pdev)->type = DEV_PCI;
> +#endif

I have to admit that I'm not happy about new CONFIG_<arch> conditionals
here. I'd prefer to see this done by a new arch helper, unless there are
obstacles I'm overlooking.

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations
  2021-09-23 12:54 ` [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
  2021-09-25  1:20   ` Stefano Stabellini
@ 2021-09-27  8:06   ` Jan Beulich
  2021-09-27  8:39     ` Oleksandr Andrushchenko
  1 sibling, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-27  8:06 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, xen-devel

+ Paul (retaining full context for this reason)

On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> vPCI may map and unmap PCI device memory (BARs) being passed through which
> may take a lot of time. For this those operations may be deferred to be
> performed later, so that they can be safely preempted.
> Run the corresponding vPCI code while switching a vCPU.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> ---
> Since v1:
>  - Moved the check for pending vpci work from the common IOREQ code
>    to hvm_do_resume on x86

While perhaps obvious for Arm folks, I'd like to see the reason for this
spelled out in the description.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -549,6 +549,12 @@ void hvm_do_resume(struct vcpu *v)
>      if ( !vcpu_ioreq_handle_completion(v) )
>          return;
>  
> +    if ( has_vpci(v->domain) && vpci_process_pending(v) )
> +    {
> +        raise_softirq(SCHEDULE_SOFTIRQ);
> +        return;
> +    }

Note that you're altering behavior here: Originally this was done ...

> @@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
>  
>  bool vcpu_ioreq_handle_completion(struct vcpu *v)
>  {
> -    struct domain *d = v->domain;
>      struct vcpu_io *vio = &v->io;
>      struct ioreq_server *s;
>      struct ioreq_vcpu *sv;
>      enum vio_completion completion;
>      bool res = true;
>  
> -    if ( has_vpci(d) && vpci_process_pending(v) )
> -    {
> -        raise_softirq(SCHEDULE_SOFTIRQ);
> -        return false;
> -    }

... first thing. And I think it wants (perhaps even needs) to remain
that way; otherwise you'll need to explain why not, and why the change
is correct / safe.

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 02/11] xen/arm: Add new device type for PCI
  2021-09-27  7:41   ` Jan Beulich
@ 2021-09-27  8:18     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  8:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko, xen-devel


On 27.09.21 10:41, Jan Beulich wrote:
> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>> --- a/xen/arch/arm/pci/pci.c
>> +++ b/xen/arch/arm/pci/pci.c
>> @@ -27,6 +27,16 @@ int arch_pci_clean_pirqs(struct domain *d)
>>       return 0;
>>   }
>>   
>> +struct pci_dev *dev_to_pci(struct device *dev)
>> +{
>> +    struct arch_pci_dev *arch_dev;
>> +
>> +    ASSERT(dev->type == DEV_PCI);
>> +
>> +    arch_dev = container_of((dev), struct arch_pci_dev, dev);
> Nit: This not being a macro, the inner parentheses aren't needed.
>
>> +    return container_of(arch_dev, struct pci_dev, arch);
> Two successive container_of() on the same pointer look odd. Is
> there a reason a single one wouldn't do?
>
>      return container_of(dev, struct pci_dev, arch.dev);
Good catch, thank you! No reason at all, will fix
>
> Jan
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations
  2021-09-27  8:06   ` Jan Beulich
@ 2021-09-27  8:39     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  8:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko, xen-devel


On 27.09.21 11:06, Jan Beulich wrote:
> + Paul (retaining full context for this reason)
>
> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> vPCI may map and unmap PCI device memory (BARs) being passed through which
>> may take a lot of time. For this those operations may be deferred to be
>> performed later, so that they can be safely preempted.
>> Run the corresponding vPCI code while switching a vCPU.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> ---
>> Since v1:
>>   - Moved the check for pending vpci work from the common IOREQ code
>>     to hvm_do_resume on x86
> While perhaps obvious for Arm folks, I'd like to see the reason for this
> spelled out in the description.

I will add:

IOREQ is not enabled for Arm by default, so pending vPCI work has

no chance to be executed if the processing is left as is in the common IOREQ

code only. For that reason Arm processes that in arch specific code which results

in that the processing happens twice for Arm when IOREQ is enabled.

For x86, processing of vPCI in IOREQ code also doesn't seem to be the right

place, so move that to hvm_do_resume.

>
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -549,6 +549,12 @@ void hvm_do_resume(struct vcpu *v)
>>       if ( !vcpu_ioreq_handle_completion(v) )
>>           return;
>>   
>> +    if ( has_vpci(v->domain) && vpci_process_pending(v) )
>> +    {
>> +        raise_softirq(SCHEDULE_SOFTIRQ);
>> +        return;
>> +    }
> Note that you're altering behavior here: Originally this was done ...
>
>> @@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
>>   
>>   bool vcpu_ioreq_handle_completion(struct vcpu *v)
>>   {
>> -    struct domain *d = v->domain;
>>       struct vcpu_io *vio = &v->io;
>>       struct ioreq_server *s;
>>       struct ioreq_vcpu *sv;
>>       enum vio_completion completion;
>>       bool res = true;
>>   
>> -    if ( has_vpci(d) && vpci_process_pending(v) )
>> -    {
>> -        raise_softirq(SCHEDULE_SOFTIRQ);
>> -        return false;
>> -    }
> ... first thing. And I think it wants (perhaps even needs) to remain
> that way;
Will make it the first operation
>   otherwise you'll need to explain why not, and why the change
> is correct / safe.
>
> Jan
>
Thank you,

Oleksandr

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27  7:45   ` Jan Beulich
@ 2021-09-27  8:45     ` Oleksandr Andrushchenko
  2021-09-27  9:19       ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  8:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko, xen-devel


On 27.09.21 10:45, Jan Beulich wrote:
> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>       *((u8*) &pdev->bus) = bus;
>>       *((u8*) &pdev->devfn) = devfn;
>>       pdev->domain = NULL;
>> +#ifdef CONFIG_ARM
>> +    pci_to_dev(pdev)->type = DEV_PCI;
>> +#endif
> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
> here. I'd prefer to see this done by a new arch helper, unless there are
> obstacles I'm overlooking.

Do you mean something like arch_pci_alloc_pdev(dev)?

If so, where should we put this call? At the very beginning of alloc_pdev

or at the bottom just before returning from alloc_pdev?

>
> Jan
>
Thank you,

Oleksandr

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
  2021-09-25  0:18   ` Stefano Stabellini
@ 2021-09-27  8:47     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  8:47 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko


On 25.09.21 03:18, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> In order for vPCI to work it needs to maintain guest and hardware
>> domain's views on the configuration space. For example, BARs and
>                   ^ of
>
Ack
>> COMMAND registers require emulation for guests and the guest view
>> on the registers needs to be in sync with the real contents of the
>    ^ of
Ack
>
>> relevant registers. For that ECAM address space needs to also be
>> trapped for the hardware domain, so we need to implement PCI host
>> bridge specific callbacks to properly setup MMIO handlers for those
>> ranges depending on particular host bridge implementation.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> The patch looks pretty good, only a couple of minor comments below
>
>
>> ---
>> Since v1:
>>   - Dynamically calculate the number of MMIO handlers required for vPCI
>>     and update the total number accordingly
>>   - s/clb/cb
>>   - Do not introduce a new callback for MMIO handler setup
>> ---
>>   xen/arch/arm/domain.c              |  2 ++
>>   xen/arch/arm/pci/pci-host-common.c | 28 +++++++++++++++++++++++++
>>   xen/arch/arm/vpci.c                | 33 ++++++++++++++++++++++++++++++
>>   xen/arch/arm/vpci.h                |  6 ++++++
>>   xen/include/asm-arm/pci.h          |  7 +++++++
>>   5 files changed, 76 insertions(+)
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 854e8fed0393..c7b25bc70439 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -733,6 +733,8 @@ int arch_domain_create(struct domain *d,
>>       if ( (rc = domain_vgic_register(d, &count)) != 0 )
>>           goto fail;
>>   
>> +    count += domain_vpci_get_num_mmio_handlers(d);
>> +
>>       if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>>           goto fail;
>>   
>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index 1567b6e2956c..155f2a2743af 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -300,6 +300,34 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
>>       }
>>       return bridge->dt_node;
>>   }
>> +
>> +int pci_host_iterate_bridges(struct domain *d,
>> +                             int (*cb)(struct domain *d,
>> +                                       struct pci_host_bridge *bridge))
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    int err;
>> +
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +    {
>> +        err = cb(d, bridge);
>> +        if ( err )
>> +            return err;
>> +    }
>> +    return 0;
>> +}
>> +
>> +int pci_host_get_num_bridges(void)
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    int count = 0;
>> +
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +        count++;
>> +
>> +    return count;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> index 76c12b92814f..14947e975d69 100644
>> --- a/xen/arch/arm/vpci.c
>> +++ b/xen/arch/arm/vpci.c
>> @@ -80,17 +80,50 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>>       .write = vpci_mmio_write,
>>   };
>>   
>> +static int vpci_setup_mmio_handler(struct domain *d,
>> +                                   struct pci_host_bridge *bridge)
>> +{
>> +    struct pci_config_window *cfg = bridge->cfg;
>> +
>> +    register_mmio_handler(d, &vpci_mmio_handler,
>> +                          cfg->phys_addr, cfg->size, NULL);
>> +    return 0;
>> +}
>> +
>>   int domain_vpci_init(struct domain *d)
>>   {
>>       if ( !has_vpci(d) )
>>           return 0;
>>   
>> +    if ( is_hardware_domain(d) )
>> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
>> +
>> +    /* Guest domains use what is programmed in their device tree. */
>>       register_mmio_handler(d, &vpci_mmio_handler,
>>                             GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>>   
>>       return 0;
>>   }
>>   
>> +int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> +{
>> +    int count = 0;
>> +
>> +    if ( is_hardware_domain(d) )
>> +        /* For each PCI host bridge's configuration space. */
>> +        count += pci_host_get_num_bridges();
> NIT: why += instead of = ?
Will fix
>
>
>> +    else
>> +        /*
>> +         * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
>> +         * being passed through. Maximum number of supported devices
>> +         * is 32 as virtual bus topology emulates the devices as embedded
>> +         * endpoints.
>> +         * +1 for a single emulated host bridge's configuration space. */
>> +        count = VPCI_MSIX_MEM_NUM * 32 + 1;
>> +
>> +    return count;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
>> index d8a7b0e3e802..27a2b069abd2 100644
>> --- a/xen/arch/arm/vpci.h
>> +++ b/xen/arch/arm/vpci.h
>> @@ -17,11 +17,17 @@
>>   
>>   #ifdef CONFIG_HAS_VPCI
>>   int domain_vpci_init(struct domain *d);
>> +int domain_vpci_get_num_mmio_handlers(struct domain *d);
>>   #else
>>   static inline int domain_vpci_init(struct domain *d)
>>   {
>>       return 0;
>>   }
>> +
>> +static inline int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> +{
>> +    return 0;
>> +}
>>   #endif
>>   
>>   #endif /* __ARCH_ARM_VPCI_H__ */
>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>> index 5b100556225e..7618f0b6725b 100644
>> --- a/xen/include/asm-arm/pci.h
>> +++ b/xen/include/asm-arm/pci.h
>> @@ -15,6 +15,8 @@
>>   #ifndef __ARM_PCI_H__
>>   #define __ARM_PCI_H__
>>   
>> +#include <asm/mmio.h>
> This should be no longer required.
I'll check and remove
>
>
>>   #ifdef CONFIG_HAS_PCI
>>   
>>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>> @@ -109,6 +111,11 @@ static always_inline bool is_pci_passthrough_enabled(void)
>>   {
>>       return !!pci_passthrough_enabled;
>>   }
>> +
>> +int pci_host_iterate_bridges(struct domain *d,
>> +                             int (*clb)(struct domain *d,
>> +                                        struct pci_host_bridge *bridge));
>> +int pci_host_get_num_bridges(void);
>>   #else   /*!CONFIG_HAS_PCI*/
>>   
>>   #define pci_passthrough_enabled (false)

Thank you,

Oleksandr

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27  8:45     ` Oleksandr Andrushchenko
@ 2021-09-27  9:19       ` Jan Beulich
  2021-09-27  9:35         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-27  9:19 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel

On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
> 
> On 27.09.21 10:45, Jan Beulich wrote:
>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>> --- a/xen/drivers/passthrough/pci.c
>>> +++ b/xen/drivers/passthrough/pci.c
>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>       *((u8*) &pdev->bus) = bus;
>>>       *((u8*) &pdev->devfn) = devfn;
>>>       pdev->domain = NULL;
>>> +#ifdef CONFIG_ARM
>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>> +#endif
>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>> here. I'd prefer to see this done by a new arch helper, unless there are
>> obstacles I'm overlooking.
> 
> Do you mean something like arch_pci_alloc_pdev(dev)?

I'd recommend against "alloc" in its name; "new" instead maybe?

> If so, where should we put this call? At the very beginning of alloc_pdev
> or at the bottom just before returning from alloc_pdev?

Right where you have the #ifdef right now, I would say (separated by
a blank line).

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27  9:19       ` Jan Beulich
@ 2021-09-27  9:35         ` Oleksandr Andrushchenko
  2021-09-27 10:00           ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27  9:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel, Oleksandr Andrushchenko


On 27.09.21 12:19, Jan Beulich wrote:
> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>> On 27.09.21 10:45, Jan Beulich wrote:
>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>        *((u8*) &pdev->bus) = bus;
>>>>        *((u8*) &pdev->devfn) = devfn;
>>>>        pdev->domain = NULL;
>>>> +#ifdef CONFIG_ARM
>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>> +#endif
>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>> obstacles I'm overlooking.
>> Do you mean something like arch_pci_alloc_pdev(dev)?
> I'd recommend against "alloc" in its name; "new" instead maybe?

I am fine with arch_pci_new_pdev, but arch prefix points to the fact that

this is just an architecture specific part of the pdev allocation rather than

actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems

more natural to me.

>
>> If so, where should we put this call? At the very beginning of alloc_pdev
>> or at the bottom just before returning from alloc_pdev?
> Right where you have the #ifdef right now, I would say (separated by
> a blank line).
Ok
>
> Jan
>
>
Thank you,

Oleksandr

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27  9:35         ` Oleksandr Andrushchenko
@ 2021-09-27 10:00           ` Jan Beulich
  2021-09-27 10:04             ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-27 10:00 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel

On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
> 
> On 27.09.21 12:19, Jan Beulich wrote:
>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>        *((u8*) &pdev->bus) = bus;
>>>>>        *((u8*) &pdev->devfn) = devfn;
>>>>>        pdev->domain = NULL;
>>>>> +#ifdef CONFIG_ARM
>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>> +#endif
>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>> obstacles I'm overlooking.
>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>> I'd recommend against "alloc" in its name; "new" instead maybe?
> 
> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
> this is just an architecture specific part of the pdev allocation rather than
> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
> more natural to me.

The bulk of the function is about populating the just allocated struct.
There's no arch-specific part of the allocation (so far, leaving aside
MSI-X), you only want and arch-specific part of the initialization. I
would agree with "alloc" in the name if further allocation was to
happen there.

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27 10:00           ` Jan Beulich
@ 2021-09-27 10:04             ` Oleksandr Andrushchenko
  2021-09-27 10:26               ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27 10:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel


On 27.09.21 13:00, Jan Beulich wrote:
> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>> On 27.09.21 12:19, Jan Beulich wrote:
>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>         *((u8*) &pdev->bus) = bus;
>>>>>>         *((u8*) &pdev->devfn) = devfn;
>>>>>>         pdev->domain = NULL;
>>>>>> +#ifdef CONFIG_ARM
>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>> +#endif
>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>> obstacles I'm overlooking.
>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>> this is just an architecture specific part of the pdev allocation rather than
>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>> more natural to me.
> The bulk of the function is about populating the just allocated struct.
> There's no arch-specific part of the allocation (so far, leaving aside
> MSI-X), you only want and arch-specific part of the initialization. I
> would agree with "alloc" in the name if further allocation was to
> happen there.
Hm, then arch_pci_init_pdev sounds more reasonable
> Jan
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27 10:04             ` Oleksandr Andrushchenko
@ 2021-09-27 10:26               ` Jan Beulich
  2021-09-28  8:09                 ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-27 10:26 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel

On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
> 
> On 27.09.21 13:00, Jan Beulich wrote:
>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>         *((u8*) &pdev->bus) = bus;
>>>>>>>         *((u8*) &pdev->devfn) = devfn;
>>>>>>>         pdev->domain = NULL;
>>>>>>> +#ifdef CONFIG_ARM
>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>> +#endif
>>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>>> obstacles I'm overlooking.
>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>> this is just an architecture specific part of the pdev allocation rather than
>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>>> more natural to me.
>> The bulk of the function is about populating the just allocated struct.
>> There's no arch-specific part of the allocation (so far, leaving aside
>> MSI-X), you only want and arch-specific part of the initialization. I
>> would agree with "alloc" in the name if further allocation was to
>> happen there.
> Hm, then arch_pci_init_pdev sounds more reasonable

Fine with me.

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-25  0:44   ` Stefano Stabellini
@ 2021-09-27 12:44     ` Oleksandr Andrushchenko
  2021-09-28  4:00       ` Stefano Stabellini
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27 12:44 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko


On 25.09.21 03:44, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> PCI host bridges are special devices in terms of implementing PCI
>> passthrough. According to [1] the current implementation depends on
>> Domain-0 to perform the initialization of the relevant PCI host
>> bridge hardware and perform PCI device enumeration. In order to
>> achieve that one of the required changes is to not map all the memory
>> ranges in map_range_to_domain as we traverse the device tree on startup
>> and perform some additional checks if the range needs to be mapped to
>> Domain-0.
>>
>> The generic PCI host controller device tree binding says [2]:
>> - ranges: As described in IEEE Std 1275-1994, but must provide
>>            at least a definition of non-prefetchable memory. One
>>            or both of prefetchable Memory and IO Space may also
>>            be provided.
>>
>> - reg   : The Configuration Space base address and size, as accessed
>>            from the parent bus.  The base address corresponds to
>>            the first bus in the "bus-range" property.  If no
>>            "bus-range" is specified, this will be bus 0 (the default).
>>
>> >From the above none of the memory ranges from the "ranges" property
>> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
>> be handled dynamically by vPCI as we assign PCI devices, e.g. each
>> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
>> as needed by Xen.
>>
>> The "reg" property covers not only ECAM space, but may also have other
>> then the configuration memory ranges described, for example [3]:
>> - reg: Should contain rc_dbi, config registers location and length.
>> - reg-names: Must include the following entries:
>>     "rc_dbi": controller configuration registers;
>>     "config": PCIe configuration space registers.
>>
>> This patch makes it possible to not map all the ranges from the
>> "ranges" property and also ECAM from the "reg". All the rest from the
>> "reg" property still needs to be mapped to Domain-0, so the PCI
>> host bridge remains functional in Domain-0.
>>
>> [1] https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT1_ou4InSg$ [lists[.]xenproject[.]org]
>> [2] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT1_T5yn7GA$ [kernel[.]org]
>> [3] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT18im_Y2tw$ [kernel[.]org]
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> ---
>> Since v1:
>>   - Added better description of why and what needs to be mapped into
>>     Domain-0's p2m and what doesn't
>>   - Do not do any mappings for PCI devices while traversing the DT
>>   - Walk all the bridges and make required mappings in one go
>> ---
>>   xen/arch/arm/domain_build.c        | 38 +++++++++++++++--------
>>   xen/arch/arm/pci/ecam.c            | 14 +++++++++
>>   xen/arch/arm/pci/pci-host-common.c | 48 ++++++++++++++++++++++++++++++
>>   xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>>   xen/include/asm-arm/pci.h          |  9 ++++++
>>   xen/include/asm-arm/setup.h        | 13 ++++++++
>>   6 files changed, 111 insertions(+), 12 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 83ab0d52cce9..e72c1b881cae 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -10,7 +10,6 @@
>>   #include <asm/regs.h>
>>   #include <xen/errno.h>
>>   #include <xen/err.h>
>> -#include <xen/device_tree.h>
>>   #include <xen/libfdt/libfdt.h>
>>   #include <xen/guest_access.h>
>>   #include <xen/iocap.h>
>> @@ -47,12 +46,6 @@ static int __init parse_dom0_mem(const char *s)
>>   }
>>   custom_param("dom0_mem", parse_dom0_mem);
>>   
>> -struct map_range_data
>> -{
>> -    struct domain *d;
>> -    p2m_type_t p2mt;
>> -};
>> -
>>   /* Override macros from asm/page.h to make them work with mfn_t */
>>   #undef virt_to_mfn
>>   #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
>> @@ -1388,9 +1381,8 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>       return 0;
>>   }
>>   
>> -static int __init map_range_to_domain(const struct dt_device_node *dev,
>> -                                      u64 addr, u64 len,
>> -                                      void *data)
>> +int __init map_range_to_domain(const struct dt_device_node *dev,
>> +                               u64 addr, u64 len, void *data)
>>   {
>>       struct map_range_data *mr_data = data;
>>       struct domain *d = mr_data->d;
>> @@ -1417,6 +1409,13 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>           }
>>       }
>>   
>> +#ifdef CONFIG_HAS_PCI
>> +    if ( is_pci_passthrough_enabled() &&
>> +         (device_get_class(dev) == DEVICE_PCI) &&
>> +         !mr_data->map_pci_bridge )
>> +        need_mapping = false;
>> +#endif
> With the change I suggested below turning map_pci_bridge into
> skip_mapping, then this check could go away if we just set need_mapping
> as follows:
>
> bool need_mapping = !dt_device_for_passthrough(dev) &&
>                      !mr_data->skip_mapping;

Not exactly. This check, e.g.

"is_pci_passthrough_enabled() && (device_get_class(dev) == DEVICE_PCI)"

really protects us from mapping any of the ranges belonging to a PCI device:

we scan the device tree and for each node we call map_range_to_domain

with skip_mapping == false (it is called from map_device_children).

So, if there is no check then the mapping is performed even for PCI devices

which we do not want.

But, yes we can simplify the logic to:

bool need_mapping = !dt_device_for_passthrough(dev) &&
                     !mr_data->skip_mapping;

#ifdef CONFIG_HAS_PCI
     if ( need_mapping && is_pci_passthrough_enabled() &&
          (device_get_class(dev) == DEVICE_PCI) )
         need_mapping = false;
#endif

but I see no big profit from it.

>
>
>>       if ( need_mapping )
>>       {
>>           res = map_regions_p2mt(d,
>> @@ -1450,7 +1449,11 @@ static int __init map_device_children(struct domain *d,
>>                                         const struct dt_device_node *dev,
>>                                         p2m_type_t p2mt)
>>   {
>> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>> +    struct map_range_data mr_data = {
>> +        .d = d,
>> +        .p2mt = p2mt,
>> +        .map_pci_bridge = false
>> +    };
>>       int ret;
>>   
>>       if ( dt_device_type_is_equal(dev, "pci") )
>> @@ -1582,7 +1585,11 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>       /* Give permission and map MMIOs */
>>       for ( i = 0; i < naddr; i++ )
>>       {
>> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>> +        struct map_range_data mr_data = {
>> +            .d = d,
>> +            .p2mt = p2mt,
>> +            .map_pci_bridge = false
>> +        };
>>           res = dt_device_get_address(dev, i, &addr, &size);
>>           if ( res )
>>           {
>> @@ -2754,7 +2761,14 @@ static int __init construct_dom0(struct domain *d)
>>           return rc;
>>   
>>       if ( acpi_disabled )
>> +    {
>>           rc = prepare_dtb_hwdom(d, &kinfo);
>> +#ifdef CONFIG_HAS_PCI
>> +        if ( rc < 0 )
>> +            return rc;
> This doesn't look great :-)
>
> I would move the call to pci_host_bridge_mappings() below just before
> construct_domain.

I put it there for purpose: currently we only support device-tree and

ACPI is not covered, e.g. pci_host_bridge_mappings is implemented

with device-tree in mind. So, I decided to tie it to prepare_dtb_hwdom

which is called when acpi_disabled is true.

>
>
>> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
>> +#endif
>> +    }
>>       else
>>           rc = prepare_acpi(d, &kinfo);
>>   
>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>> index 9b88b1cedaa2..eae177f2cbc2 100644
>> --- a/xen/arch/arm/pci/ecam.c
>> +++ b/xen/arch/arm/pci/ecam.c
>> @@ -39,6 +39,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>       return base + (PCI_DEVFN2(sbdf) << devfn_shift) + where;
>>   }
>>   
>> +bool pci_ecam_need_p2m_mapping(struct domain *d,
>> +                               struct pci_host_bridge *bridge,
>> +                               uint64_t addr)
>> +{
>> +    struct pci_config_window *cfg = bridge->cfg;
>> +
>> +    /*
>> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
>> +     * so we can trap access to it.
>> +     */
>> +    return cfg->phys_addr != addr;
>> +}
>> +
>>   /* ECAM ops */
>>   const struct pci_ecam_ops pci_generic_ecam_ops = {
>>       .bus_shift  = 20,
>> @@ -46,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>           .map_bus                = pci_ecam_map_bus,
>>           .read                   = pci_generic_config_read,
>>           .write                  = pci_generic_config_write,
>> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
>>       }
>>   };
>>   
>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index 155f2a2743af..f350826ea26b 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -18,6 +18,7 @@
>>   
>>   #include <xen/init.h>
>>   #include <xen/pci.h>
>> +#include <asm/setup.h>
>>   #include <xen/rwlock.h>
>>   #include <xen/sched.h>
>>   #include <xen/vmap.h>
>> @@ -328,6 +329,53 @@ int pci_host_get_num_bridges(void)
>>       return count;
>>   }
>>   
>> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    struct map_range_data mr_data = {
>> +        .d = d,
>> +        .p2mt = p2mt,
>> +        .map_pci_bridge = true
>> +    };
>> +
>> +    /*
>> +     * For each PCI host bridge we need to only map those ranges
>> +     * which are used by Domain-0 to properly initialize the bridge,
>> +     * e.g. we do not want to map ECAM configuration space which lives in
>> +     * "reg" or "assigned-addresses" device tree property.
>> +     * Neither we want to map any of the MMIO ranges found in the "ranges"
>> +     * device tree property.
>> +     */
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +    {
>> +        const struct dt_device_node *dev = bridge->dt_node;
>> +        int i;
> i should be unsigned int
Ok
>
>
>> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
>> +        {
>> +            uint64_t addr, size;
>> +            int err;
>> +
>> +            err = dt_device_get_address(dev, i, &addr, &size);
>> +            if ( err )
>> +            {
>> +                printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
> Maybe rephrase it to:
>
>      Unable to retrieve address range index=%u for %s
This is a copy-paste from the original code, but ok
>
>
>> +                       i, dt_node_full_name(dev));
>> +                return err;
>> +            }
>> +
>> +            if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )
> The current implementation of need_p2m_mapping filters out ECAM and
> nothing else. Just double-checking here: is there anything else we
> should filter out? Looking at the device tree pci node for ZynqMP:
>
> 			reg = <0x0 0xfd0e0000 0x0 0x1000 0x0 0xfd480000 0x0 0x1000 0x80 0x0 0x0 0x1000000>;
> 			reg-names = "breg", "pcireg", "cfg";
>
> We are filtering out cfg, but do we need both "breg" and "pcireg" here?

It is vice versa: we are filtering out cfg only and all the rest are "unknown regions we do not

want to alter".

>
> If not, do we need another function like .cfg_reg_index to know what we
> actually need to map?

I was thinking to use .cfg_reg_index fir that, but it means I'll need to traverse

the device-tree to get the value for .cfg_reg_index which is already known

to the bridge. So, it is cheaper to have a callback and just check that

cfg->phys_addr != addr, e.g. what we want to map is not cfg.

>
>
>> +            {
>> +                err = map_range_to_domain(dev, addr, size, &mr_data);
>> +                if ( err )
>> +                    return err;
>> +            }
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
>> index c27b4ea9f02f..adbe3627871f 100644
>> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
>> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
>> @@ -33,6 +33,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>>           .map_bus                = pci_ecam_map_bus,
>>           .read                   = pci_generic_config_read,
>>           .write                  = pci_generic_config_write,
>> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
>>       }
>>   };
>>   
>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>> index 7618f0b6725b..b81f66e813ef 100644
>> --- a/xen/include/asm-arm/pci.h
>> +++ b/xen/include/asm-arm/pci.h
>> @@ -19,6 +19,8 @@
>>   
>>   #ifdef CONFIG_HAS_PCI
>>   
>> +#include <asm/p2m.h>
>> +
>>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>>   #define PRI_pci "%04x:%02x:%02x.%u"
>>   
>> @@ -79,6 +81,9 @@ struct pci_ops {
>>                   uint32_t reg, uint32_t len, uint32_t *value);
>>       int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
>>                    uint32_t reg, uint32_t len, uint32_t value);
>> +    bool (*need_p2m_mapping)(struct domain *d,
>> +                             struct pci_host_bridge *bridge,
>> +                             uint64_t addr);
> I would call this function: need_p2m_hwdom_mapping
Ok
>
>
>>   };
>>   
>>   /*
>> @@ -102,6 +107,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, uint32_t sbdf,
>>                               uint32_t reg, uint32_t len, uint32_t value);
>>   void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>                                  uint32_t sbdf, uint32_t where);
>> +bool pci_ecam_need_p2m_mapping(struct domain *d,
>> +                               struct pci_host_bridge *bridge,
>> +                               uint64_t addr);
>>   struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>>   int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>                                   uint16_t *segment);
>> @@ -116,6 +124,7 @@ int pci_host_iterate_bridges(struct domain *d,
>>                                int (*clb)(struct domain *d,
>>                                           struct pci_host_bridge *bridge));
>>   int pci_host_get_num_bridges(void);
>> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
>>   #else   /*!CONFIG_HAS_PCI*/
>>   
>>   #define pci_passthrough_enabled (false)
>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>> index 95da0b7ab9cd..21863dd2bc58 100644
>> --- a/xen/include/asm-arm/setup.h
>> +++ b/xen/include/asm-arm/setup.h
>> @@ -2,6 +2,8 @@
>>   #define __ARM_SETUP_H_
>>   
>>   #include <public/version.h>
>> +#include <asm/p2m.h>
>> +#include <xen/device_tree.h>
>>   
>>   #define MIN_FDT_ALIGN 8
>>   #define MAX_FDT_SIZE SZ_2M
>> @@ -77,6 +79,14 @@ struct bootinfo {
>>   #endif
>>   };
>>   
>> +struct map_range_data
>> +{
>> +    struct domain *d;
>> +    p2m_type_t p2mt;
>> +    /* Set if mappings for PCI host bridges must not be skipped. */
>> +    bool map_pci_bridge;
> To make this more generally applicable, I would call the new property:
>
>    bool skip_mapping;
Sounds good
>
> and it could apply to any class of devices. All current users would set
> it to false except for pci_host_bridge_mappings.
Please see PCI special case above
>
>
>> +};
>>
>>   extern struct bootinfo bootinfo;
>>   
>>   extern domid_t max_init_domid;
>> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>>   u32 device_tree_get_u32(const void *fdt, int node,
>>                           const char *prop_name, u32 dflt);
>>   
>> +int map_range_to_domain(const struct dt_device_node *dev,
>> +                        u64 addr, u64 len, void *data);
>> +
>>   #endif
>>   /*
>>    * Local variables:
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported
  2021-09-25  0:06   ` Stefano Stabellini
@ 2021-09-27 13:02     ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-27 13:02 UTC (permalink / raw)
  To: Stefano Stabellini, Ian Jackson, Juergen Gross
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko, jbeulich, andrew.cooper3


On 25.09.21 03:06, Stefano Stabellini wrote:
> On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> Arm's PCI passthrough implementation doesn't support legacy interrupts,
>> but MSI/MSI-X. This can be the case for other platforms too.
>> For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
>> it to the CFLAGS and compile the relevant code in the toolstack only if
>> applicable.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> Cc: Ian Jackson <iwj@xenproject.org>
>> Cc: Juergen Gross <jgross@suse.com>
>>
>> ---
>> Since v1:
>>   - Minimized #idefery by introducing pci_supp_legacy_irq function
>>     for relevant checks
>> ---
>>   tools/libs/light/Makefile    |  4 ++++
>>   tools/libs/light/libxl_pci.c | 13 +++++++++++++
>>   2 files changed, 17 insertions(+)
>>
>> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
>> index 7d8c51d49242..bd3f6be2a183 100644
>> --- a/tools/libs/light/Makefile
>> +++ b/tools/libs/light/Makefile
>> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
>>   	-Wno-declaration-after-statement -Wformat-nonliteral
>>   CFLAGS += -I.
>>   
>> +ifeq ($(CONFIG_X86),y)
>> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
>> +endif
> This patch is a lot better than the previous version, thanks!
>
> I think the usage of pci_supp_legacy_irq below is good and we can't do
> better than that.
>
> I wonder if there is a better way than the above to export
> CONFIG_PCI_SUPP_LEGACY_IRQ. Suggestions?

I see nothing bad doing it this way, maybe Ian or Juergen can tell

if this is acceptable?

>
>
>>   SRCS-$(CONFIG_X86) += libxl_cpuid.c
>>   SRCS-$(CONFIG_X86) += libxl_x86.c
>>   SRCS-$(CONFIG_X86) += libxl_psr.c
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 59f3686fc85e..4c2d7aeefbb2 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1364,6 +1364,15 @@ static void pci_add_timeout(libxl__egc *egc, libxl__ev_time *ev,
>>       pci_add_dm_done(egc, pas, rc);
>>   }
>>   
>> +static bool pci_supp_legacy_irq(void)
>> +{
>> +#ifdef CONFIG_PCI_SUPP_LEGACY_IRQ
>> +    return true;
>> +#else
>> +    return false;
>> +#endif
>> +}
>> +
>>   static void pci_add_dm_done(libxl__egc *egc,
>>                               pci_add_state *pas,
>>                               int rc)
>> @@ -1434,6 +1443,8 @@ static void pci_add_dm_done(libxl__egc *egc,
>>           }
>>       }
>>       fclose(f);
>> +    if (!pci_supp_legacy_irq())
>> +        goto out_no_irq;
>>       sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>>                                   pci->bus, pci->dev, pci->func);
>>       f = fopen(sysfs_path, "r");
>> @@ -1983,6 +1994,8 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>>           }
>>           fclose(f);
>>   skip1:
>> +        if (!pci_supp_legacy_irq())
>> +            goto skip_irq;
>>           sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>>                                  pci->bus, pci->dev, pci->func);
>>           f = fopen(sysfs_path, "r");
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-27 12:44     ` Oleksandr Andrushchenko
@ 2021-09-28  4:00       ` Stefano Stabellini
  2021-09-28  4:51         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-28  4:00 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: Stefano Stabellini, xen-devel, julien, Oleksandr Tyshchenko,
	Volodymyr Babchuk, Artem Mygaiev, roger.pau, Bertrand Marquis,
	Rahul Singh

[-- Attachment #1: Type: text/plain, Size: 19603 bytes --]

On Mon, 27 Sep 2021, Oleksandr Andrushchenko wrote:
> On 25.09.21 03:44, Stefano Stabellini wrote:
> > On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
> >> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>
> >> PCI host bridges are special devices in terms of implementing PCI
> >> passthrough. According to [1] the current implementation depends on
> >> Domain-0 to perform the initialization of the relevant PCI host
> >> bridge hardware and perform PCI device enumeration. In order to
> >> achieve that one of the required changes is to not map all the memory
> >> ranges in map_range_to_domain as we traverse the device tree on startup
> >> and perform some additional checks if the range needs to be mapped to
> >> Domain-0.
> >>
> >> The generic PCI host controller device tree binding says [2]:
> >> - ranges: As described in IEEE Std 1275-1994, but must provide
> >>            at least a definition of non-prefetchable memory. One
> >>            or both of prefetchable Memory and IO Space may also
> >>            be provided.
> >>
> >> - reg   : The Configuration Space base address and size, as accessed
> >>            from the parent bus.  The base address corresponds to
> >>            the first bus in the "bus-range" property.  If no
> >>            "bus-range" is specified, this will be bus 0 (the default).
> >>
> >> >From the above none of the memory ranges from the "ranges" property
> >> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
> >> be handled dynamically by vPCI as we assign PCI devices, e.g. each
> >> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
> >> as needed by Xen.
> >>
> >> The "reg" property covers not only ECAM space, but may also have other
> >> then the configuration memory ranges described, for example [3]:
> >> - reg: Should contain rc_dbi, config registers location and length.
> >> - reg-names: Must include the following entries:
> >>     "rc_dbi": controller configuration registers;
> >>     "config": PCIe configuration space registers.
> >>
> >> This patch makes it possible to not map all the ranges from the
> >> "ranges" property and also ECAM from the "reg". All the rest from the
> >> "reg" property still needs to be mapped to Domain-0, so the PCI
> >> host bridge remains functional in Domain-0.
> >>
> >> [1] https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT1_ou4InSg$ [lists[.]xenproject[.]org]
> >> [2] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT1_T5yn7GA$ [kernel[.]org]
> >> [3] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT18im_Y2tw$ [kernel[.]org]
> >>
> >> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>
> >> ---
> >> Since v1:
> >>   - Added better description of why and what needs to be mapped into
> >>     Domain-0's p2m and what doesn't
> >>   - Do not do any mappings for PCI devices while traversing the DT
> >>   - Walk all the bridges and make required mappings in one go
> >> ---
> >>   xen/arch/arm/domain_build.c        | 38 +++++++++++++++--------
> >>   xen/arch/arm/pci/ecam.c            | 14 +++++++++
> >>   xen/arch/arm/pci/pci-host-common.c | 48 ++++++++++++++++++++++++++++++
> >>   xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
> >>   xen/include/asm-arm/pci.h          |  9 ++++++
> >>   xen/include/asm-arm/setup.h        | 13 ++++++++
> >>   6 files changed, 111 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >> index 83ab0d52cce9..e72c1b881cae 100644
> >> --- a/xen/arch/arm/domain_build.c
> >> +++ b/xen/arch/arm/domain_build.c
> >> @@ -10,7 +10,6 @@
> >>   #include <asm/regs.h>
> >>   #include <xen/errno.h>
> >>   #include <xen/err.h>
> >> -#include <xen/device_tree.h>
> >>   #include <xen/libfdt/libfdt.h>
> >>   #include <xen/guest_access.h>
> >>   #include <xen/iocap.h>
> >> @@ -47,12 +46,6 @@ static int __init parse_dom0_mem(const char *s)
> >>   }
> >>   custom_param("dom0_mem", parse_dom0_mem);
> >>   
> >> -struct map_range_data
> >> -{
> >> -    struct domain *d;
> >> -    p2m_type_t p2mt;
> >> -};
> >> -
> >>   /* Override macros from asm/page.h to make them work with mfn_t */
> >>   #undef virt_to_mfn
> >>   #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> >> @@ -1388,9 +1381,8 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
> >>       return 0;
> >>   }
> >>   
> >> -static int __init map_range_to_domain(const struct dt_device_node *dev,
> >> -                                      u64 addr, u64 len,
> >> -                                      void *data)
> >> +int __init map_range_to_domain(const struct dt_device_node *dev,
> >> +                               u64 addr, u64 len, void *data)
> >>   {
> >>       struct map_range_data *mr_data = data;
> >>       struct domain *d = mr_data->d;
> >> @@ -1417,6 +1409,13 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
> >>           }
> >>       }
> >>   
> >> +#ifdef CONFIG_HAS_PCI
> >> +    if ( is_pci_passthrough_enabled() &&
> >> +         (device_get_class(dev) == DEVICE_PCI) &&
> >> +         !mr_data->map_pci_bridge )
> >> +        need_mapping = false;
> >> +#endif
> > With the change I suggested below turning map_pci_bridge into
> > skip_mapping, then this check could go away if we just set need_mapping
> > as follows:
> >
> > bool need_mapping = !dt_device_for_passthrough(dev) &&
> >                      !mr_data->skip_mapping;
> 
> Not exactly. This check, e.g.
> 
> "is_pci_passthrough_enabled() && (device_get_class(dev) == DEVICE_PCI)"
> 
> really protects us from mapping any of the ranges belonging to a PCI device:
> 
> we scan the device tree and for each node we call map_range_to_domain
> 
> with skip_mapping == false (it is called from map_device_children).
> 
> So, if there is no check then the mapping is performed even for PCI devices
> 
> which we do not want.
> 
> But, yes we can simplify the logic to:
> 
> bool need_mapping = !dt_device_for_passthrough(dev) &&
>                      !mr_data->skip_mapping;
> 
> #ifdef CONFIG_HAS_PCI
>      if ( need_mapping && is_pci_passthrough_enabled() &&
>           (device_get_class(dev) == DEVICE_PCI) )
>          need_mapping = false;
> #endif
> 
> but I see no big profit from it.

Sorry I didn't follow your explanation.

My suggestion is to remove the #ifdef CONFIG_HAS_PCI completely from
map_range_to_domain. At the beginning of map_range_to_domain, there is
already this line:

bool need_mapping = !dt_device_for_passthrough(dev);

We can change it into:

bool need_mapping = !dt_device_for_passthrough(dev) &&
                    !mr_data->skip_mapping;


Then, in map_device_children we can set mr_data->skip_mapping to true
for PCI devices. There is already a pci check there:

 if ( dt_device_type_is_equal(dev, "pci") )

so it should be easy to do. What am I missing?



> >>       if ( need_mapping )
> >>       {
> >>           res = map_regions_p2mt(d,
> >> @@ -1450,7 +1449,11 @@ static int __init map_device_children(struct domain *d,
> >>                                         const struct dt_device_node *dev,
> >>                                         p2m_type_t p2mt)
> >>   {
> >> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >> +    struct map_range_data mr_data = {
> >> +        .d = d,
> >> +        .p2mt = p2mt,
> >> +        .map_pci_bridge = false
> >> +    };
> >>       int ret;
> >>   
> >>       if ( dt_device_type_is_equal(dev, "pci") )
> >> @@ -1582,7 +1585,11 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>       /* Give permission and map MMIOs */
> >>       for ( i = 0; i < naddr; i++ )
> >>       {
> >> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >> +        struct map_range_data mr_data = {
> >> +            .d = d,
> >> +            .p2mt = p2mt,
> >> +            .map_pci_bridge = false
> >> +        };
> >>           res = dt_device_get_address(dev, i, &addr, &size);
> >>           if ( res )
> >>           {
> >> @@ -2754,7 +2761,14 @@ static int __init construct_dom0(struct domain *d)
> >>           return rc;
> >>   
> >>       if ( acpi_disabled )
> >> +    {
> >>           rc = prepare_dtb_hwdom(d, &kinfo);
> >> +#ifdef CONFIG_HAS_PCI
> >> +        if ( rc < 0 )
> >> +            return rc;
> > This doesn't look great :-)
> >
> > I would move the call to pci_host_bridge_mappings() below just before
> > construct_domain.
> 
> I put it there for purpose: currently we only support device-tree and
> 
> ACPI is not covered, e.g. pci_host_bridge_mappings is implemented
> 
> with device-tree in mind. So, I decided to tie it to prepare_dtb_hwdom
> 
> which is called when acpi_disabled is true.

That's OK, I don't mind. My comment was purely "code aesthetic". I think
this would look better:

    if ( acpi_disabled )
        rc = prepare_dtb_hwdom(d, &kinfo);
    else
        rc = prepare_acpi(d, &kinfo);

    if ( rc < 0 )
        return rc;

#ifdef CONFIG_HAS_PCI
    if ( acpi_disabled )
        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
    if ( rc < 0 )
        return rc;
#endif


Or even this would look better:

    if ( acpi_disabled )
    {
        rc = prepare_dtb_hwdom(d, &kinfo);
        if ( rc < 0 )
            return rc;
#ifdef CONFIG_HAS_PCI
        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
#endif
    } else
        rc = prepare_acpi(d, &kinfo);

    if ( rc < 0 )
        return rc;


Given that this comment is not about functionality but only about how
the code looks like I won't insist.



> >> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
> >> +#endif
> >> +    }
> >>       else
> >>           rc = prepare_acpi(d, &kinfo);
> >>   
> >> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> >> index 9b88b1cedaa2..eae177f2cbc2 100644
> >> --- a/xen/arch/arm/pci/ecam.c
> >> +++ b/xen/arch/arm/pci/ecam.c
> >> @@ -39,6 +39,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> >>       return base + (PCI_DEVFN2(sbdf) << devfn_shift) + where;
> >>   }
> >>   
> >> +bool pci_ecam_need_p2m_mapping(struct domain *d,
> >> +                               struct pci_host_bridge *bridge,
> >> +                               uint64_t addr)
> >> +{
> >> +    struct pci_config_window *cfg = bridge->cfg;
> >> +
> >> +    /*
> >> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
> >> +     * so we can trap access to it.
> >> +     */
> >> +    return cfg->phys_addr != addr;
> >> +}
> >> +
> >>   /* ECAM ops */
> >>   const struct pci_ecam_ops pci_generic_ecam_ops = {
> >>       .bus_shift  = 20,
> >> @@ -46,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
> >>           .map_bus                = pci_ecam_map_bus,
> >>           .read                   = pci_generic_config_read,
> >>           .write                  = pci_generic_config_write,
> >> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
> >>       }
> >>   };
> >>   
> >> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> >> index 155f2a2743af..f350826ea26b 100644
> >> --- a/xen/arch/arm/pci/pci-host-common.c
> >> +++ b/xen/arch/arm/pci/pci-host-common.c
> >> @@ -18,6 +18,7 @@
> >>   
> >>   #include <xen/init.h>
> >>   #include <xen/pci.h>
> >> +#include <asm/setup.h>
> >>   #include <xen/rwlock.h>
> >>   #include <xen/sched.h>
> >>   #include <xen/vmap.h>
> >> @@ -328,6 +329,53 @@ int pci_host_get_num_bridges(void)
> >>       return count;
> >>   }
> >>   
> >> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
> >> +{
> >> +    struct pci_host_bridge *bridge;
> >> +    struct map_range_data mr_data = {
> >> +        .d = d,
> >> +        .p2mt = p2mt,
> >> +        .map_pci_bridge = true
> >> +    };
> >> +
> >> +    /*
> >> +     * For each PCI host bridge we need to only map those ranges
> >> +     * which are used by Domain-0 to properly initialize the bridge,
> >> +     * e.g. we do not want to map ECAM configuration space which lives in
> >> +     * "reg" or "assigned-addresses" device tree property.
> >> +     * Neither we want to map any of the MMIO ranges found in the "ranges"
> >> +     * device tree property.
> >> +     */
> >> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> >> +    {
> >> +        const struct dt_device_node *dev = bridge->dt_node;
> >> +        int i;
> > i should be unsigned int
> Ok
> >
> >
> >> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
> >> +        {
> >> +            uint64_t addr, size;
> >> +            int err;
> >> +
> >> +            err = dt_device_get_address(dev, i, &addr, &size);
> >> +            if ( err )
> >> +            {
> >> +                printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
> > Maybe rephrase it to:
> >
> >      Unable to retrieve address range index=%u for %s
> This is a copy-paste from the original code, but ok
> >
> >
> >> +                       i, dt_node_full_name(dev));
> >> +                return err;
> >> +            }
> >> +
> >> +            if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )
> > The current implementation of need_p2m_mapping filters out ECAM and
> > nothing else. Just double-checking here: is there anything else we
> > should filter out? Looking at the device tree pci node for ZynqMP:
> >
> > 			reg = <0x0 0xfd0e0000 0x0 0x1000 0x0 0xfd480000 0x0 0x1000 0x80 0x0 0x0 0x1000000>;
> > 			reg-names = "breg", "pcireg", "cfg";
> >
> > We are filtering out cfg, but do we need both "breg" and "pcireg" here?
> 
> It is vice versa: we are filtering out cfg only and all the rest are "unknown regions we do not
> 
> want to alter".

Ah, OK. Can you please add a note about this to the in-code comment?
Maybe as follows:

For each PCI host bridge we need to only map those ranges which are used
by Domain-0 to properly initialize the bridge, e.g. we do not want to
map ECAM configuration space which lives in "reg" or
"assigned-addresses" device tree property, but we want to map other
regions of the host bridge. The PCI aperture defined by the "ranges"
device tree property should also be skipped.


> >
> > If not, do we need another function like .cfg_reg_index to know what we
> > actually need to map?
> 
> I was thinking to use .cfg_reg_index fir that, but it means I'll need to traverse
> 
> the device-tree to get the value for .cfg_reg_index which is already known
> 
> to the bridge. So, it is cheaper to have a callback and just check that
> 
> cfg->phys_addr != addr, e.g. what we want to map is not cfg.

Yeah that makes sense


> >> +            {
> >> +                err = map_range_to_domain(dev, addr, size, &mr_data);
> >> +                if ( err )
> >> +                    return err;
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    return 0;
> >> +}
> >> +
> >>   /*
> >>    * Local variables:
> >>    * mode: C
> >> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
> >> index c27b4ea9f02f..adbe3627871f 100644
> >> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
> >> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
> >> @@ -33,6 +33,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
> >>           .map_bus                = pci_ecam_map_bus,
> >>           .read                   = pci_generic_config_read,
> >>           .write                  = pci_generic_config_write,
> >> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
> >>       }
> >>   };
> >>   
> >> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> >> index 7618f0b6725b..b81f66e813ef 100644
> >> --- a/xen/include/asm-arm/pci.h
> >> +++ b/xen/include/asm-arm/pci.h
> >> @@ -19,6 +19,8 @@
> >>   
> >>   #ifdef CONFIG_HAS_PCI
> >>   
> >> +#include <asm/p2m.h>
> >> +
> >>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
> >>   #define PRI_pci "%04x:%02x:%02x.%u"
> >>   
> >> @@ -79,6 +81,9 @@ struct pci_ops {
> >>                   uint32_t reg, uint32_t len, uint32_t *value);
> >>       int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
> >>                    uint32_t reg, uint32_t len, uint32_t value);
> >> +    bool (*need_p2m_mapping)(struct domain *d,
> >> +                             struct pci_host_bridge *bridge,
> >> +                             uint64_t addr);
> > I would call this function: need_p2m_hwdom_mapping
> Ok
> >
> >
> >>   };
> >>   
> >>   /*
> >> @@ -102,6 +107,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, uint32_t sbdf,
> >>                               uint32_t reg, uint32_t len, uint32_t value);
> >>   void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> >>                                  uint32_t sbdf, uint32_t where);
> >> +bool pci_ecam_need_p2m_mapping(struct domain *d,
> >> +                               struct pci_host_bridge *bridge,
> >> +                               uint64_t addr);
> >>   struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
> >>   int pci_get_host_bridge_segment(const struct dt_device_node *node,
> >>                                   uint16_t *segment);
> >> @@ -116,6 +124,7 @@ int pci_host_iterate_bridges(struct domain *d,
> >>                                int (*clb)(struct domain *d,
> >>                                           struct pci_host_bridge *bridge));
> >>   int pci_host_get_num_bridges(void);
> >> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
> >>   #else   /*!CONFIG_HAS_PCI*/
> >>   
> >>   #define pci_passthrough_enabled (false)
> >> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> >> index 95da0b7ab9cd..21863dd2bc58 100644
> >> --- a/xen/include/asm-arm/setup.h
> >> +++ b/xen/include/asm-arm/setup.h
> >> @@ -2,6 +2,8 @@
> >>   #define __ARM_SETUP_H_
> >>   
> >>   #include <public/version.h>
> >> +#include <asm/p2m.h>
> >> +#include <xen/device_tree.h>
> >>   
> >>   #define MIN_FDT_ALIGN 8
> >>   #define MAX_FDT_SIZE SZ_2M
> >> @@ -77,6 +79,14 @@ struct bootinfo {
> >>   #endif
> >>   };
> >>   
> >> +struct map_range_data
> >> +{
> >> +    struct domain *d;
> >> +    p2m_type_t p2mt;
> >> +    /* Set if mappings for PCI host bridges must not be skipped. */
> >> +    bool map_pci_bridge;
> > To make this more generally applicable, I would call the new property:
> >
> >    bool skip_mapping;
> Sounds good
> >
> > and it could apply to any class of devices. All current users would set
> > it to false except for pci_host_bridge_mappings.
> Please see PCI special case above
> >
> >
> >> +};
> >>
> >>   extern struct bootinfo bootinfo;
> >>   
> >>   extern domid_t max_init_domid;
> >> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
> >>   u32 device_tree_get_u32(const void *fdt, int node,
> >>                           const char *prop_name, u32 dflt);
> >>   
> >> +int map_range_to_domain(const struct dt_device_node *dev,
> >> +                        u64 addr, u64 len, void *data);
> >> +
> >>   #endif
> >>   /*
> >>    * Local variables:
> >> -- 
> >> 2.25.1
> >>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-28  4:00       ` Stefano Stabellini
@ 2021-09-28  4:51         ` Oleksandr Andrushchenko
  2021-09-28 14:39           ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-28  4:51 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko


On 28.09.21 07:00, Stefano Stabellini wrote:
> On Mon, 27 Sep 2021, Oleksandr Andrushchenko wrote:
>> On 25.09.21 03:44, Stefano Stabellini wrote:
>>> On Thu, 23 Sep 2021, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> PCI host bridges are special devices in terms of implementing PCI
>>>> passthrough. According to [1] the current implementation depends on
>>>> Domain-0 to perform the initialization of the relevant PCI host
>>>> bridge hardware and perform PCI device enumeration. In order to
>>>> achieve that one of the required changes is to not map all the memory
>>>> ranges in map_range_to_domain as we traverse the device tree on startup
>>>> and perform some additional checks if the range needs to be mapped to
>>>> Domain-0.
>>>>
>>>> The generic PCI host controller device tree binding says [2]:
>>>> - ranges: As described in IEEE Std 1275-1994, but must provide
>>>>             at least a definition of non-prefetchable memory. One
>>>>             or both of prefetchable Memory and IO Space may also
>>>>             be provided.
>>>>
>>>> - reg   : The Configuration Space base address and size, as accessed
>>>>             from the parent bus.  The base address corresponds to
>>>>             the first bus in the "bus-range" property.  If no
>>>>             "bus-range" is specified, this will be bus 0 (the default).
>>>>
>>>> >From the above none of the memory ranges from the "ranges" property
>>>> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
>>>> be handled dynamically by vPCI as we assign PCI devices, e.g. each
>>>> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
>>>> as needed by Xen.
>>>>
>>>> The "reg" property covers not only ECAM space, but may also have other
>>>> then the configuration memory ranges described, for example [3]:
>>>> - reg: Should contain rc_dbi, config registers location and length.
>>>> - reg-names: Must include the following entries:
>>>>      "rc_dbi": controller configuration registers;
>>>>      "config": PCIe configuration space registers.
>>>>
>>>> This patch makes it possible to not map all the ranges from the
>>>> "ranges" property and also ECAM from the "reg". All the rest from the
>>>> "reg" property still needs to be mapped to Domain-0, so the PCI
>>>> host bridge remains functional in Domain-0.
>>>>
>>>> [1] https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT1_ou4InSg$ [lists[.]xenproject[.]org]
>>>> [2] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT1_T5yn7GA$ [kernel[.]org]
>>>> [3] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt__;!!GF_29dbcQIUBPA!lrCuNRzUVkRf4FLgp3hW-4uOldgKr4qNpZb_ufI0jW-O0eRH11VFDeGcs5pPdtKZT18im_Y2tw$ [kernel[.]org]
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> ---
>>>> Since v1:
>>>>    - Added better description of why and what needs to be mapped into
>>>>      Domain-0's p2m and what doesn't
>>>>    - Do not do any mappings for PCI devices while traversing the DT
>>>>    - Walk all the bridges and make required mappings in one go
>>>> ---
>>>>    xen/arch/arm/domain_build.c        | 38 +++++++++++++++--------
>>>>    xen/arch/arm/pci/ecam.c            | 14 +++++++++
>>>>    xen/arch/arm/pci/pci-host-common.c | 48 ++++++++++++++++++++++++++++++
>>>>    xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>>>>    xen/include/asm-arm/pci.h          |  9 ++++++
>>>>    xen/include/asm-arm/setup.h        | 13 ++++++++
>>>>    6 files changed, 111 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>> index 83ab0d52cce9..e72c1b881cae 100644
>>>> --- a/xen/arch/arm/domain_build.c
>>>> +++ b/xen/arch/arm/domain_build.c
>>>> @@ -10,7 +10,6 @@
>>>>    #include <asm/regs.h>
>>>>    #include <xen/errno.h>
>>>>    #include <xen/err.h>
>>>> -#include <xen/device_tree.h>
>>>>    #include <xen/libfdt/libfdt.h>
>>>>    #include <xen/guest_access.h>
>>>>    #include <xen/iocap.h>
>>>> @@ -47,12 +46,6 @@ static int __init parse_dom0_mem(const char *s)
>>>>    }
>>>>    custom_param("dom0_mem", parse_dom0_mem);
>>>>    
>>>> -struct map_range_data
>>>> -{
>>>> -    struct domain *d;
>>>> -    p2m_type_t p2mt;
>>>> -};
>>>> -
>>>>    /* Override macros from asm/page.h to make them work with mfn_t */
>>>>    #undef virt_to_mfn
>>>>    #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
>>>> @@ -1388,9 +1381,8 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>>>        return 0;
>>>>    }
>>>>    
>>>> -static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>> -                                      u64 addr, u64 len,
>>>> -                                      void *data)
>>>> +int __init map_range_to_domain(const struct dt_device_node *dev,
>>>> +                               u64 addr, u64 len, void *data)
>>>>    {
>>>>        struct map_range_data *mr_data = data;
>>>>        struct domain *d = mr_data->d;
>>>> @@ -1417,6 +1409,13 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>>            }
>>>>        }
>>>>    
>>>> +#ifdef CONFIG_HAS_PCI
>>>> +    if ( is_pci_passthrough_enabled() &&
>>>> +         (device_get_class(dev) == DEVICE_PCI) &&
>>>> +         !mr_data->map_pci_bridge )
>>>> +        need_mapping = false;
>>>> +#endif
>>> With the change I suggested below turning map_pci_bridge into
>>> skip_mapping, then this check could go away if we just set need_mapping
>>> as follows:
>>>
>>> bool need_mapping = !dt_device_for_passthrough(dev) &&
>>>                       !mr_data->skip_mapping;
>> Not exactly. This check, e.g.
>>
>> "is_pci_passthrough_enabled() && (device_get_class(dev) == DEVICE_PCI)"
>>
>> really protects us from mapping any of the ranges belonging to a PCI device:
>>
>> we scan the device tree and for each node we call map_range_to_domain
>>
>> with skip_mapping == false (it is called from map_device_children).
>>
>> So, if there is no check then the mapping is performed even for PCI devices
>>
>> which we do not want.
>>
>> But, yes we can simplify the logic to:
>>
>> bool need_mapping = !dt_device_for_passthrough(dev) &&
>>                       !mr_data->skip_mapping;
>>
>> #ifdef CONFIG_HAS_PCI
>>       if ( need_mapping && is_pci_passthrough_enabled() &&
>>            (device_get_class(dev) == DEVICE_PCI) )
>>           need_mapping = false;
>> #endif
>>
>> but I see no big profit from it.
> Sorry I didn't follow your explanation.
>
> My suggestion is to remove the #ifdef CONFIG_HAS_PCI completely from
> map_range_to_domain. At the beginning of map_range_to_domain, there is
> already this line:
>
> bool need_mapping = !dt_device_for_passthrough(dev);
>
> We can change it into:
>
> bool need_mapping = !dt_device_for_passthrough(dev) &&
>                      !mr_data->skip_mapping;
>
>
> Then, in map_device_children we can set mr_data->skip_mapping to true
> for PCI devices.

This is the key. I am fine with this, but it just means we move the

check to the outside of this function which looks good. Will do

>   There is already a pci check there:
>
>   if ( dt_device_type_is_equal(dev, "pci") )
>
> so it should be easy to do. What am I missing?
>
>
>
>>>>        if ( need_mapping )
>>>>        {
>>>>            res = map_regions_p2mt(d,
>>>> @@ -1450,7 +1449,11 @@ static int __init map_device_children(struct domain *d,
>>>>                                          const struct dt_device_node *dev,
>>>>                                          p2m_type_t p2mt)
>>>>    {
>>>> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>>>> +    struct map_range_data mr_data = {
>>>> +        .d = d,
>>>> +        .p2mt = p2mt,
>>>> +        .map_pci_bridge = false
>>>> +    };
>>>>        int ret;
>>>>    
>>>>        if ( dt_device_type_is_equal(dev, "pci") )
>>>> @@ -1582,7 +1585,11 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>>>        /* Give permission and map MMIOs */
>>>>        for ( i = 0; i < naddr; i++ )
>>>>        {
>>>> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>>>> +        struct map_range_data mr_data = {
>>>> +            .d = d,
>>>> +            .p2mt = p2mt,
>>>> +            .map_pci_bridge = false
>>>> +        };
>>>>            res = dt_device_get_address(dev, i, &addr, &size);
>>>>            if ( res )
>>>>            {
>>>> @@ -2754,7 +2761,14 @@ static int __init construct_dom0(struct domain *d)
>>>>            return rc;
>>>>    
>>>>        if ( acpi_disabled )
>>>> +    {
>>>>            rc = prepare_dtb_hwdom(d, &kinfo);
>>>> +#ifdef CONFIG_HAS_PCI
>>>> +        if ( rc < 0 )
>>>> +            return rc;
>>> This doesn't look great :-)
>>>
>>> I would move the call to pci_host_bridge_mappings() below just before
>>> construct_domain.
>> I put it there for purpose: currently we only support device-tree and
>>
>> ACPI is not covered, e.g. pci_host_bridge_mappings is implemented
>>
>> with device-tree in mind. So, I decided to tie it to prepare_dtb_hwdom
>>
>> which is called when acpi_disabled is true.
> That's OK, I don't mind. My comment was purely "code aesthetic". I think
> this would look better:
>
>      if ( acpi_disabled )
>          rc = prepare_dtb_hwdom(d, &kinfo);
>      else
>          rc = prepare_acpi(d, &kinfo);
>
>      if ( rc < 0 )
>          return rc;
>
> #ifdef CONFIG_HAS_PCI
>      if ( acpi_disabled )
>          rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
>      if ( rc < 0 )
>          return rc;
> #endif
>
>
> Or even this would look better:
>
>      if ( acpi_disabled )
>      {
>          rc = prepare_dtb_hwdom(d, &kinfo);
>          if ( rc < 0 )
>              return rc;
> #ifdef CONFIG_HAS_PCI
>          rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
> #endif
>      } else
>          rc = prepare_acpi(d, &kinfo);
>
>      if ( rc < 0 )
>          return rc;
>
>
> Given that this comment is not about functionality but only about how
> the code looks like I won't insist.
Ok, I'll re-arrange the code
>
>
>
>>>> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
>>>> +#endif
>>>> +    }
>>>>        else
>>>>            rc = prepare_acpi(d, &kinfo);
>>>>    
>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>> index 9b88b1cedaa2..eae177f2cbc2 100644
>>>> --- a/xen/arch/arm/pci/ecam.c
>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>> @@ -39,6 +39,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>        return base + (PCI_DEVFN2(sbdf) << devfn_shift) + where;
>>>>    }
>>>>    
>>>> +bool pci_ecam_need_p2m_mapping(struct domain *d,
>>>> +                               struct pci_host_bridge *bridge,
>>>> +                               uint64_t addr)
>>>> +{
>>>> +    struct pci_config_window *cfg = bridge->cfg;
>>>> +
>>>> +    /*
>>>> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
>>>> +     * so we can trap access to it.
>>>> +     */
>>>> +    return cfg->phys_addr != addr;
>>>> +}
>>>> +
>>>>    /* ECAM ops */
>>>>    const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>>        .bus_shift  = 20,
>>>> @@ -46,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>>            .map_bus                = pci_ecam_map_bus,
>>>>            .read                   = pci_generic_config_read,
>>>>            .write                  = pci_generic_config_write,
>>>> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
>>>>        }
>>>>    };
>>>>    
>>>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>>> index 155f2a2743af..f350826ea26b 100644
>>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>>> @@ -18,6 +18,7 @@
>>>>    
>>>>    #include <xen/init.h>
>>>>    #include <xen/pci.h>
>>>> +#include <asm/setup.h>
>>>>    #include <xen/rwlock.h>
>>>>    #include <xen/sched.h>
>>>>    #include <xen/vmap.h>
>>>> @@ -328,6 +329,53 @@ int pci_host_get_num_bridges(void)
>>>>        return count;
>>>>    }
>>>>    
>>>> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
>>>> +{
>>>> +    struct pci_host_bridge *bridge;
>>>> +    struct map_range_data mr_data = {
>>>> +        .d = d,
>>>> +        .p2mt = p2mt,
>>>> +        .map_pci_bridge = true
>>>> +    };
>>>> +
>>>> +    /*
>>>> +     * For each PCI host bridge we need to only map those ranges
>>>> +     * which are used by Domain-0 to properly initialize the bridge,
>>>> +     * e.g. we do not want to map ECAM configuration space which lives in
>>>> +     * "reg" or "assigned-addresses" device tree property.
>>>> +     * Neither we want to map any of the MMIO ranges found in the "ranges"
>>>> +     * device tree property.
>>>> +     */
>>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>>> +    {
>>>> +        const struct dt_device_node *dev = bridge->dt_node;
>>>> +        int i;
>>> i should be unsigned int
>> Ok
>>>
>>>> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
>>>> +        {
>>>> +            uint64_t addr, size;
>>>> +            int err;
>>>> +
>>>> +            err = dt_device_get_address(dev, i, &addr, &size);
>>>> +            if ( err )
>>>> +            {
>>>> +                printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
>>> Maybe rephrase it to:
>>>
>>>       Unable to retrieve address range index=%u for %s
>> This is a copy-paste from the original code, but ok
>>>
>>>> +                       i, dt_node_full_name(dev));
>>>> +                return err;
>>>> +            }
>>>> +
>>>> +            if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )
>>> The current implementation of need_p2m_mapping filters out ECAM and
>>> nothing else. Just double-checking here: is there anything else we
>>> should filter out? Looking at the device tree pci node for ZynqMP:
>>>
>>> 			reg = <0x0 0xfd0e0000 0x0 0x1000 0x0 0xfd480000 0x0 0x1000 0x80 0x0 0x0 0x1000000>;
>>> 			reg-names = "breg", "pcireg", "cfg";
>>>
>>> We are filtering out cfg, but do we need both "breg" and "pcireg" here?
>> It is vice versa: we are filtering out cfg only and all the rest are "unknown regions we do not
>>
>> want to alter".
> Ah, OK. Can you please add a note about this to the in-code comment?
> Maybe as follows:
>
> For each PCI host bridge we need to only map those ranges which are used
> by Domain-0 to properly initialize the bridge, e.g. we do not want to
> map ECAM configuration space which lives in "reg" or
> "assigned-addresses" device tree property, but we want to map other
> regions of the host bridge. The PCI aperture defined by the "ranges"
> device tree property should also be skipped.
Sure, will add
>
>
>>> If not, do we need another function like .cfg_reg_index to know what we
>>> actually need to map?
>> I was thinking to use .cfg_reg_index fir that, but it means I'll need to traverse
>>
>> the device-tree to get the value for .cfg_reg_index which is already known
>>
>> to the bridge. So, it is cheaper to have a callback and just check that
>>
>> cfg->phys_addr != addr, e.g. what we want to map is not cfg.
> Yeah that makes sense
>
>
>>>> +            {
>>>> +                err = map_range_to_domain(dev, addr, size, &mr_data);
>>>> +                if ( err )
>>>> +                    return err;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>>    /*
>>>>     * Local variables:
>>>>     * mode: C
>>>> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
>>>> index c27b4ea9f02f..adbe3627871f 100644
>>>> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
>>>> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
>>>> @@ -33,6 +33,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>>>>            .map_bus                = pci_ecam_map_bus,
>>>>            .read                   = pci_generic_config_read,
>>>>            .write                  = pci_generic_config_write,
>>>> +        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
>>>>        }
>>>>    };
>>>>    
>>>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>>>> index 7618f0b6725b..b81f66e813ef 100644
>>>> --- a/xen/include/asm-arm/pci.h
>>>> +++ b/xen/include/asm-arm/pci.h
>>>> @@ -19,6 +19,8 @@
>>>>    
>>>>    #ifdef CONFIG_HAS_PCI
>>>>    
>>>> +#include <asm/p2m.h>
>>>> +
>>>>    #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>>>>    #define PRI_pci "%04x:%02x:%02x.%u"
>>>>    
>>>> @@ -79,6 +81,9 @@ struct pci_ops {
>>>>                    uint32_t reg, uint32_t len, uint32_t *value);
>>>>        int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
>>>>                     uint32_t reg, uint32_t len, uint32_t value);
>>>> +    bool (*need_p2m_mapping)(struct domain *d,
>>>> +                             struct pci_host_bridge *bridge,
>>>> +                             uint64_t addr);
>>> I would call this function: need_p2m_hwdom_mapping
>> Ok
>>>
>>>>    };
>>>>    
>>>>    /*
>>>> @@ -102,6 +107,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, uint32_t sbdf,
>>>>                                uint32_t reg, uint32_t len, uint32_t value);
>>>>    void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>                                   uint32_t sbdf, uint32_t where);
>>>> +bool pci_ecam_need_p2m_mapping(struct domain *d,
>>>> +                               struct pci_host_bridge *bridge,
>>>> +                               uint64_t addr);
>>>>    struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>>>>    int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>>>                                    uint16_t *segment);
>>>> @@ -116,6 +124,7 @@ int pci_host_iterate_bridges(struct domain *d,
>>>>                                 int (*clb)(struct domain *d,
>>>>                                            struct pci_host_bridge *bridge));
>>>>    int pci_host_get_num_bridges(void);
>>>> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
>>>>    #else   /*!CONFIG_HAS_PCI*/
>>>>    
>>>>    #define pci_passthrough_enabled (false)
>>>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>>>> index 95da0b7ab9cd..21863dd2bc58 100644
>>>> --- a/xen/include/asm-arm/setup.h
>>>> +++ b/xen/include/asm-arm/setup.h
>>>> @@ -2,6 +2,8 @@
>>>>    #define __ARM_SETUP_H_
>>>>    
>>>>    #include <public/version.h>
>>>> +#include <asm/p2m.h>
>>>> +#include <xen/device_tree.h>
>>>>    
>>>>    #define MIN_FDT_ALIGN 8
>>>>    #define MAX_FDT_SIZE SZ_2M
>>>> @@ -77,6 +79,14 @@ struct bootinfo {
>>>>    #endif
>>>>    };
>>>>    
>>>> +struct map_range_data
>>>> +{
>>>> +    struct domain *d;
>>>> +    p2m_type_t p2mt;
>>>> +    /* Set if mappings for PCI host bridges must not be skipped. */
>>>> +    bool map_pci_bridge;
>>> To make this more generally applicable, I would call the new property:
>>>
>>>     bool skip_mapping;
>> Sounds good
>>> and it could apply to any class of devices. All current users would set
>>> it to false except for pci_host_bridge_mappings.
>> Please see PCI special case above
>>>
>>>> +};
>>>>
>>>>    extern struct bootinfo bootinfo;
>>>>    
>>>>    extern domid_t max_init_domid;
>>>> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>>>>    u32 device_tree_get_u32(const void *fdt, int node,
>>>>                            const char *prop_name, u32 dflt);
>>>>    
>>>> +int map_range_to_domain(const struct dt_device_node *dev,
>>>> +                        u64 addr, u64 len, void *data);
>>>> +
>>>>    #endif
>>>>    /*
>>>>     * Local variables:
>>>> -- 
>>>> 2.25.1
>>> >

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition
  2021-09-23 12:54 ` [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
  2021-09-24 23:39   ` Stefano Stabellini
@ 2021-09-28  7:31   ` Michal Orzel
  1 sibling, 0 replies; 49+ messages in thread
From: Michal Orzel @ 2021-09-28  7:31 UTC (permalink / raw)
  To: Oleksandr Andrushchenko, xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko



On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> This macro is not currently used, but still has an error in it:
> a missing parenthesis. Fix this, so the macro is properly defined.
> 
> Fixes: 6c5d3075d97e ("xen/arm: Introduce a generic way to describe device")
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> ---
> New in v2
> ---
>  xen/include/asm-arm/device.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 582119c31ee0..64aaa2641b7f 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -29,7 +29,7 @@ typedef struct device device_t;
>  
>  /* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
>  #define dev_is_pci(dev) ((void)(dev), 0)
> -#define dev_is_dt(dev)  ((dev->type == DEV_DT)
> +#define dev_is_dt(dev)  ((dev)->type == DEV_DT)
>  
>  enum device_class
>  {
> 

Reviewed-by: Michal Orzel <michal.orzel@arm.com>


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-27 10:26               ` Jan Beulich
@ 2021-09-28  8:09                 ` Oleksandr Andrushchenko
  2021-09-28  8:26                   ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-28  8:09 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko, xen-devel


On 27.09.21 13:26, Jan Beulich wrote:
> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>> On 27.09.21 13:00, Jan Beulich wrote:
>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>          *((u8*) &pdev->bus) = bus;
>>>>>>>>          *((u8*) &pdev->devfn) = devfn;
>>>>>>>>          pdev->domain = NULL;
>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>> +#endif
>>>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>>>> obstacles I'm overlooking.
>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>> this is just an architecture specific part of the pdev allocation rather than
>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>>>> more natural to me.
>>> The bulk of the function is about populating the just allocated struct.
>>> There's no arch-specific part of the allocation (so far, leaving aside
>>> MSI-X), you only want and arch-specific part of the initialization. I
>>> would agree with "alloc" in the name if further allocation was to
>>> happen there.
>> Hm, then arch_pci_init_pdev sounds more reasonable
> Fine with me.

Do we want this to be void or returning an error code? If error code is needed,

then we would also need a roll-back function, e.g. arch_pci_free_pdev or

arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used in

case of error or in free_pdev function.

If so, then what's your preference on the name of that function?

>
> Jan
>
>
Thank you,

Oleksandr

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-28  8:09                 ` Oleksandr Andrushchenko
@ 2021-09-28  8:26                   ` Jan Beulich
  2021-09-28  8:29                     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-28  8:26 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel

On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
> 
> On 27.09.21 13:26, Jan Beulich wrote:
>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>          *((u8*) &pdev->bus) = bus;
>>>>>>>>>          *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>          pdev->domain = NULL;
>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>> +#endif
>>>>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>>>>> obstacles I'm overlooking.
>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>>> this is just an architecture specific part of the pdev allocation rather than
>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>>>>> more natural to me.
>>>> The bulk of the function is about populating the just allocated struct.
>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>> MSI-X), you only want and arch-specific part of the initialization. I
>>>> would agree with "alloc" in the name if further allocation was to
>>>> happen there.
>>> Hm, then arch_pci_init_pdev sounds more reasonable
>> Fine with me.
> 
> Do we want this to be void or returning an error code? If error code is needed,
> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used in
> case of error or in free_pdev function.

I'd start with void and make it return an error (and deal with necessary
cleanup) only once a need arises.

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-28  8:26                   ` Jan Beulich
@ 2021-09-28  8:29                     ` Oleksandr Andrushchenko
  2021-09-28  8:39                       ` Jan Beulich
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-28  8:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel, Oleksandr Andrushchenko


On 28.09.21 11:26, Jan Beulich wrote:
> On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
>> On 27.09.21 13:26, Jan Beulich wrote:
>>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>>           *((u8*) &pdev->bus) = bus;
>>>>>>>>>>           *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>>           pdev->domain = NULL;
>>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>>> +#endif
>>>>>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>>>>>> obstacles I'm overlooking.
>>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>>>> this is just an architecture specific part of the pdev allocation rather than
>>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>>>>>> more natural to me.
>>>>> The bulk of the function is about populating the just allocated struct.
>>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>>> MSI-X), you only want and arch-specific part of the initialization. I
>>>>> would agree with "alloc" in the name if further allocation was to
>>>>> happen there.
>>>> Hm, then arch_pci_init_pdev sounds more reasonable
>>> Fine with me.
>> Do we want this to be void or returning an error code? If error code is needed,
>> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
>> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used in
>> case of error or in free_pdev function.
> I'd start with void and make it return an error (and deal with necessary
> cleanup) only once a need arises.

Sounds reasonable. For x86 I think we can deal with:

xen/include/xen/pci.h:

#ifdef CONFIG_ARM
void arch_pci_init_pdev(struct pci_dev *pdev);
#else
static inline void arch_pci_init_pdev(struct pci_dev *pdev)
{
     return 0;
}
#endif

>
> Jan
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-28  8:29                     ` Oleksandr Andrushchenko
@ 2021-09-28  8:39                       ` Jan Beulich
  2021-09-28  8:54                         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 49+ messages in thread
From: Jan Beulich @ 2021-09-28  8:39 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel

On 28.09.2021 10:29, Oleksandr Andrushchenko wrote:
> 
> On 28.09.21 11:26, Jan Beulich wrote:
>> On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
>>> On 27.09.21 13:26, Jan Beulich wrote:
>>>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>>>           *((u8*) &pdev->bus) = bus;
>>>>>>>>>>>           *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>>>           pdev->domain = NULL;
>>>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>>>> +#endif
>>>>>>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>>>>>>> obstacles I'm overlooking.
>>>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>>>>> this is just an architecture specific part of the pdev allocation rather than
>>>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>>>>>>> more natural to me.
>>>>>> The bulk of the function is about populating the just allocated struct.
>>>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>>>> MSI-X), you only want and arch-specific part of the initialization. I
>>>>>> would agree with "alloc" in the name if further allocation was to
>>>>>> happen there.
>>>>> Hm, then arch_pci_init_pdev sounds more reasonable
>>>> Fine with me.
>>> Do we want this to be void or returning an error code? If error code is needed,
>>> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
>>> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used in
>>> case of error or in free_pdev function.
>> I'd start with void and make it return an error (and deal with necessary
>> cleanup) only once a need arises.
> 
> Sounds reasonable. For x86 I think we can deal with:
> 
> xen/include/xen/pci.h:
> 
> #ifdef CONFIG_ARM
> void arch_pci_init_pdev(struct pci_dev *pdev);
> #else
> static inline void arch_pci_init_pdev(struct pci_dev *pdev)
> {
>      return 0;
> }
> #endif

But that's still #ifdef-ary. We have asm/pci.h.

Jan



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one
  2021-09-28  8:39                       ` Jan Beulich
@ 2021-09-28  8:54                         ` Oleksandr Andrushchenko
  0 siblings, 0 replies; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-28  8:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	xen-devel, Oleksandr Andrushchenko


On 28.09.21 11:39, Jan Beulich wrote:
> On 28.09.2021 10:29, Oleksandr Andrushchenko wrote:
>> On 28.09.21 11:26, Jan Beulich wrote:
>>> On 28.09.2021 10:09, Oleksandr Andrushchenko wrote:
>>>> On 27.09.21 13:26, Jan Beulich wrote:
>>>>> On 27.09.2021 12:04, Oleksandr Andrushchenko wrote:
>>>>>> On 27.09.21 13:00, Jan Beulich wrote:
>>>>>>> On 27.09.2021 11:35, Oleksandr Andrushchenko wrote:
>>>>>>>> On 27.09.21 12:19, Jan Beulich wrote:
>>>>>>>>> On 27.09.2021 10:45, Oleksandr Andrushchenko wrote:
>>>>>>>>>> On 27.09.21 10:45, Jan Beulich wrote:
>>>>>>>>>>> On 23.09.2021 14:54, Oleksandr Andrushchenko wrote:
>>>>>>>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>>>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>>>>>>>> @@ -328,6 +328,9 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>>>>>>>>>>>>            *((u8*) &pdev->bus) = bus;
>>>>>>>>>>>>            *((u8*) &pdev->devfn) = devfn;
>>>>>>>>>>>>            pdev->domain = NULL;
>>>>>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>>>>>> +    pci_to_dev(pdev)->type = DEV_PCI;
>>>>>>>>>>>> +#endif
>>>>>>>>>>> I have to admit that I'm not happy about new CONFIG_<arch> conditionals
>>>>>>>>>>> here. I'd prefer to see this done by a new arch helper, unless there are
>>>>>>>>>>> obstacles I'm overlooking.
>>>>>>>>>> Do you mean something like arch_pci_alloc_pdev(dev)?
>>>>>>>>> I'd recommend against "alloc" in its name; "new" instead maybe?
>>>>>>>> I am fine with arch_pci_new_pdev, but arch prefix points to the fact that
>>>>>>>> this is just an architecture specific part of the pdev allocation rather than
>>>>>>>> actual pdev allocation itself, so with this respect arch_pci_alloc_pdev seems
>>>>>>>> more natural to me.
>>>>>>> The bulk of the function is about populating the just allocated struct.
>>>>>>> There's no arch-specific part of the allocation (so far, leaving aside
>>>>>>> MSI-X), you only want and arch-specific part of the initialization. I
>>>>>>> would agree with "alloc" in the name if further allocation was to
>>>>>>> happen there.
>>>>>> Hm, then arch_pci_init_pdev sounds more reasonable
>>>>> Fine with me.
>>>> Do we want this to be void or returning an error code? If error code is needed,
>>>> then we would also need a roll-back function, e.g. arch_pci_free_pdev or
>>>> arch_pci_release_pdev or arch_pci_fini_pdev or something, so it can be used in
>>>> case of error or in free_pdev function.
>>> I'd start with void and make it return an error (and deal with necessary
>>> cleanup) only once a need arises.
>> Sounds reasonable. For x86 I think we can deal with:
>>
>> xen/include/xen/pci.h:
>>
>> #ifdef CONFIG_ARM
>> void arch_pci_init_pdev(struct pci_dev *pdev);
>> #else
>> static inline void arch_pci_init_pdev(struct pci_dev *pdev)
>> {
>>       return 0;
>> }
>> #endif
> But that's still #ifdef-ary. We have asm/pci.h.
Sure, will define it there
>
> Jan
>
>
Thank you,

Oleksandr

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-28  4:51         ` Oleksandr Andrushchenko
@ 2021-09-28 14:39           ` Oleksandr Andrushchenko
  2021-09-28 16:11             ` Stefano Stabellini
  0 siblings, 1 reply; 49+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-28 14:39 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko

[-- Attachment #1: Type: text/plain, Size: 3456 bytes --]

[snip]
>> Sorry I didn't follow your explanation.
>>
>> My suggestion is to remove the #ifdef CONFIG_HAS_PCI completely from
>> map_range_to_domain. At the beginning of map_range_to_domain, there is
>> already this line:
>>
>> bool need_mapping = !dt_device_for_passthrough(dev);
>>
>> We can change it into:
>>
>> bool need_mapping = !dt_device_for_passthrough(dev) &&
>>                       !mr_data->skip_mapping;
>>
>>
>> Then, in map_device_children we can set mr_data->skip_mapping to true
>> for PCI devices.
> This is the key. I am fine with this, but it just means we move the
>
> check to the outside of this function which looks good. Will do
>
>>    There is already a pci check there:
>>
>>    if ( dt_device_type_is_equal(dev, "pci") )
>>
>> so it should be easy to do. What am I missing?
>>
>>
>
I did some experiments. If we move the check to map_device_children

it is not enough because part of the ranges is still mapped at handle_device level:

handle_device:
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd0e0000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd480000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr 8000000000

map_device_children:
(XEN) Mapping children of /axi/pcie@fd0e0000 to guest skip 1
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr e0000000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr 600000000

pci_host_bridge_mappings:
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd0e0000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd480000

So, I did more intrusive change:

@@ -1540,6 +1534,12 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
      int res;
      u64 addr, size;
      bool need_mapping = !dt_device_for_passthrough(dev);
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .skip_mapping = is_pci_passthrough_enabled() &&
+                        (device_get_class(dev) == DEVICE_PCI)
+    };

With this I see that now mappings are done correctly:

handle_device:
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr fd0e0000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr fd480000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr 8000000000

map_device_children:
(XEN) Mapping children of /axi/pcie@fd0e0000 to guest skip 1
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr e0000000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr 600000000

pci_host_bridge_mappings:
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd0e0000
(XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd480000

So, handle_device seems to be the right place. While at it I have also

optimized the way we setup struct map_range_data mr_data in both

handle_device and map_device_children: I removed structure initialization

from within the relevant loop and also pass mr_data to map_device_children,

so it doesn't need to create its own copy of the same and perform yet

another computation for .skip_mapping: it does need to not only know

that dev is a PCI device (this is done by the dt_device_type_is_equal(dev, "pci")

check, but also account on is_pci_passthrough_enabled().

Thus, the change will be more intrusive, but I hope will simplify things.

I am attaching the fixup patch for just in case you want more details.

Thank you,

Oleksandr



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fixes-4480fb1a5c83-xen-arm-Do-not-map-PCI-ECAM-and-M.patch --]
[-- Type: text/x-patch; name="0001-Fixes-4480fb1a5c83-xen-arm-Do-not-map-PCI-ECAM-and-M.patch", Size: 10687 bytes --]

From 07d6523be2535293d3e34ffd1c8508a0812a4cd8 Mon Sep 17 00:00:00 2001
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Date: Tue, 28 Sep 2021 13:24:42 +0300
Subject: [PATCH] Fixes: 4480fb1a5c83 ("xen/arm: Do not map PCI ECAM and MMIO
 space to Domain-0's p2m")

Since v2:
 - removed check in map_range_to_domain for PCI_DEV
   and moved it to handle_device, so the code is
   simpler
 - s/map_pci_bridge/skip_mapping
 - extended comment in pci_host_bridge_mappings
 - minor code restructure in construct_dom0
 - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
   callbacks
 - unsigned int i; in pci_host_bridge_mappings

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
 xen/arch/arm/domain_build.c        | 43 +++++++++++-------------------
 xen/arch/arm/pci/ecam.c            |  8 +++---
 xen/arch/arm/pci/pci-host-common.c | 15 ++++++-----
 xen/arch/arm/pci/pci-host-zynqmp.c |  2 +-
 xen/include/asm-arm/pci.h          | 12 ++++-----
 xen/include/asm-arm/setup.h        |  2 +-
 6 files changed, 36 insertions(+), 46 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index e72c1b881cae..17f3db6a1f48 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1386,7 +1386,8 @@ int __init map_range_to_domain(const struct dt_device_node *dev,
 {
     struct map_range_data *mr_data = data;
     struct domain *d = mr_data->d;
-    bool need_mapping = !dt_device_for_passthrough(dev);
+    bool need_mapping = !dt_device_for_passthrough(dev) &&
+                        !mr_data->skip_mapping;
     int res;
 
     /*
@@ -1409,13 +1410,6 @@ int __init map_range_to_domain(const struct dt_device_node *dev,
         }
     }
 
-#ifdef CONFIG_HAS_PCI
-    if ( is_pci_passthrough_enabled() &&
-         (device_get_class(dev) == DEVICE_PCI) &&
-         !mr_data->map_pci_bridge )
-        need_mapping = false;
-#endif
-
     if ( need_mapping )
     {
         res = map_regions_p2mt(d,
@@ -1445,27 +1439,21 @@ int __init map_range_to_domain(const struct dt_device_node *dev,
  * then we may need to perform additional mappings in order to make
  * the child resources available to domain 0.
  */
-static int __init map_device_children(struct domain *d,
-                                      const struct dt_device_node *dev,
-                                      p2m_type_t p2mt)
+static int __init map_device_children(const struct dt_device_node *dev,
+                                      struct map_range_data *mr_data)
 {
-    struct map_range_data mr_data = {
-        .d = d,
-        .p2mt = p2mt,
-        .map_pci_bridge = false
-    };
-    int ret;
-
     if ( dt_device_type_is_equal(dev, "pci") )
     {
+        int ret;
+
         dt_dprintk("Mapping children of %s to guest\n",
                    dt_node_full_name(dev));
 
-        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
+        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data->d);
         if ( ret < 0 )
             return ret;
 
-        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
+        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
         if ( ret < 0 )
             return ret;
     }
@@ -1546,6 +1534,12 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     int res;
     u64 addr, size;
     bool need_mapping = !dt_device_for_passthrough(dev);
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .skip_mapping = is_pci_passthrough_enabled() &&
+                        (device_get_class(dev) == DEVICE_PCI)
+    };
 
     naddr = dt_number_of_address(dev);
 
@@ -1585,11 +1579,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     /* Give permission and map MMIOs */
     for ( i = 0; i < naddr; i++ )
     {
-        struct map_range_data mr_data = {
-            .d = d,
-            .p2mt = p2mt,
-            .map_pci_bridge = false
-        };
         res = dt_device_get_address(dev, i, &addr, &size);
         if ( res )
         {
@@ -1603,7 +1592,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
             return res;
     }
 
-    res = map_device_children(d, dev, p2mt);
+    res = map_device_children(dev, &mr_data);
     if ( res )
         return res;
 
@@ -2763,9 +2752,9 @@ static int __init construct_dom0(struct domain *d)
     if ( acpi_disabled )
     {
         rc = prepare_dtb_hwdom(d, &kinfo);
-#ifdef CONFIG_HAS_PCI
         if ( rc < 0 )
             return rc;
+#ifdef CONFIG_HAS_PCI
         rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
 #endif
     }
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index eae177f2cbc2..4c54998d1b4b 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -39,9 +39,9 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
     return base + (PCI_DEVFN2(sbdf) << devfn_shift) + where;
 }
 
-bool pci_ecam_need_p2m_mapping(struct domain *d,
-                               struct pci_host_bridge *bridge,
-                               uint64_t addr)
+bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
+                                     struct pci_host_bridge *bridge,
+                                     uint64_t addr)
 {
     struct pci_config_window *cfg = bridge->cfg;
 
@@ -59,7 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
-        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
+        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
     }
 };
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index f350826ea26b..f993deba8712 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -335,21 +335,21 @@ int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
     struct map_range_data mr_data = {
         .d = d,
         .p2mt = p2mt,
-        .map_pci_bridge = true
+        .skip_mapping = false
     };
 
     /*
      * For each PCI host bridge we need to only map those ranges
      * which are used by Domain-0 to properly initialize the bridge,
      * e.g. we do not want to map ECAM configuration space which lives in
-     * "reg" or "assigned-addresses" device tree property.
-     * Neither we want to map any of the MMIO ranges found in the "ranges"
-     * device tree property.
+     * "reg" or "assigned-addresses" device tree property, but we want to
+     * map other regions of the host bridge. The PCI aperture defined by
+     * the "ranges" device tree property should also be skipped.
      */
     list_for_each_entry( bridge, &pci_host_bridges, node )
     {
         const struct dt_device_node *dev = bridge->dt_node;
-        int i;
+        unsigned int i;
 
         for ( i = 0; i < dt_number_of_address(dev); i++ )
         {
@@ -359,12 +359,13 @@ int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
             err = dt_device_get_address(dev, i, &addr, &size);
             if ( err )
             {
-                printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
+                printk(XENLOG_ERR
+                       "Unable to retrieve address range index=%u for %s\n",
                        i, dt_node_full_name(dev));
                 return err;
             }
 
-            if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )
+            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
             {
                 err = map_range_to_domain(dev, addr, size, &mr_data);
                 if ( err )
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
index adbe3627871f..aef5d6af7ef5 100644
--- a/xen/arch/arm/pci/pci-host-zynqmp.c
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -33,7 +33,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
-        .need_p2m_mapping       = pci_ecam_need_p2m_mapping,
+        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
     }
 };
 
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index ef986bd5994f..3c222acc46be 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -79,9 +79,9 @@ struct pci_ops {
                 uint32_t reg, uint32_t len, uint32_t *value);
     int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
                  uint32_t reg, uint32_t len, uint32_t value);
-    bool (*need_p2m_mapping)(struct domain *d,
-                             struct pci_host_bridge *bridge,
-                             uint64_t addr);
+    bool (*need_p2m_hwdom_mapping)(struct domain *d,
+                                   struct pci_host_bridge *bridge,
+                                   uint64_t addr);
 };
 
 /*
@@ -105,9 +105,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, uint32_t sbdf,
                             uint32_t reg, uint32_t len, uint32_t value);
 void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
                                uint32_t sbdf, uint32_t where);
-bool pci_ecam_need_p2m_mapping(struct domain *d,
-                               struct pci_host_bridge *bridge,
-                               uint64_t addr);
+bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
+                                     struct pci_host_bridge *bridge,
+                                     uint64_t addr);
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 int pci_get_host_bridge_segment(const struct dt_device_node *node,
                                 uint16_t *segment);
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index 21863dd2bc58..5b30135fda38 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -84,7 +84,7 @@ struct map_range_data
     struct domain *d;
     p2m_type_t p2mt;
     /* Set if mappings for PCI host bridges must not be skipped. */
-    bool map_pci_bridge;
+    bool skip_mapping;
 };
 
 extern struct bootinfo bootinfo;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-09-28 14:39           ` Oleksandr Andrushchenko
@ 2021-09-28 16:11             ` Stefano Stabellini
  0 siblings, 0 replies; 49+ messages in thread
From: Stefano Stabellini @ 2021-09-28 16:11 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: Stefano Stabellini, xen-devel, julien, Oleksandr Tyshchenko,
	Volodymyr Babchuk, Artem Mygaiev, roger.pau, Bertrand Marquis,
	Rahul Singh

[-- Attachment #1: Type: text/plain, Size: 3687 bytes --]

On Tue, 28 Sep 2021, Oleksandr Andrushchenko wrote:
> [snip]
> >> Sorry I didn't follow your explanation.
> >>
> >> My suggestion is to remove the #ifdef CONFIG_HAS_PCI completely from
> >> map_range_to_domain. At the beginning of map_range_to_domain, there is
> >> already this line:
> >>
> >> bool need_mapping = !dt_device_for_passthrough(dev);
> >>
> >> We can change it into:
> >>
> >> bool need_mapping = !dt_device_for_passthrough(dev) &&
> >>                       !mr_data->skip_mapping;
> >>
> >>
> >> Then, in map_device_children we can set mr_data->skip_mapping to true
> >> for PCI devices.
> > This is the key. I am fine with this, but it just means we move the
> >
> > check to the outside of this function which looks good. Will do
> >
> >>    There is already a pci check there:
> >>
> >>    if ( dt_device_type_is_equal(dev, "pci") )
> >>
> >> so it should be easy to do. What am I missing?
> >>
> >>
> >
> I did some experiments. If we move the check to map_device_children
> 
> it is not enough because part of the ranges is still mapped at handle_device level:
> 
> handle_device:
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd0e0000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd480000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr 8000000000
> 
> map_device_children:
> (XEN) Mapping children of /axi/pcie@fd0e0000 to guest skip 1
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr e0000000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr 600000000
> 
> pci_host_bridge_mappings:
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd0e0000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd480000
> 
> So, I did more intrusive change:
> 
> @@ -1540,6 +1534,12 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>       int res;
>       u64 addr, size;
>       bool need_mapping = !dt_device_for_passthrough(dev);
> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2mt,
> +        .skip_mapping = is_pci_passthrough_enabled() &&
> +                        (device_get_class(dev) == DEVICE_PCI)
> +    };
> 
> With this I see that now mappings are done correctly:
> 
> handle_device:
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr fd0e0000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr fd480000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr 8000000000
> 
> map_device_children:
> (XEN) Mapping children of /axi/pcie@fd0e0000 to guest skip 1
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr e0000000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 0 addr 600000000
> 
> pci_host_bridge_mappings:
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd0e0000
> (XEN) --- /axi/pcie@fd0e0000 need_mapping 1 addr fd480000
> 
> So, handle_device seems to be the right place. While at it I have also
> 
> optimized the way we setup struct map_range_data mr_data in both
> 
> handle_device and map_device_children: I removed structure initialization
> 
> from within the relevant loop and also pass mr_data to map_device_children,
> 
> so it doesn't need to create its own copy of the same and perform yet
> 
> another computation for .skip_mapping: it does need to not only know
> 
> that dev is a PCI device (this is done by the dt_device_type_is_equal(dev, "pci")
> 
> check, but also account on is_pci_passthrough_enabled().
> 
> Thus, the change will be more intrusive, but I hope will simplify things.
> 
> I am attaching the fixup patch for just in case you want more details.

Yes, thanks, this is what I had in mind. Hopefully the resulting
combined patch will be simpler.

Cheers,

Stefano

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2021-09-28 16:12 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-23 12:54 [PATCH v2 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
2021-09-23 12:54 ` [PATCH v2 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
2021-09-24 23:39   ` Stefano Stabellini
2021-09-28  7:31   ` Michal Orzel
2021-09-23 12:54 ` [PATCH v2 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
2021-09-24 23:45   ` Stefano Stabellini
2021-09-27  7:41   ` Jan Beulich
2021-09-27  8:18     ` Oleksandr Andrushchenko
2021-09-23 12:54 ` [PATCH v2 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
2021-09-24 23:48   ` Stefano Stabellini
2021-09-23 12:54 ` [PATCH v2 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
2021-09-24 23:51   ` Stefano Stabellini
2021-09-23 12:54 ` [PATCH v2 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
2021-09-24 23:54   ` Stefano Stabellini
2021-09-27  7:13     ` Oleksandr Andrushchenko
2021-09-27  7:45   ` Jan Beulich
2021-09-27  8:45     ` Oleksandr Andrushchenko
2021-09-27  9:19       ` Jan Beulich
2021-09-27  9:35         ` Oleksandr Andrushchenko
2021-09-27 10:00           ` Jan Beulich
2021-09-27 10:04             ` Oleksandr Andrushchenko
2021-09-27 10:26               ` Jan Beulich
2021-09-28  8:09                 ` Oleksandr Andrushchenko
2021-09-28  8:26                   ` Jan Beulich
2021-09-28  8:29                     ` Oleksandr Andrushchenko
2021-09-28  8:39                       ` Jan Beulich
2021-09-28  8:54                         ` Oleksandr Andrushchenko
2021-09-23 12:54 ` [PATCH v2 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
2021-09-24 23:57   ` Stefano Stabellini
2021-09-23 12:54 ` [PATCH v2 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
2021-09-25  0:00   ` Stefano Stabellini
2021-09-27  7:17     ` Oleksandr Andrushchenko
2021-09-23 12:54 ` [PATCH v2 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
2021-09-25  0:06   ` Stefano Stabellini
2021-09-27 13:02     ` Oleksandr Andrushchenko
2021-09-23 12:54 ` [PATCH v2 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
2021-09-25  0:18   ` Stefano Stabellini
2021-09-27  8:47     ` Oleksandr Andrushchenko
2021-09-23 12:54 ` [PATCH v2 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
2021-09-25  0:44   ` Stefano Stabellini
2021-09-27 12:44     ` Oleksandr Andrushchenko
2021-09-28  4:00       ` Stefano Stabellini
2021-09-28  4:51         ` Oleksandr Andrushchenko
2021-09-28 14:39           ` Oleksandr Andrushchenko
2021-09-28 16:11             ` Stefano Stabellini
2021-09-23 12:54 ` [PATCH v2 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
2021-09-25  1:20   ` Stefano Stabellini
2021-09-27  8:06   ` Jan Beulich
2021-09-27  8:39     ` Oleksandr Andrushchenko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.