All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/11] PCI devices passthrough on Arm, part 2
@ 2021-10-04 14:11 Oleksandr Andrushchenko
  2021-10-04 14:11 ` [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
                   ` (10 more replies)
  0 siblings, 11 replies; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Hi, all!

This is an assorted series of patches which aim is to make some further
basis for PCI passthrough on Arm support. The series continues the work
published earlier by Arm [1] and adds new helpers and clears the way for
vPCI changes which will follow.

RFC is at [2], [3]. Design presentation can be found at [4].

Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/list/?series=557111
[2] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01184.html
[3] https://lists.xenproject.org/archives/html/xen-devel/2020-07/threads.html#01184
[4] https://static.sched.com/hosted_files/xen2021/e4/PCI_Device_Passthrough_On_Arm.pdf

Oleksandr Andrushchenko (10):
  xen/arm: Fix dev_is_dt macro definition
  xen/arm: Add new device type for PCI
  xen/arm: Introduce pci_find_host_bridge_node helper
  xen/device-tree: Make dt_find_node_by_phandle global
  xen/arm: Mark device as PCI while creating one
  libxl: Allow removing PCI devices for all types of domains
  libxl: Only map legacy PCI IRQs if they are supported
  xen/arm: Setup MMIO range trap handlers for hardware domain
  xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  xen/arm: Process pending vPCI map/unmap operations

Oleksandr Tyshchenko (1):
  xen/domain: Call pci_release_devices() when releasing domain resources

 tools/libs/light/Makefile          |  4 ++
 tools/libs/light/libxl_pci.c       | 15 ++++-
 xen/arch/arm/domain.c              | 11 +++-
 xen/arch/arm/domain_build.c        | 50 ++++++++--------
 xen/arch/arm/pci/ecam.c            | 14 +++++
 xen/arch/arm/pci/pci-host-common.c | 93 ++++++++++++++++++++++++++++++
 xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
 xen/arch/arm/pci/pci.c             | 12 ++++
 xen/arch/arm/traps.c               | 13 +++++
 xen/arch/arm/vpci.c                | 34 +++++++++++
 xen/arch/arm/vpci.h                |  6 ++
 xen/arch/x86/hvm/hvm.c             |  6 ++
 xen/common/device_tree.c           |  2 +-
 xen/common/ioreq.c                 |  9 ---
 xen/drivers/passthrough/pci.c      |  2 +
 xen/include/asm-arm/device.h       |  6 +-
 xen/include/asm-arm/pci.h          | 30 ++++++++++
 xen/include/asm-arm/setup.h        | 13 +++++
 xen/include/asm-x86/pci.h          |  2 +
 xen/include/xen/device_tree.h      |  2 +
 20 files changed, 286 insertions(+), 39 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-05  0:57   ` Stefano Stabellini
  2021-10-04 14:11 ` [PATCH v4 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, Michal Orzel

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

This macro is not currently used, but still has an error in it:
a missing parenthesis. Fix this, so the macro is properly defined.

Fixes: 6c5d3075d97e ("xen/arm: Introduce a generic way to describe device")

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Michal Orzel <michal.orzel@arm.com>
---
New in v2
---
 xen/include/asm-arm/device.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 5ecd5e7bd15e..ebe84ea853cd 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -27,7 +27,7 @@ typedef struct device device_t;
 
 /* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
 #define dev_is_pci(dev) ((void)(dev), 0)
-#define dev_is_dt(dev)  ((dev->type == DEV_DT)
+#define dev_is_dt(dev)  ((dev)->type == DEV_DT)
 
 enum device_class
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 02/11] xen/arm: Add new device type for PCI
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
  2021-10-04 14:11 ` [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:15   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Add new device type (DEV_PCI) to distinguish PCI devices from platform
DT devices, so some drivers, like IOMMU, can handle PCI devices
differently.

Also add a helper which is when given a struct device returns the
corresponding struct pci_dev which this device is a part of.

Because of the header cross-dependencies, e.g. we need both
struct pci_dev and struct arch_pci_dev at the same time, this cannot be
done with an inline.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Since v2:
 - !! dropped Stefano's r-b because of the changes
 - simplified dev_to_pci to use a single
   container_of(dev, struct pci_dev, arch.dev) (Jan)
Since v1:
 - Folded new device type (DEV_PCI) into this patch.
---
 xen/arch/arm/pci/pci.c       | 7 +++++++
 xen/include/asm-arm/device.h | 4 ++--
 xen/include/asm-arm/pci.h    | 7 +++++++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index 73540045d187..138da19284ab 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -27,6 +27,13 @@ int arch_pci_clean_pirqs(struct domain *d)
     return 0;
 }
 
+struct pci_dev *dev_to_pci(struct device *dev)
+{
+    ASSERT(dev->type == DEV_PCI);
+
+    return container_of(dev, struct pci_dev, arch.dev);
+}
+
 static int __init dt_pci_init(void)
 {
     struct dt_device_node *np;
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index ebe84ea853cd..7bf040560363 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -4,6 +4,7 @@
 enum device_type
 {
     DEV_DT,
+    DEV_PCI,
 };
 
 struct dev_archdata {
@@ -25,8 +26,7 @@ typedef struct device device_t;
 
 #include <xen/device_tree.h>
 
-/* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
-#define dev_is_pci(dev) ((void)(dev), 0)
+#define dev_is_pci(dev) ((dev)->type == DEV_PCI)
 #define dev_is_dt(dev)  ((dev)->type == DEV_DT)
 
 enum device_class
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 7cb2e2f1edaf..603a1fc072d1 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -26,6 +26,13 @@ struct arch_pci_dev {
     struct device dev;
 };
 
+/*
+ * Because of the header cross-dependencies, e.g. we need both
+ * struct pci_dev and struct arch_pci_dev at the same time, this cannot be
+ * done with an inline here. Macro can be implemented, but looks scary.
+ */
+struct pci_dev *dev_to_pci(struct device *dev);
+
 /* Arch-specific MSI data for vPCI. */
 struct vpci_arch_msi {
 };
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 03/11] xen/arm: Introduce pci_find_host_bridge_node helper
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
  2021-10-04 14:11 ` [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
  2021-10-04 14:11 ` [PATCH v4 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:24   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Get host bridge node given a PCI device attached to it.

This helper will be re-used for adding PCI devices by the subsequent
patches.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Since v2:
 - !! dropped Stefano's r-b because of the changes
 - s/PRI_pci/%pp after rebase onto Arm series
---
 xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
 xen/include/asm-arm/pci.h          |  1 +
 2 files changed, 17 insertions(+)

diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 593beeb48ce4..592c01aae5bb 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -276,6 +276,22 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
     return -EINVAL;
 }
 
+/*
+ * Get host bridge node given a device attached to it.
+ */
+struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
+{
+    struct pci_host_bridge *bridge;
+    struct pci_dev *pdev = dev_to_pci(dev);
+
+    bridge = pci_find_host_bridge(pdev->seg, pdev->bus);
+    if ( unlikely(!bridge) )
+    {
+        printk(XENLOG_ERR "Unable to find PCI bridge for %pp\n", &pdev->sbdf);
+        return NULL;
+    }
+    return bridge->dt_node;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 603a1fc072d1..e6d4000e2ac8 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -99,6 +99,7 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 int pci_get_host_bridge_segment(const struct dt_device_node *node,
                                 uint16_t *segment);
+struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
 
 static always_inline bool is_pci_passthrough_enabled(void)
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 04/11] xen/device-tree: Make dt_find_node_by_phandle global
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (2 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:32   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Make dt_find_node_by_phandle globally visible, so it can be re-used by
other frameworks.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 xen/common/device_tree.c      | 2 +-
 xen/include/xen/device_tree.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index ea93da1725f6..4aae281e89bf 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1047,7 +1047,7 @@ int dt_for_each_range(const struct dt_device_node *dev,
  *
  * Returns a node pointer.
  */
-static struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
+struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
 {
     struct dt_device_node *np;
 
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 2297c59ce66d..fd6cd00b433a 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -849,6 +849,8 @@ int dt_count_phandle_with_args(const struct dt_device_node *np,
  */
 int dt_get_pci_domain_nr(struct dt_device_node *node);
 
+struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle);
+
 #ifdef CONFIG_DEVICE_TREE_DEBUG
 #define dt_dprintk(fmt, args...)  \
     printk(XENLOG_DEBUG fmt, ## args)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 05/11] xen/arm: Mark device as PCI while creating one
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (3 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:33   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

While adding a PCI device mark it as such, so other frameworks
can distinguish it from DT devices.
For that introduce an architecture defined helper which may perform
additional initialization of the newly created PCI device.

Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
[applicable parts]
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Since v3:
- fixed Arm build without CONFIG_HAS_PCI
Since v2:
 - !! dropped Stefano's r-b because of the changes
 - introduced arch_pci_init_pdev (Jan)
Since v1:
 - moved the assignment from iommu_add_device to alloc_pdev
---
 xen/arch/arm/pci/pci.c        | 5 +++++
 xen/drivers/passthrough/pci.c | 2 ++
 xen/include/asm-arm/pci.h     | 7 +++++++
 xen/include/asm-x86/pci.h     | 2 ++
 4 files changed, 16 insertions(+)

diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index 138da19284ab..082c14e127a8 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -34,6 +34,11 @@ struct pci_dev *dev_to_pci(struct device *dev)
     return container_of(dev, struct pci_dev, arch.dev);
 }
 
+void arch_pci_init_pdev(struct pci_dev *pdev)
+{
+    pci_to_dev(pdev)->type = DEV_PCI;
+}
+
 static int __init dt_pci_init(void)
 {
     struct dt_device_node *np;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index b36d5a481162..9e6246afeef5 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -329,6 +329,8 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
     *((u8*) &pdev->devfn) = devfn;
     pdev->domain = NULL;
 
+    arch_pci_init_pdev(pdev);
+
     rc = pdev_msi_init(pdev);
     if ( rc )
     {
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index e6d4000e2ac8..ea87ec6006fc 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -105,6 +105,9 @@ static always_inline bool is_pci_passthrough_enabled(void)
 {
     return pci_passthrough_enabled;
 }
+
+void arch_pci_init_pdev(struct pci_dev *pdev);
+
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
@@ -121,5 +124,9 @@ static inline int pci_get_host_bridge_segment(const struct dt_device_node *node,
     return -EINVAL;
 }
 
+struct pci_dev;
+
+static inline void arch_pci_init_pdev(struct pci_dev *pdev) {}
+
 #endif  /*!CONFIG_HAS_PCI*/
 #endif /* __ARM_PCI_H__ */
diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
index c4a4fdcbc239..443f25347d08 100644
--- a/xen/include/asm-x86/pci.h
+++ b/xen/include/asm-x86/pci.h
@@ -36,4 +36,6 @@ static always_inline bool is_pci_passthrough_enabled(void)
     return true;
 }
 
+static inline void arch_pci_init_pdev(struct pci_dev *pdev) {}
+
 #endif /* __X86_PCI_H__ */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 06/11] xen/domain: Call pci_release_devices() when releasing domain resources
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (4 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:35   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This is the very same that we already do for DT devices. Moreover, x86
already calls pci_release_devices().

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Since v1:
 - re-wording in the commit message
---
 xen/arch/arm/domain.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index fbb52f78f1a6..79012bf77757 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -985,7 +985,8 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list)
  * function which may return -ERESTART.
  */
 enum {
-    PROG_tee = 1,
+    PROG_pci = 1,
+    PROG_tee,
     PROG_xen,
     PROG_page,
     PROG_mapping,
@@ -1022,6 +1023,12 @@ int domain_relinquish_resources(struct domain *d)
 #ifdef CONFIG_IOREQ_SERVER
         ioreq_server_destroy_all(d);
 #endif
+#ifdef CONFIG_HAS_PCI
+    PROGRESS(pci):
+        ret = pci_release_devices(d);
+        if ( ret )
+            return ret;
+#endif
 
     PROGRESS(tee):
         ret = tee_relinquish_resources(d);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 07/11] libxl: Allow removing PCI devices for all types of domains
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (5 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:25   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, Ian Jackson, Juergen Gross

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

The PCI device remove path may now be used by PVH on ARM, so the
assert is no longer valid.

Cc: Ian Jackson <iwj@xenproject.org>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
---
 tools/libs/light/libxl_pci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 1a1c2630803b..59f3686fc85e 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1947,8 +1947,6 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
             goto out_fail;
         }
     } else {
-        assert(type == LIBXL_DOMAIN_TYPE_PV);
-
         char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
                                      pci->bus, pci->dev, pci->func);
         FILE *f = fopen(sysfs_path, "r");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 08/11] libxl: Only map legacy PCI IRQs if they are supported
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (6 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:39   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko, Ian Jackson, Juergen Gross

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Arm's PCI passthrough implementation doesn't support legacy interrupts,
but MSI/MSI-X. This can be the case for other platforms too.
For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
it to the CFLAGS and compile the relevant code in the toolstack only if
applicable.

Cc: Ian Jackson <iwj@xenproject.org>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Since v1:
 - Minimized #idefery by introducing pci_supp_legacy_irq function
   for relevant checks
---
 tools/libs/light/Makefile    |  4 ++++
 tools/libs/light/libxl_pci.c | 13 +++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 7d8c51d49242..bd3f6be2a183 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
 	-Wno-declaration-after-statement -Wformat-nonliteral
 CFLAGS += -I.
 
+ifeq ($(CONFIG_X86),y)
+CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
+endif
+
 SRCS-$(CONFIG_X86) += libxl_cpuid.c
 SRCS-$(CONFIG_X86) += libxl_x86.c
 SRCS-$(CONFIG_X86) += libxl_psr.c
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 59f3686fc85e..4c2d7aeefbb2 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1364,6 +1364,15 @@ static void pci_add_timeout(libxl__egc *egc, libxl__ev_time *ev,
     pci_add_dm_done(egc, pas, rc);
 }
 
+static bool pci_supp_legacy_irq(void)
+{
+#ifdef CONFIG_PCI_SUPP_LEGACY_IRQ
+    return true;
+#else
+    return false;
+#endif
+}
+
 static void pci_add_dm_done(libxl__egc *egc,
                             pci_add_state *pas,
                             int rc)
@@ -1434,6 +1443,8 @@ static void pci_add_dm_done(libxl__egc *egc,
         }
     }
     fclose(f);
+    if (!pci_supp_legacy_irq())
+        goto out_no_irq;
     sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
                                 pci->bus, pci->dev, pci->func);
     f = fopen(sysfs_path, "r");
@@ -1983,6 +1994,8 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
         }
         fclose(f);
 skip1:
+        if (!pci_supp_legacy_irq())
+            goto skip_irq;
         sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
                                pci->bus, pci->dev, pci->func);
         f = fopen(sysfs_path, "r");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (7 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:36   ` Rahul Singh
  2021-10-04 14:11 ` [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
  2021-10-04 14:11 ` [PATCH v4 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

In order for vPCI to work it needs to maintain guest and hardware
domain's views of the configuration space. For example, BARs and
COMMAND registers require emulation for guests and the guest view
of the registers needs to be in sync with the real contents of the
relevant registers. For that ECAM address space needs to also be
trapped for the hardware domain, so we need to implement PCI host
bridge specific callbacks to properly setup MMIO handlers for those
ranges depending on particular host bridge implementation.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Since v3:
- fixed comment formatting
Since v2:
- removed unneeded assignment (count = 0)
- removed unneeded header inclusion
- update commit message
Since v1:
 - Dynamically calculate the number of MMIO handlers required for vPCI
   and update the total number accordingly
 - s/clb/cb
 - Do not introduce a new callback for MMIO handler setup
---
 xen/arch/arm/domain.c              |  2 ++
 xen/arch/arm/pci/pci-host-common.c | 28 ++++++++++++++++++++++++
 xen/arch/arm/vpci.c                | 34 ++++++++++++++++++++++++++++++
 xen/arch/arm/vpci.h                |  6 ++++++
 xen/include/asm-arm/pci.h          |  5 +++++
 5 files changed, 75 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 79012bf77757..fa6fcc5e467c 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -733,6 +733,8 @@ int arch_domain_create(struct domain *d,
     if ( (rc = domain_vgic_register(d, &count)) != 0 )
         goto fail;
 
+    count += domain_vpci_get_num_mmio_handlers(d);
+
     if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
         goto fail;
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 592c01aae5bb..1eb4daa87365 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -292,6 +292,34 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
     }
     return bridge->dt_node;
 }
+
+int pci_host_iterate_bridges(struct domain *d,
+                             int (*cb)(struct domain *d,
+                                       struct pci_host_bridge *bridge))
+{
+    struct pci_host_bridge *bridge;
+    int err;
+
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+    {
+        err = cb(d, bridge);
+        if ( err )
+            return err;
+    }
+    return 0;
+}
+
+int pci_host_get_num_bridges(void)
+{
+    struct pci_host_bridge *bridge;
+    int count = 0;
+
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+        count++;
+
+    return count;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index 76c12b92814f..6e179cd3010b 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -80,17 +80,51 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
     .write = vpci_mmio_write,
 };
 
+static int vpci_setup_mmio_handler(struct domain *d,
+                                   struct pci_host_bridge *bridge)
+{
+    struct pci_config_window *cfg = bridge->cfg;
+
+    register_mmio_handler(d, &vpci_mmio_handler,
+                          cfg->phys_addr, cfg->size, NULL);
+    return 0;
+}
+
 int domain_vpci_init(struct domain *d)
 {
     if ( !has_vpci(d) )
         return 0;
 
+    if ( is_hardware_domain(d) )
+        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
+
+    /* Guest domains use what is programmed in their device tree. */
     register_mmio_handler(d, &vpci_mmio_handler,
                           GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
 
     return 0;
 }
 
+int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    int count;
+
+    if ( is_hardware_domain(d) )
+        /* For each PCI host bridge's configuration space. */
+        count = pci_host_get_num_bridges();
+    else
+        /*
+         * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
+         * being passed through. Maximum number of supported devices
+         * is 32 as virtual bus topology emulates the devices as embedded
+         * endpoints.
+         * +1 for a single emulated host bridge's configuration space.
+         */
+        count = VPCI_MSIX_MEM_NUM * 32 + 1;
+
+    return count;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
index d8a7b0e3e802..27a2b069abd2 100644
--- a/xen/arch/arm/vpci.h
+++ b/xen/arch/arm/vpci.h
@@ -17,11 +17,17 @@
 
 #ifdef CONFIG_HAS_VPCI
 int domain_vpci_init(struct domain *d);
+int domain_vpci_get_num_mmio_handlers(struct domain *d);
 #else
 static inline int domain_vpci_init(struct domain *d)
 {
     return 0;
 }
+
+static inline int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    return 0;
+}
 #endif
 
 #endif /* __ARCH_ARM_VPCI_H__ */
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index ea87ec6006fc..a62d8bc60086 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -108,6 +108,11 @@ static always_inline bool is_pci_passthrough_enabled(void)
 
 void arch_pci_init_pdev(struct pci_dev *pdev);
 
+int pci_host_iterate_bridges(struct domain *d,
+                             int (*clb)(struct domain *d,
+                                        struct pci_host_bridge *bridge));
+int pci_host_get_num_bridges(void);
+
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (8 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-05  1:24   ` Stefano Stabellini
  2021-10-04 14:11 ` [PATCH v4 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

PCI host bridges are special devices in terms of implementing PCI
passthrough. According to [1] the current implementation depends on
Domain-0 to perform the initialization of the relevant PCI host
bridge hardware and perform PCI device enumeration. In order to
achieve that one of the required changes is to not map all the memory
ranges in map_range_to_domain as we traverse the device tree on startup
and perform some additional checks if the range needs to be mapped to
Domain-0.

The generic PCI host controller device tree binding says [2]:
- ranges: As described in IEEE Std 1275-1994, but must provide
          at least a definition of non-prefetchable memory. One
          or both of prefetchable Memory and IO Space may also
          be provided.

- reg   : The Configuration Space base address and size, as accessed
          from the parent bus.  The base address corresponds to
          the first bus in the "bus-range" property.  If no
          "bus-range" is specified, this will be bus 0 (the default).

From the above none of the memory ranges from the "ranges" property
needs to be mapped to Domain-0 at startup as MMIO mapping is going to
be handled dynamically by vPCI as we assign PCI devices, e.g. each
device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
as needed by Xen.

The "reg" property covers not only ECAM space, but may also have other
then the configuration memory ranges described, for example [3]:
- reg: Should contain rc_dbi, config registers location and length.
- reg-names: Must include the following entries:
   "rc_dbi": controller configuration registers;
   "config": PCIe configuration space registers.

This patch makes it possible to not map all the ranges from the
"ranges" property and also ECAM from the "reg". All the rest from the
"reg" property still needs to be mapped to Domain-0, so the PCI
host bridge remains functional in Domain-0.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
[2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
Since v3:
 - pass struct map_range_data to map_dt_irq_to_domain
 - remove redundant check from map_range_to_domain
 - fix handle_device's .skip_mapping
Since v2:
 - removed check in map_range_to_domain for PCI_DEV
   and moved it to handle_device, so the code is
   simpler
 - s/map_pci_bridge/skip_mapping
 - extended comment in pci_host_bridge_mappings
 - minor code restructure in construct_dom0
 - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
   callbacks
 - unsigned int i; in pci_host_bridge_mappings
Since v1:
 - Added better description of why and what needs to be mapped into
   Domain-0's p2m and what doesn't
 - Do not do any mappings for PCI devices while traversing the DT
 - Walk all the bridges and make required mappings in one go
---
 xen/arch/arm/domain_build.c        | 50 ++++++++++++++++--------------
 xen/arch/arm/pci/ecam.c            | 14 +++++++++
 xen/arch/arm/pci/pci-host-common.c | 49 +++++++++++++++++++++++++++++
 xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
 xen/include/asm-arm/pci.h          | 10 ++++++
 xen/include/asm-arm/setup.h        | 13 ++++++++
 6 files changed, 114 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 8739e3285ef4..2a3c641476bd 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -10,7 +10,6 @@
 #include <asm/regs.h>
 #include <xen/errno.h>
 #include <xen/err.h>
-#include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/guest_access.h>
 #include <xen/iocap.h>
@@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
 }
 custom_param("dom0_mem", parse_dom0_mem);
 
-struct map_range_data
-{
-    struct domain *d;
-    p2m_type_t p2mt;
-};
-
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef virt_to_mfn
 #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
@@ -1641,10 +1634,11 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
                                        const struct dt_irq *dt_irq,
                                        void *data)
 {
-    struct domain *d = data;
+    struct map_range_data *mr_data = data;
+    struct domain *d = mr_data->d;
     unsigned int irq = dt_irq->irq;
     int res;
-    bool need_mapping = !dt_device_for_passthrough(dev);
+    bool need_mapping = !mr_data->skip_mapping;
 
     if ( irq < NR_LOCAL_IRQS )
     {
@@ -1668,13 +1662,12 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
     return 0;
 }
 
-static int __init map_range_to_domain(const struct dt_device_node *dev,
-                                      u64 addr, u64 len,
-                                      void *data)
+int __init map_range_to_domain(const struct dt_device_node *dev,
+                               u64 addr, u64 len, void *data)
 {
     struct map_range_data *mr_data = data;
     struct domain *d = mr_data->d;
-    bool need_mapping = !dt_device_for_passthrough(dev);
+    bool need_mapping = !mr_data->skip_mapping;
     int res;
 
     /*
@@ -1726,23 +1719,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
  * then we may need to perform additional mappings in order to make
  * the child resources available to domain 0.
  */
-static int __init map_device_children(struct domain *d,
-                                      const struct dt_device_node *dev,
-                                      p2m_type_t p2mt)
+static int __init map_device_children(const struct dt_device_node *dev,
+                                      struct map_range_data *mr_data)
 {
-    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
-    int ret;
-
     if ( dt_device_type_is_equal(dev, "pci") )
     {
+        int ret;
+
         dt_dprintk("Mapping children of %s to guest\n",
                    dt_node_full_name(dev));
 
-        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
+        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
         if ( ret < 0 )
             return ret;
 
-        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
+        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
         if ( ret < 0 )
             return ret;
     }
@@ -1823,6 +1814,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     int res;
     u64 addr, size;
     bool need_mapping = !dt_device_for_passthrough(dev);
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .skip_mapping = !need_mapping ||
+                        (is_pci_passthrough_enabled() &&
+                         (device_get_class(dev) == DEVICE_PCI))
+    };
 
     naddr = dt_number_of_address(dev);
 
@@ -1862,7 +1860,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     /* Give permission and map MMIOs */
     for ( i = 0; i < naddr; i++ )
     {
-        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
         res = dt_device_get_address(dev, i, &addr, &size);
         if ( res )
         {
@@ -1876,7 +1873,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
             return res;
     }
 
-    res = map_device_children(d, dev, p2mt);
+    res = map_device_children(dev, &mr_data);
     if ( res )
         return res;
 
@@ -3034,7 +3031,14 @@ static int __init construct_dom0(struct domain *d)
         return rc;
 
     if ( acpi_disabled )
+    {
         rc = prepare_dtb_hwdom(d, &kinfo);
+        if ( rc < 0 )
+            return rc;
+#ifdef CONFIG_HAS_PCI
+        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
+#endif
+    }
     else
         rc = prepare_acpi(d, &kinfo);
 
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index 602d00799c8d..b81620074a91 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
     return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
 }
 
+bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
+                                     struct pci_host_bridge *bridge,
+                                     uint64_t addr)
+{
+    struct pci_config_window *cfg = bridge->cfg;
+
+    /*
+     * We do not want ECAM address space to be mapped in Domain-0's p2m,
+     * so we can trap access to it.
+     */
+    return cfg->phys_addr != addr;
+}
+
 /* ECAM ops */
 const struct pci_ecam_ops pci_generic_ecam_ops = {
     .bus_shift  = 20,
@@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
+        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
     }
 };
 
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 1eb4daa87365..085f08e23e0c 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -18,6 +18,7 @@
 
 #include <xen/init.h>
 #include <xen/pci.h>
+#include <asm/setup.h>
 #include <xen/rwlock.h>
 #include <xen/sched.h>
 #include <xen/vmap.h>
@@ -320,6 +321,54 @@ int pci_host_get_num_bridges(void)
     return count;
 }
 
+int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
+{
+    struct pci_host_bridge *bridge;
+    struct map_range_data mr_data = {
+        .d = d,
+        .p2mt = p2mt,
+        .skip_mapping = false
+    };
+
+    /*
+     * For each PCI host bridge we need to only map those ranges
+     * which are used by Domain-0 to properly initialize the bridge,
+     * e.g. we do not want to map ECAM configuration space which lives in
+     * "reg" or "assigned-addresses" device tree property, but we want to
+     * map other regions of the host bridge. The PCI aperture defined by
+     * the "ranges" device tree property should also be skipped.
+     */
+    list_for_each_entry( bridge, &pci_host_bridges, node )
+    {
+        const struct dt_device_node *dev = bridge->dt_node;
+        unsigned int i;
+
+        for ( i = 0; i < dt_number_of_address(dev); i++ )
+        {
+            uint64_t addr, size;
+            int err;
+
+            err = dt_device_get_address(dev, i, &addr, &size);
+            if ( err )
+            {
+                printk(XENLOG_ERR
+                       "Unable to retrieve address range index=%u for %s\n",
+                       i, dt_node_full_name(dev));
+                return err;
+            }
+
+            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
+            {
+                err = map_range_to_domain(dev, addr, size, &mr_data);
+                if ( err )
+                    return err;
+            }
+        }
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
index 61a9807d3d58..6ad2b31e810d 100644
--- a/xen/arch/arm/pci/pci-host-zynqmp.c
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
         .map_bus                = pci_ecam_map_bus,
         .read                   = pci_generic_config_read,
         .write                  = pci_generic_config_write,
+        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
     }
 };
 
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index a62d8bc60086..d9a3c2a4f3b3 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -17,6 +17,8 @@
 
 #ifdef CONFIG_HAS_PCI
 
+#include <asm/p2m.h>
+
 #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
 
 extern bool_t pci_passthrough_enabled;
@@ -73,6 +75,9 @@ struct pci_ops {
                 uint32_t reg, uint32_t len, uint32_t *value);
     int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
                  uint32_t reg, uint32_t len, uint32_t value);
+    bool (*need_p2m_hwdom_mapping)(struct domain *d,
+                                   struct pci_host_bridge *bridge,
+                                   uint64_t addr);
 };
 
 /*
@@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
                              uint32_t reg, uint32_t len, uint32_t value);
 void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
                                pci_sbdf_t sbdf, uint32_t where);
+bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
+                                     struct pci_host_bridge *bridge,
+                                     uint64_t addr);
 struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
 int pci_get_host_bridge_segment(const struct dt_device_node *node,
                                 uint16_t *segment);
@@ -113,6 +121,8 @@ int pci_host_iterate_bridges(struct domain *d,
                                         struct pci_host_bridge *bridge));
 int pci_host_get_num_bridges(void);
 
+int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
+
 #else   /*!CONFIG_HAS_PCI*/
 
 struct arch_pci_dev { };
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index 95da0b7ab9cd..5b30135fda38 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -2,6 +2,8 @@
 #define __ARM_SETUP_H_
 
 #include <public/version.h>
+#include <asm/p2m.h>
+#include <xen/device_tree.h>
 
 #define MIN_FDT_ALIGN 8
 #define MAX_FDT_SIZE SZ_2M
@@ -77,6 +79,14 @@ struct bootinfo {
 #endif
 };
 
+struct map_range_data
+{
+    struct domain *d;
+    p2m_type_t p2mt;
+    /* Set if mappings for PCI host bridges must not be skipped. */
+    bool skip_mapping;
+};
+
 extern struct bootinfo bootinfo;
 
 extern domid_t max_init_domid;
@@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
 u32 device_tree_get_u32(const void *fdt, int node,
                         const char *prop_name, u32 dflt);
 
+int map_range_to_domain(const struct dt_device_node *dev,
+                        u64 addr, u64 len, void *data);
+
 #endif
 /*
  * Local variables:
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 11/11] xen/arm: Process pending vPCI map/unmap operations
  2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
                   ` (9 preceding siblings ...)
  2021-10-04 14:11 ` [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-10-04 14:11 ` Oleksandr Andrushchenko
  2021-10-06 10:24   ` Rahul Singh
  10 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-04 14:11 UTC (permalink / raw)
  To: xen-devel
  Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
	Artem_Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, bertrand.marquis, rahul.singh,
	Oleksandr Andrushchenko

From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

vPCI may map and unmap PCI device memory (BARs) being passed through which
may take a lot of time. For this those operations may be deferred to be
performed later, so that they can be safely preempted.

Currently this deferred processing is happening in common IOREQ code
which doesn't seem to be the right place for x86 and is even more
doubtful because IOREQ may not be enabled for Arm at all.
So, for Arm the pending vPCI work may have no chance to be executed
if the processing is left as is in the common IOREQ code only.
For that reason make vPCI processing happen in arch specific code.

Please be aware that there are a few outstanding TODOs affecting this
code path, see xen/drivers/vpci/header.c:map_range and
xen/drivers/vpci/header.c:vpci_process_pending.

Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
[x86 changes]
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Since v2:
 - update commit message with more insight on x86, IOREQ and Arm
 - restored order of invocation for IOREQ and vPCI processing (Jan)
Since v1:
 - Moved the check for pending vpci work from the common IOREQ code
   to hvm_do_resume on x86
 - Re-worked the code for Arm to ensure we don't miss pending vPCI work
---
 xen/arch/arm/traps.c   | 13 +++++++++++++
 xen/arch/x86/hvm/hvm.c |  6 ++++++
 xen/common/ioreq.c     |  9 ---------
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 219ab3c3fbde..b246f51086e3 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -34,6 +34,7 @@
 #include <xen/symbols.h>
 #include <xen/version.h>
 #include <xen/virtual_region.h>
+#include <xen/vpci.h>
 
 #include <public/sched.h>
 #include <public/xen.h>
@@ -2304,6 +2305,18 @@ static bool check_for_vcpu_work(void)
     }
 #endif
 
+    if ( has_vpci(v->domain) )
+    {
+        bool pending;
+
+        local_irq_enable();
+        pending = vpci_process_pending(v);
+        local_irq_disable();
+
+        if ( pending )
+            return true;
+    }
+
     if ( likely(!v->arch.need_flush_to_ram) )
         return false;
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index aa418a3ca1b7..c491242e4b8b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -546,6 +546,12 @@ void hvm_do_resume(struct vcpu *v)
 
     pt_restore_timer(v);
 
+    if ( has_vpci(v->domain) && vpci_process_pending(v) )
+    {
+        raise_softirq(SCHEDULE_SOFTIRQ);
+        return;
+    }
+
     if ( !vcpu_ioreq_handle_completion(v) )
         return;
 
diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
index d732dc045df9..689d256544c8 100644
--- a/xen/common/ioreq.c
+++ b/xen/common/ioreq.c
@@ -25,9 +25,7 @@
 #include <xen/lib.h>
 #include <xen/paging.h>
 #include <xen/sched.h>
-#include <xen/softirq.h>
 #include <xen/trace.h>
-#include <xen/vpci.h>
 
 #include <asm/guest_atomics.h>
 #include <asm/ioreq.h>
@@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
 
 bool vcpu_ioreq_handle_completion(struct vcpu *v)
 {
-    struct domain *d = v->domain;
     struct vcpu_io *vio = &v->io;
     struct ioreq_server *s;
     struct ioreq_vcpu *sv;
     enum vio_completion completion;
     bool res = true;
 
-    if ( has_vpci(d) && vpci_process_pending(v) )
-    {
-        raise_softirq(SCHEDULE_SOFTIRQ);
-        return false;
-    }
-
     while ( (sv = get_pending_vcpu(v, &s)) != NULL )
         if ( !wait_for_io(sv, get_ioreq(s, v)) )
             return false;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition
  2021-10-04 14:11 ` [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
@ 2021-10-05  0:57   ` Stefano Stabellini
  0 siblings, 0 replies; 27+ messages in thread
From: Stefano Stabellini @ 2021-10-05  0:57 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, jbeulich,
	andrew.cooper3, george.dunlap, paul, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko, Michal Orzel

On Mon, 4 Oct 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> This macro is not currently used, but still has an error in it:
> a missing parenthesis. Fix this, so the macro is properly defined.
> 
> Fixes: 6c5d3075d97e ("xen/arm: Introduce a generic way to describe device")
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> Reviewed-by: Michal Orzel <michal.orzel@arm.com>

I committed this patch


> ---
> New in v2
> ---
>  xen/include/asm-arm/device.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 5ecd5e7bd15e..ebe84ea853cd 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -27,7 +27,7 @@ typedef struct device device_t;
>  
>  /* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
>  #define dev_is_pci(dev) ((void)(dev), 0)
> -#define dev_is_dt(dev)  ((dev->type == DEV_DT)
> +#define dev_is_dt(dev)  ((dev)->type == DEV_DT)
>  
>  enum device_class
>  {
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-10-04 14:11 ` [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-10-05  1:24   ` Stefano Stabellini
  2021-10-05  4:32     ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 27+ messages in thread
From: Stefano Stabellini @ 2021-10-05  1:24 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
	volodymyr_babchuk, Artem_Mygaiev, roger.pau, jbeulich,
	andrew.cooper3, george.dunlap, paul, bertrand.marquis,
	rahul.singh, Oleksandr Andrushchenko

On Mon, 4 Oct 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> PCI host bridges are special devices in terms of implementing PCI
> passthrough. According to [1] the current implementation depends on
> Domain-0 to perform the initialization of the relevant PCI host
> bridge hardware and perform PCI device enumeration. In order to
> achieve that one of the required changes is to not map all the memory
> ranges in map_range_to_domain as we traverse the device tree on startup
> and perform some additional checks if the range needs to be mapped to
> Domain-0.
> 
> The generic PCI host controller device tree binding says [2]:
> - ranges: As described in IEEE Std 1275-1994, but must provide
>           at least a definition of non-prefetchable memory. One
>           or both of prefetchable Memory and IO Space may also
>           be provided.
> 
> - reg   : The Configuration Space base address and size, as accessed
>           from the parent bus.  The base address corresponds to
>           the first bus in the "bus-range" property.  If no
>           "bus-range" is specified, this will be bus 0 (the default).
> 
> >From the above none of the memory ranges from the "ranges" property
> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
> be handled dynamically by vPCI as we assign PCI devices, e.g. each
> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
> as needed by Xen.
> 
> The "reg" property covers not only ECAM space, but may also have other
> then the configuration memory ranges described, for example [3]:
> - reg: Should contain rc_dbi, config registers location and length.
> - reg-names: Must include the following entries:
>    "rc_dbi": controller configuration registers;
>    "config": PCIe configuration space registers.
> 
> This patch makes it possible to not map all the ranges from the
> "ranges" property and also ECAM from the "reg". All the rest from the
> "reg" property still needs to be mapped to Domain-0, so the PCI
> host bridge remains functional in Domain-0.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
> [2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
> [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> Since v3:
>  - pass struct map_range_data to map_dt_irq_to_domain
>  - remove redundant check from map_range_to_domain
>  - fix handle_device's .skip_mapping
> Since v2:
>  - removed check in map_range_to_domain for PCI_DEV
>    and moved it to handle_device, so the code is
>    simpler
>  - s/map_pci_bridge/skip_mapping
>  - extended comment in pci_host_bridge_mappings
>  - minor code restructure in construct_dom0
>  - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
>    callbacks
>  - unsigned int i; in pci_host_bridge_mappings
> Since v1:
>  - Added better description of why and what needs to be mapped into
>    Domain-0's p2m and what doesn't
>  - Do not do any mappings for PCI devices while traversing the DT
>  - Walk all the bridges and make required mappings in one go
> ---
>  xen/arch/arm/domain_build.c        | 50 ++++++++++++++++--------------
>  xen/arch/arm/pci/ecam.c            | 14 +++++++++
>  xen/arch/arm/pci/pci-host-common.c | 49 +++++++++++++++++++++++++++++
>  xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>  xen/include/asm-arm/pci.h          | 10 ++++++
>  xen/include/asm-arm/setup.h        | 13 ++++++++
>  6 files changed, 114 insertions(+), 23 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 8739e3285ef4..2a3c641476bd 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -10,7 +10,6 @@
>  #include <asm/regs.h>
>  #include <xen/errno.h>
>  #include <xen/err.h>
> -#include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/guest_access.h>
>  #include <xen/iocap.h>
> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
>  }
>  custom_param("dom0_mem", parse_dom0_mem);
>  
> -struct map_range_data
> -{
> -    struct domain *d;
> -    p2m_type_t p2mt;
> -};
> -
>  /* Override macros from asm/page.h to make them work with mfn_t */
>  #undef virt_to_mfn
>  #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> @@ -1641,10 +1634,11 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>                                         const struct dt_irq *dt_irq,
>                                         void *data)
>  {
> -    struct domain *d = data;
> +    struct map_range_data *mr_data = data;
> +    struct domain *d = mr_data->d;
>      unsigned int irq = dt_irq->irq;
>      int res;
> -    bool need_mapping = !dt_device_for_passthrough(dev);
> +    bool need_mapping = !mr_data->skip_mapping;
>  
>      if ( irq < NR_LOCAL_IRQS )
>      {
> @@ -1668,13 +1662,12 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>      return 0;
>  }
>  
> -static int __init map_range_to_domain(const struct dt_device_node *dev,
> -                                      u64 addr, u64 len,
> -                                      void *data)
> +int __init map_range_to_domain(const struct dt_device_node *dev,
> +                               u64 addr, u64 len, void *data)
>  {
>      struct map_range_data *mr_data = data;
>      struct domain *d = mr_data->d;
> -    bool need_mapping = !dt_device_for_passthrough(dev);
> +    bool need_mapping = !mr_data->skip_mapping;
>      int res;
>  
>      /*
> @@ -1726,23 +1719,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>   * then we may need to perform additional mappings in order to make
>   * the child resources available to domain 0.
>   */
> -static int __init map_device_children(struct domain *d,
> -                                      const struct dt_device_node *dev,
> -                                      p2m_type_t p2mt)
> +static int __init map_device_children(const struct dt_device_node *dev,
> +                                      struct map_range_data *mr_data)
>  {
> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> -    int ret;
> -
>      if ( dt_device_type_is_equal(dev, "pci") )
>      {
> +        int ret;
> +
>          dt_dprintk("Mapping children of %s to guest\n",
>                     dt_node_full_name(dev));
>  
> -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
>          if ( ret < 0 )
>              return ret;
>  
> -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
>          if ( ret < 0 )
>              return ret;
>      }
> @@ -1823,6 +1814,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>      int res;
>      u64 addr, size;
>      bool need_mapping = !dt_device_for_passthrough(dev);
> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2mt,
> +        .skip_mapping = !need_mapping ||
> +                        (is_pci_passthrough_enabled() &&
> +                         (device_get_class(dev) == DEVICE_PCI))
> +    };

I would prefer if we did this:

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 2a3c641476..1e7cdd11c7 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1813,13 +1813,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     unsigned int i;
     int res;
     u64 addr, size;
-    bool need_mapping = !dt_device_for_passthrough(dev);
+    bool need_mapping = !dt_device_for_passthrough(dev) &&
+                        !(is_pci_passthrough_enabled() &&
+                          (device_get_class(dev) == DEVICE_PCI))
     struct map_range_data mr_data = {
         .d = d,
         .p2mt = p2mt,
-        .skip_mapping = !need_mapping ||
-                        (is_pci_passthrough_enabled() &&
-                         (device_get_class(dev) == DEVICE_PCI))
+        .skip_mapping = !need_mapping
     };
 
     naddr = dt_number_of_address(dev);


This actually makes a difference because otherwise
handle_device_interrupts could still be called with need_mapping ==
true for PCI devices.

What do you think?  One more comment below.


>      naddr = dt_number_of_address(dev);
>  
> @@ -1862,7 +1860,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>      /* Give permission and map MMIOs */
>      for ( i = 0; i < naddr; i++ )
>      {
> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>          res = dt_device_get_address(dev, i, &addr, &size);
>          if ( res )
>          {
> @@ -1876,7 +1873,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>              return res;
>      }
>  
> -    res = map_device_children(d, dev, p2mt);
> +    res = map_device_children(dev, &mr_data);
>      if ( res )
>          return res;
>  
> @@ -3034,7 +3031,14 @@ static int __init construct_dom0(struct domain *d)
>          return rc;
>  
>      if ( acpi_disabled )
> +    {
>          rc = prepare_dtb_hwdom(d, &kinfo);
> +        if ( rc < 0 )
> +            return rc;
> +#ifdef CONFIG_HAS_PCI
> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
> +#endif
> +    }
>      else
>          rc = prepare_acpi(d, &kinfo);
>  
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 602d00799c8d..b81620074a91 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>      return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
>  }
>  
> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> +                                     struct pci_host_bridge *bridge,
> +                                     uint64_t addr)
> +{
> +    struct pci_config_window *cfg = bridge->cfg;
> +
> +    /*
> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
> +     * so we can trap access to it.
> +     */
> +    return cfg->phys_addr != addr;
> +}
> +
>  /* ECAM ops */
>  const struct pci_ecam_ops pci_generic_ecam_ops = {
>      .bus_shift  = 20,
> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>          .map_bus                = pci_ecam_map_bus,
>          .read                   = pci_generic_config_read,
>          .write                  = pci_generic_config_write,
> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>      }
>  };
>  
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 1eb4daa87365..085f08e23e0c 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -18,6 +18,7 @@
>  
>  #include <xen/init.h>
>  #include <xen/pci.h>
> +#include <asm/setup.h>
>  #include <xen/rwlock.h>
>  #include <xen/sched.h>
>  #include <xen/vmap.h>
> @@ -320,6 +321,54 @@ int pci_host_get_num_bridges(void)
>      return count;
>  }
>  
> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
> +{
> +    struct pci_host_bridge *bridge;
> +    struct map_range_data mr_data = {
> +        .d = d,
> +        .p2mt = p2mt,
> +        .skip_mapping = false
> +    };
> +
> +    /*
> +     * For each PCI host bridge we need to only map those ranges
> +     * which are used by Domain-0 to properly initialize the bridge,
> +     * e.g. we do not want to map ECAM configuration space which lives in
> +     * "reg" or "assigned-addresses" device tree property, but we want to
> +     * map other regions of the host bridge. The PCI aperture defined by
> +     * the "ranges" device tree property should also be skipped.
> +     */
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        const struct dt_device_node *dev = bridge->dt_node;
> +        unsigned int i;
> +
> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
> +        {
> +            uint64_t addr, size;
> +            int err;
> +
> +            err = dt_device_get_address(dev, i, &addr, &size);
> +            if ( err )
> +            {
> +                printk(XENLOG_ERR
> +                       "Unable to retrieve address range index=%u for %s\n",
> +                       i, dt_node_full_name(dev));
> +                return err;
> +            }
> +
> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
> +            {
> +                err = map_range_to_domain(dev, addr, size, &mr_data);
> +                if ( err )
> +                    return err;
> +            }
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
> index 61a9807d3d58..6ad2b31e810d 100644
> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>          .map_bus                = pci_ecam_map_bus,
>          .read                   = pci_generic_config_read,
>          .write                  = pci_generic_config_write,
> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>      }
>  };
>  
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index a62d8bc60086..d9a3c2a4f3b3 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -17,6 +17,8 @@
>  
>  #ifdef CONFIG_HAS_PCI
>  
> +#include <asm/p2m.h>
> +
>  #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>  
>  extern bool_t pci_passthrough_enabled;
> @@ -73,6 +75,9 @@ struct pci_ops {
>                  uint32_t reg, uint32_t len, uint32_t *value);
>      int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>                   uint32_t reg, uint32_t len, uint32_t value);
> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
> +                                   struct pci_host_bridge *bridge,
> +                                   uint64_t addr);
>  };
>  
>  /*
> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>                               uint32_t reg, uint32_t len, uint32_t value);
>  void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>                                 pci_sbdf_t sbdf, uint32_t where);
> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> +                                     struct pci_host_bridge *bridge,
> +                                     uint64_t addr);
>  struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>  int pci_get_host_bridge_segment(const struct dt_device_node *node,
>                                  uint16_t *segment);
> @@ -113,6 +121,8 @@ int pci_host_iterate_bridges(struct domain *d,
>                                          struct pci_host_bridge *bridge));
>  int pci_host_get_num_bridges(void);
>  
> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
> +
>  #else   /*!CONFIG_HAS_PCI*/
>  
>  struct arch_pci_dev { };
> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> index 95da0b7ab9cd..5b30135fda38 100644
> --- a/xen/include/asm-arm/setup.h
> +++ b/xen/include/asm-arm/setup.h
> @@ -2,6 +2,8 @@
>  #define __ARM_SETUP_H_
>  
>  #include <public/version.h>
> +#include <asm/p2m.h>
> +#include <xen/device_tree.h>
>  
>  #define MIN_FDT_ALIGN 8
>  #define MAX_FDT_SIZE SZ_2M
> @@ -77,6 +79,14 @@ struct bootinfo {
>  #endif
>  };
>  
> +struct map_range_data
> +{
> +    struct domain *d;
> +    p2m_type_t p2mt;
> +    /* Set if mappings for PCI host bridges must not be skipped. */

This comment still needs updating.


> +    bool skip_mapping;
> +};
> +
>  extern struct bootinfo bootinfo;
>  
>  extern domid_t max_init_domid;
> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>  u32 device_tree_get_u32(const void *fdt, int node,
>                          const char *prop_name, u32 dflt);
>  
> +int map_range_to_domain(const struct dt_device_node *dev,
> +                        u64 addr, u64 len, void *data);
> +
>  #endif
>  /*
>   * Local variables:
> -- 
> 2.25.1
> 


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-10-05  1:24   ` Stefano Stabellini
@ 2021-10-05  4:32     ` Oleksandr Andrushchenko
  2021-10-05 21:43       ` Stefano Stabellini
  0 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-05  4:32 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh,
	Oleksandr Andrushchenko



On 05.10.21 04:24, Stefano Stabellini wrote:
> On Mon, 4 Oct 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> PCI host bridges are special devices in terms of implementing PCI
>> passthrough. According to [1] the current implementation depends on
>> Domain-0 to perform the initialization of the relevant PCI host
>> bridge hardware and perform PCI device enumeration. In order to
>> achieve that one of the required changes is to not map all the memory
>> ranges in map_range_to_domain as we traverse the device tree on startup
>> and perform some additional checks if the range needs to be mapped to
>> Domain-0.
>>
>> The generic PCI host controller device tree binding says [2]:
>> - ranges: As described in IEEE Std 1275-1994, but must provide
>>            at least a definition of non-prefetchable memory. One
>>            or both of prefetchable Memory and IO Space may also
>>            be provided.
>>
>> - reg   : The Configuration Space base address and size, as accessed
>>            from the parent bus.  The base address corresponds to
>>            the first bus in the "bus-range" property.  If no
>>            "bus-range" is specified, this will be bus 0 (the default).
>>
>> >From the above none of the memory ranges from the "ranges" property
>> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
>> be handled dynamically by vPCI as we assign PCI devices, e.g. each
>> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
>> as needed by Xen.
>>
>> The "reg" property covers not only ECAM space, but may also have other
>> then the configuration memory ranges described, for example [3]:
>> - reg: Should contain rc_dbi, config registers location and length.
>> - reg-names: Must include the following entries:
>>     "rc_dbi": controller configuration registers;
>>     "config": PCIe configuration space registers.
>>
>> This patch makes it possible to not map all the ranges from the
>> "ranges" property and also ECAM from the "reg". All the rest from the
>> "reg" property still needs to be mapped to Domain-0, so the PCI
>> host bridge remains functional in Domain-0.
>>
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
>> [2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
>> [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> ---
>> Since v3:
>>   - pass struct map_range_data to map_dt_irq_to_domain
>>   - remove redundant check from map_range_to_domain
>>   - fix handle_device's .skip_mapping
>> Since v2:
>>   - removed check in map_range_to_domain for PCI_DEV
>>     and moved it to handle_device, so the code is
>>     simpler
>>   - s/map_pci_bridge/skip_mapping
>>   - extended comment in pci_host_bridge_mappings
>>   - minor code restructure in construct_dom0
>>   - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
>>     callbacks
>>   - unsigned int i; in pci_host_bridge_mappings
>> Since v1:
>>   - Added better description of why and what needs to be mapped into
>>     Domain-0's p2m and what doesn't
>>   - Do not do any mappings for PCI devices while traversing the DT
>>   - Walk all the bridges and make required mappings in one go
>> ---
>>   xen/arch/arm/domain_build.c        | 50 ++++++++++++++++--------------
>>   xen/arch/arm/pci/ecam.c            | 14 +++++++++
>>   xen/arch/arm/pci/pci-host-common.c | 49 +++++++++++++++++++++++++++++
>>   xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>>   xen/include/asm-arm/pci.h          | 10 ++++++
>>   xen/include/asm-arm/setup.h        | 13 ++++++++
>>   6 files changed, 114 insertions(+), 23 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index 8739e3285ef4..2a3c641476bd 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -10,7 +10,6 @@
>>   #include <asm/regs.h>
>>   #include <xen/errno.h>
>>   #include <xen/err.h>
>> -#include <xen/device_tree.h>
>>   #include <xen/libfdt/libfdt.h>
>>   #include <xen/guest_access.h>
>>   #include <xen/iocap.h>
>> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
>>   }
>>   custom_param("dom0_mem", parse_dom0_mem);
>>   
>> -struct map_range_data
>> -{
>> -    struct domain *d;
>> -    p2m_type_t p2mt;
>> -};
>> -
>>   /* Override macros from asm/page.h to make them work with mfn_t */
>>   #undef virt_to_mfn
>>   #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
>> @@ -1641,10 +1634,11 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>                                          const struct dt_irq *dt_irq,
>>                                          void *data)
>>   {
>> -    struct domain *d = data;
>> +    struct map_range_data *mr_data = data;
>> +    struct domain *d = mr_data->d;
>>       unsigned int irq = dt_irq->irq;
>>       int res;
>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>> +    bool need_mapping = !mr_data->skip_mapping;
>>   
>>       if ( irq < NR_LOCAL_IRQS )
>>       {
>> @@ -1668,13 +1662,12 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>       return 0;
>>   }
>>   
>> -static int __init map_range_to_domain(const struct dt_device_node *dev,
>> -                                      u64 addr, u64 len,
>> -                                      void *data)
>> +int __init map_range_to_domain(const struct dt_device_node *dev,
>> +                               u64 addr, u64 len, void *data)
>>   {
>>       struct map_range_data *mr_data = data;
>>       struct domain *d = mr_data->d;
>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>> +    bool need_mapping = !mr_data->skip_mapping;
>>       int res;
>>   
>>       /*
>> @@ -1726,23 +1719,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>    * then we may need to perform additional mappings in order to make
>>    * the child resources available to domain 0.
>>    */
>> -static int __init map_device_children(struct domain *d,
>> -                                      const struct dt_device_node *dev,
>> -                                      p2m_type_t p2mt)
>> +static int __init map_device_children(const struct dt_device_node *dev,
>> +                                      struct map_range_data *mr_data)
>>   {
>> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>> -    int ret;
>> -
>>       if ( dt_device_type_is_equal(dev, "pci") )
>>       {
>> +        int ret;
>> +
>>           dt_dprintk("Mapping children of %s to guest\n",
>>                      dt_node_full_name(dev));
>>   
>> -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
>> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
>>           if ( ret < 0 )
>>               return ret;
>>   
>> -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
>> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
>>           if ( ret < 0 )
>>               return ret;
>>       }
>> @@ -1823,6 +1814,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>       int res;
>>       u64 addr, size;
>>       bool need_mapping = !dt_device_for_passthrough(dev);
>> +    struct map_range_data mr_data = {
>> +        .d = d,
>> +        .p2mt = p2mt,
>> +        .skip_mapping = !need_mapping ||
>> +                        (is_pci_passthrough_enabled() &&
>> +                         (device_get_class(dev) == DEVICE_PCI))
>> +    };
> I would prefer if we did this:
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 2a3c641476..1e7cdd11c7 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1813,13 +1813,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>       unsigned int i;
>       int res;
>       u64 addr, size;
> -    bool need_mapping = !dt_device_for_passthrough(dev);
> +    bool need_mapping = !dt_device_for_passthrough(dev) &&
> +                        !(is_pci_passthrough_enabled() &&
> +                          (device_get_class(dev) == DEVICE_PCI))
>       struct map_range_data mr_data = {
>           .d = d,
>           .p2mt = p2mt,
> -        .skip_mapping = !need_mapping ||
> -                        (is_pci_passthrough_enabled() &&
> -                         (device_get_class(dev) == DEVICE_PCI))
> +        .skip_mapping = !need_mapping
>       };
>   
>       naddr = dt_number_of_address(dev);
>
>
> This actually makes a difference because otherwise
> handle_device_interrupts could still be called with need_mapping ==
> true for PCI devices.
And this is totally fine. Remember the name of the patch: we do not
want to map memory, but interrupts are still needed for the bridge to
function in Dom0. The same as it needs all, but "cfg" from "regs".
So, need_mapping == true for interrupts is what we want: legacy INTx
are not supported by design and only MSI/MSI-X are supported.
>
> What do you think?  One more comment below.
>
>
>>       naddr = dt_number_of_address(dev);
>>   
>> @@ -1862,7 +1860,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>       /* Give permission and map MMIOs */
>>       for ( i = 0; i < naddr; i++ )
>>       {
>> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>>           res = dt_device_get_address(dev, i, &addr, &size);
>>           if ( res )
>>           {
>> @@ -1876,7 +1873,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>               return res;
>>       }
>>   
>> -    res = map_device_children(d, dev, p2mt);
>> +    res = map_device_children(dev, &mr_data);
>>       if ( res )
>>           return res;
>>   
>> @@ -3034,7 +3031,14 @@ static int __init construct_dom0(struct domain *d)
>>           return rc;
>>   
>>       if ( acpi_disabled )
>> +    {
>>           rc = prepare_dtb_hwdom(d, &kinfo);
>> +        if ( rc < 0 )
>> +            return rc;
>> +#ifdef CONFIG_HAS_PCI
>> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
>> +#endif
>> +    }
>>       else
>>           rc = prepare_acpi(d, &kinfo);
>>   
>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>> index 602d00799c8d..b81620074a91 100644
>> --- a/xen/arch/arm/pci/ecam.c
>> +++ b/xen/arch/arm/pci/ecam.c
>> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>       return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
>>   }
>>   
>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
>> +                                     struct pci_host_bridge *bridge,
>> +                                     uint64_t addr)
>> +{
>> +    struct pci_config_window *cfg = bridge->cfg;
>> +
>> +    /*
>> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
>> +     * so we can trap access to it.
>> +     */
>> +    return cfg->phys_addr != addr;
>> +}
>> +
>>   /* ECAM ops */
>>   const struct pci_ecam_ops pci_generic_ecam_ops = {
>>       .bus_shift  = 20,
>> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>           .map_bus                = pci_ecam_map_bus,
>>           .read                   = pci_generic_config_read,
>>           .write                  = pci_generic_config_write,
>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>>       }
>>   };
>>   
>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index 1eb4daa87365..085f08e23e0c 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -18,6 +18,7 @@
>>   
>>   #include <xen/init.h>
>>   #include <xen/pci.h>
>> +#include <asm/setup.h>
>>   #include <xen/rwlock.h>
>>   #include <xen/sched.h>
>>   #include <xen/vmap.h>
>> @@ -320,6 +321,54 @@ int pci_host_get_num_bridges(void)
>>       return count;
>>   }
>>   
>> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
>> +{
>> +    struct pci_host_bridge *bridge;
>> +    struct map_range_data mr_data = {
>> +        .d = d,
>> +        .p2mt = p2mt,
>> +        .skip_mapping = false
>> +    };
>> +
>> +    /*
>> +     * For each PCI host bridge we need to only map those ranges
>> +     * which are used by Domain-0 to properly initialize the bridge,
>> +     * e.g. we do not want to map ECAM configuration space which lives in
>> +     * "reg" or "assigned-addresses" device tree property, but we want to
>> +     * map other regions of the host bridge. The PCI aperture defined by
>> +     * the "ranges" device tree property should also be skipped.
>> +     */
>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>> +    {
>> +        const struct dt_device_node *dev = bridge->dt_node;
>> +        unsigned int i;
>> +
>> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
>> +        {
>> +            uint64_t addr, size;
>> +            int err;
>> +
>> +            err = dt_device_get_address(dev, i, &addr, &size);
>> +            if ( err )
>> +            {
>> +                printk(XENLOG_ERR
>> +                       "Unable to retrieve address range index=%u for %s\n",
>> +                       i, dt_node_full_name(dev));
>> +                return err;
>> +            }
>> +
>> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
>> +            {
>> +                err = map_range_to_domain(dev, addr, size, &mr_data);
>> +                if ( err )
>> +                    return err;
>> +            }
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
>> index 61a9807d3d58..6ad2b31e810d 100644
>> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
>> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
>> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>>           .map_bus                = pci_ecam_map_bus,
>>           .read                   = pci_generic_config_read,
>>           .write                  = pci_generic_config_write,
>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>>       }
>>   };
>>   
>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>> index a62d8bc60086..d9a3c2a4f3b3 100644
>> --- a/xen/include/asm-arm/pci.h
>> +++ b/xen/include/asm-arm/pci.h
>> @@ -17,6 +17,8 @@
>>   
>>   #ifdef CONFIG_HAS_PCI
>>   
>> +#include <asm/p2m.h>
>> +
>>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>>   
>>   extern bool_t pci_passthrough_enabled;
>> @@ -73,6 +75,9 @@ struct pci_ops {
>>                   uint32_t reg, uint32_t len, uint32_t *value);
>>       int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>>                    uint32_t reg, uint32_t len, uint32_t value);
>> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
>> +                                   struct pci_host_bridge *bridge,
>> +                                   uint64_t addr);
>>   };
>>   
>>   /*
>> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>>                                uint32_t reg, uint32_t len, uint32_t value);
>>   void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>                                  pci_sbdf_t sbdf, uint32_t where);
>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
>> +                                     struct pci_host_bridge *bridge,
>> +                                     uint64_t addr);
>>   struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>>   int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>                                   uint16_t *segment);
>> @@ -113,6 +121,8 @@ int pci_host_iterate_bridges(struct domain *d,
>>                                           struct pci_host_bridge *bridge));
>>   int pci_host_get_num_bridges(void);
>>   
>> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
>> +
>>   #else   /*!CONFIG_HAS_PCI*/
>>   
>>   struct arch_pci_dev { };
>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>> index 95da0b7ab9cd..5b30135fda38 100644
>> --- a/xen/include/asm-arm/setup.h
>> +++ b/xen/include/asm-arm/setup.h
>> @@ -2,6 +2,8 @@
>>   #define __ARM_SETUP_H_
>>   
>>   #include <public/version.h>
>> +#include <asm/p2m.h>
>> +#include <xen/device_tree.h>
>>   
>>   #define MIN_FDT_ALIGN 8
>>   #define MAX_FDT_SIZE SZ_2M
>> @@ -77,6 +79,14 @@ struct bootinfo {
>>   #endif
>>   };
>>   
>> +struct map_range_data
>> +{
>> +    struct domain *d;
>> +    p2m_type_t p2mt;
>> +    /* Set if mappings for PCI host bridges must not be skipped. */
> This comment still needs updating.
Sorry, I missed this one
>
>
>> +    bool skip_mapping;
>> +};
>> +
>>   extern struct bootinfo bootinfo;
>>   
>>   extern domid_t max_init_domid;
>> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>>   u32 device_tree_get_u32(const void *fdt, int node,
>>                           const char *prop_name, u32 dflt);
>>   
>> +int map_range_to_domain(const struct dt_device_node *dev,
>> +                        u64 addr, u64 len, void *data);
>> +
>>   #endif
>>   /*
>>    * Local variables:
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-10-05  4:32     ` Oleksandr Andrushchenko
@ 2021-10-05 21:43       ` Stefano Stabellini
  2021-10-06  4:55         ` Oleksandr Andrushchenko
  0 siblings, 1 reply; 27+ messages in thread
From: Stefano Stabellini @ 2021-10-05 21:43 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: Stefano Stabellini, xen-devel, julien, Oleksandr Tyshchenko,
	Volodymyr Babchuk, Artem Mygaiev, roger.pau, jbeulich,
	andrew.cooper3, george.dunlap, paul, Bertrand Marquis,
	Rahul Singh

On Tue, 5 Oct 2021, Oleksandr Andrushchenko wrote:
> On 05.10.21 04:24, Stefano Stabellini wrote:
> > On Mon, 4 Oct 2021, Oleksandr Andrushchenko wrote:
> >> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>
> >> PCI host bridges are special devices in terms of implementing PCI
> >> passthrough. According to [1] the current implementation depends on
> >> Domain-0 to perform the initialization of the relevant PCI host
> >> bridge hardware and perform PCI device enumeration. In order to
> >> achieve that one of the required changes is to not map all the memory
> >> ranges in map_range_to_domain as we traverse the device tree on startup
> >> and perform some additional checks if the range needs to be mapped to
> >> Domain-0.
> >>
> >> The generic PCI host controller device tree binding says [2]:
> >> - ranges: As described in IEEE Std 1275-1994, but must provide
> >>            at least a definition of non-prefetchable memory. One
> >>            or both of prefetchable Memory and IO Space may also
> >>            be provided.
> >>
> >> - reg   : The Configuration Space base address and size, as accessed
> >>            from the parent bus.  The base address corresponds to
> >>            the first bus in the "bus-range" property.  If no
> >>            "bus-range" is specified, this will be bus 0 (the default).
> >>
> >> >From the above none of the memory ranges from the "ranges" property
> >> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
> >> be handled dynamically by vPCI as we assign PCI devices, e.g. each
> >> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
> >> as needed by Xen.
> >>
> >> The "reg" property covers not only ECAM space, but may also have other
> >> then the configuration memory ranges described, for example [3]:
> >> - reg: Should contain rc_dbi, config registers location and length.
> >> - reg-names: Must include the following entries:
> >>     "rc_dbi": controller configuration registers;
> >>     "config": PCIe configuration space registers.
> >>
> >> This patch makes it possible to not map all the ranges from the
> >> "ranges" property and also ECAM from the "reg". All the rest from the
> >> "reg" property still needs to be mapped to Domain-0, so the PCI
> >> host bridge remains functional in Domain-0.
> >>
> >> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
> >> [2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
> >> [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> >>
> >> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >> ---
> >> Since v3:
> >>   - pass struct map_range_data to map_dt_irq_to_domain
> >>   - remove redundant check from map_range_to_domain
> >>   - fix handle_device's .skip_mapping
> >> Since v2:
> >>   - removed check in map_range_to_domain for PCI_DEV
> >>     and moved it to handle_device, so the code is
> >>     simpler
> >>   - s/map_pci_bridge/skip_mapping
> >>   - extended comment in pci_host_bridge_mappings
> >>   - minor code restructure in construct_dom0
> >>   - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
> >>     callbacks
> >>   - unsigned int i; in pci_host_bridge_mappings
> >> Since v1:
> >>   - Added better description of why and what needs to be mapped into
> >>     Domain-0's p2m and what doesn't
> >>   - Do not do any mappings for PCI devices while traversing the DT
> >>   - Walk all the bridges and make required mappings in one go
> >> ---
> >>   xen/arch/arm/domain_build.c        | 50 ++++++++++++++++--------------
> >>   xen/arch/arm/pci/ecam.c            | 14 +++++++++
> >>   xen/arch/arm/pci/pci-host-common.c | 49 +++++++++++++++++++++++++++++
> >>   xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
> >>   xen/include/asm-arm/pci.h          | 10 ++++++
> >>   xen/include/asm-arm/setup.h        | 13 ++++++++
> >>   6 files changed, 114 insertions(+), 23 deletions(-)
> >>
> >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >> index 8739e3285ef4..2a3c641476bd 100644
> >> --- a/xen/arch/arm/domain_build.c
> >> +++ b/xen/arch/arm/domain_build.c
> >> @@ -10,7 +10,6 @@
> >>   #include <asm/regs.h>
> >>   #include <xen/errno.h>
> >>   #include <xen/err.h>
> >> -#include <xen/device_tree.h>
> >>   #include <xen/libfdt/libfdt.h>
> >>   #include <xen/guest_access.h>
> >>   #include <xen/iocap.h>
> >> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
> >>   }
> >>   custom_param("dom0_mem", parse_dom0_mem);
> >>   
> >> -struct map_range_data
> >> -{
> >> -    struct domain *d;
> >> -    p2m_type_t p2mt;
> >> -};
> >> -
> >>   /* Override macros from asm/page.h to make them work with mfn_t */
> >>   #undef virt_to_mfn
> >>   #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> >> @@ -1641,10 +1634,11 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
> >>                                          const struct dt_irq *dt_irq,
> >>                                          void *data)
> >>   {
> >> -    struct domain *d = data;
> >> +    struct map_range_data *mr_data = data;
> >> +    struct domain *d = mr_data->d;
> >>       unsigned int irq = dt_irq->irq;
> >>       int res;
> >> -    bool need_mapping = !dt_device_for_passthrough(dev);
> >> +    bool need_mapping = !mr_data->skip_mapping;
> >>   
> >>       if ( irq < NR_LOCAL_IRQS )
> >>       {
> >> @@ -1668,13 +1662,12 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
> >>       return 0;
> >>   }
> >>   
> >> -static int __init map_range_to_domain(const struct dt_device_node *dev,
> >> -                                      u64 addr, u64 len,
> >> -                                      void *data)
> >> +int __init map_range_to_domain(const struct dt_device_node *dev,
> >> +                               u64 addr, u64 len, void *data)
> >>   {
> >>       struct map_range_data *mr_data = data;
> >>       struct domain *d = mr_data->d;
> >> -    bool need_mapping = !dt_device_for_passthrough(dev);
> >> +    bool need_mapping = !mr_data->skip_mapping;
> >>       int res;
> >>   
> >>       /*
> >> @@ -1726,23 +1719,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
> >>    * then we may need to perform additional mappings in order to make
> >>    * the child resources available to domain 0.
> >>    */
> >> -static int __init map_device_children(struct domain *d,
> >> -                                      const struct dt_device_node *dev,
> >> -                                      p2m_type_t p2mt)
> >> +static int __init map_device_children(const struct dt_device_node *dev,
> >> +                                      struct map_range_data *mr_data)
> >>   {
> >> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >> -    int ret;
> >> -
> >>       if ( dt_device_type_is_equal(dev, "pci") )
> >>       {
> >> +        int ret;
> >> +
> >>           dt_dprintk("Mapping children of %s to guest\n",
> >>                      dt_node_full_name(dev));
> >>   
> >> -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
> >> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
> >>           if ( ret < 0 )
> >>               return ret;
> >>   
> >> -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
> >> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
> >>           if ( ret < 0 )
> >>               return ret;
> >>       }
> >> @@ -1823,6 +1814,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>       int res;
> >>       u64 addr, size;
> >>       bool need_mapping = !dt_device_for_passthrough(dev);
> >> +    struct map_range_data mr_data = {
> >> +        .d = d,
> >> +        .p2mt = p2mt,
> >> +        .skip_mapping = !need_mapping ||
> >> +                        (is_pci_passthrough_enabled() &&
> >> +                         (device_get_class(dev) == DEVICE_PCI))
> >> +    };
> > I would prefer if we did this:
> >
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 2a3c641476..1e7cdd11c7 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -1813,13 +1813,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >       unsigned int i;
> >       int res;
> >       u64 addr, size;
> > -    bool need_mapping = !dt_device_for_passthrough(dev);
> > +    bool need_mapping = !dt_device_for_passthrough(dev) &&
> > +                        !(is_pci_passthrough_enabled() &&
> > +                          (device_get_class(dev) == DEVICE_PCI))
> >       struct map_range_data mr_data = {
> >           .d = d,
> >           .p2mt = p2mt,
> > -        .skip_mapping = !need_mapping ||
> > -                        (is_pci_passthrough_enabled() &&
> > -                         (device_get_class(dev) == DEVICE_PCI))
> > +        .skip_mapping = !need_mapping
> >       };
> >   
> >       naddr = dt_number_of_address(dev);
> >
> >
> > This actually makes a difference because otherwise
> > handle_device_interrupts could still be called with need_mapping ==
> > true for PCI devices.
> And this is totally fine. Remember the name of the patch: we do not
> want to map memory, but interrupts are still needed for the bridge to
> function in Dom0. The same as it needs all, but "cfg" from "regs".
> So, need_mapping == true for interrupts is what we want: legacy INTx
> are not supported by design and only MSI/MSI-X are supported.

Ah, makes sense, I momentarily lost sight of that. Could you please add
an in-code comment to explain it, something like:

/*
 * For PCI passthrough we need to remap only interrupts to Dom0. MMIO
 * regions of the PCIe Root Complex and PCIe aperture should not be
 * mapped automatically to Dom0.
 */
struct map_range_data mr_data = {
        .d = d,
        .p2mt = p2mt,
        .skip_mapping = !need_mapping ||
                        (is_pci_passthrough_enabled() &&
                         (device_get_class(dev) == DEVICE_PCI))
    };


> >
> > What do you think?  One more comment below.
> >
> >
> >>       naddr = dt_number_of_address(dev);
> >>   
> >> @@ -1862,7 +1860,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>       /* Give permission and map MMIOs */
> >>       for ( i = 0; i < naddr; i++ )
> >>       {
> >> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >>           res = dt_device_get_address(dev, i, &addr, &size);
> >>           if ( res )
> >>           {
> >> @@ -1876,7 +1873,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>               return res;
> >>       }
> >>   
> >> -    res = map_device_children(d, dev, p2mt);
> >> +    res = map_device_children(dev, &mr_data);
> >>       if ( res )
> >>           return res;
> >>   
> >> @@ -3034,7 +3031,14 @@ static int __init construct_dom0(struct domain *d)
> >>           return rc;
> >>   
> >>       if ( acpi_disabled )
> >> +    {
> >>           rc = prepare_dtb_hwdom(d, &kinfo);
> >> +        if ( rc < 0 )
> >> +            return rc;
> >> +#ifdef CONFIG_HAS_PCI
> >> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
> >> +#endif
> >> +    }
> >>       else
> >>           rc = prepare_acpi(d, &kinfo);
> >>   
> >> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> >> index 602d00799c8d..b81620074a91 100644
> >> --- a/xen/arch/arm/pci/ecam.c
> >> +++ b/xen/arch/arm/pci/ecam.c
> >> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> >>       return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
> >>   }
> >>   
> >> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> >> +                                     struct pci_host_bridge *bridge,
> >> +                                     uint64_t addr)
> >> +{
> >> +    struct pci_config_window *cfg = bridge->cfg;
> >> +
> >> +    /*
> >> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
> >> +     * so we can trap access to it.
> >> +     */
> >> +    return cfg->phys_addr != addr;
> >> +}
> >> +
> >>   /* ECAM ops */
> >>   const struct pci_ecam_ops pci_generic_ecam_ops = {
> >>       .bus_shift  = 20,
> >> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
> >>           .map_bus                = pci_ecam_map_bus,
> >>           .read                   = pci_generic_config_read,
> >>           .write                  = pci_generic_config_write,
> >> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
> >>       }
> >>   };
> >>   
> >> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> >> index 1eb4daa87365..085f08e23e0c 100644
> >> --- a/xen/arch/arm/pci/pci-host-common.c
> >> +++ b/xen/arch/arm/pci/pci-host-common.c
> >> @@ -18,6 +18,7 @@
> >>   
> >>   #include <xen/init.h>
> >>   #include <xen/pci.h>
> >> +#include <asm/setup.h>
> >>   #include <xen/rwlock.h>
> >>   #include <xen/sched.h>
> >>   #include <xen/vmap.h>
> >> @@ -320,6 +321,54 @@ int pci_host_get_num_bridges(void)
> >>       return count;
> >>   }
> >>   
> >> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
> >> +{
> >> +    struct pci_host_bridge *bridge;
> >> +    struct map_range_data mr_data = {
> >> +        .d = d,
> >> +        .p2mt = p2mt,
> >> +        .skip_mapping = false
> >> +    };
> >> +
> >> +    /*
> >> +     * For each PCI host bridge we need to only map those ranges
> >> +     * which are used by Domain-0 to properly initialize the bridge,
> >> +     * e.g. we do not want to map ECAM configuration space which lives in
> >> +     * "reg" or "assigned-addresses" device tree property, but we want to
> >> +     * map other regions of the host bridge. The PCI aperture defined by
> >> +     * the "ranges" device tree property should also be skipped.
> >> +     */
> >> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> >> +    {
> >> +        const struct dt_device_node *dev = bridge->dt_node;
> >> +        unsigned int i;
> >> +
> >> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
> >> +        {
> >> +            uint64_t addr, size;
> >> +            int err;
> >> +
> >> +            err = dt_device_get_address(dev, i, &addr, &size);
> >> +            if ( err )
> >> +            {
> >> +                printk(XENLOG_ERR
> >> +                       "Unable to retrieve address range index=%u for %s\n",
> >> +                       i, dt_node_full_name(dev));
> >> +                return err;
> >> +            }
> >> +
> >> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
> >> +            {
> >> +                err = map_range_to_domain(dev, addr, size, &mr_data);
> >> +                if ( err )
> >> +                    return err;
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    return 0;
> >> +}
> >> +
> >>   /*
> >>    * Local variables:
> >>    * mode: C
> >> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
> >> index 61a9807d3d58..6ad2b31e810d 100644
> >> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
> >> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
> >> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
> >>           .map_bus                = pci_ecam_map_bus,
> >>           .read                   = pci_generic_config_read,
> >>           .write                  = pci_generic_config_write,
> >> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
> >>       }
> >>   };
> >>   
> >> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> >> index a62d8bc60086..d9a3c2a4f3b3 100644
> >> --- a/xen/include/asm-arm/pci.h
> >> +++ b/xen/include/asm-arm/pci.h
> >> @@ -17,6 +17,8 @@
> >>   
> >>   #ifdef CONFIG_HAS_PCI
> >>   
> >> +#include <asm/p2m.h>
> >> +
> >>   #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
> >>   
> >>   extern bool_t pci_passthrough_enabled;
> >> @@ -73,6 +75,9 @@ struct pci_ops {
> >>                   uint32_t reg, uint32_t len, uint32_t *value);
> >>       int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
> >>                    uint32_t reg, uint32_t len, uint32_t value);
> >> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
> >> +                                   struct pci_host_bridge *bridge,
> >> +                                   uint64_t addr);
> >>   };
> >>   
> >>   /*
> >> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
> >>                                uint32_t reg, uint32_t len, uint32_t value);
> >>   void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> >>                                  pci_sbdf_t sbdf, uint32_t where);
> >> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> >> +                                     struct pci_host_bridge *bridge,
> >> +                                     uint64_t addr);
> >>   struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
> >>   int pci_get_host_bridge_segment(const struct dt_device_node *node,
> >>                                   uint16_t *segment);
> >> @@ -113,6 +121,8 @@ int pci_host_iterate_bridges(struct domain *d,
> >>                                           struct pci_host_bridge *bridge));
> >>   int pci_host_get_num_bridges(void);
> >>   
> >> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
> >> +
> >>   #else   /*!CONFIG_HAS_PCI*/
> >>   
> >>   struct arch_pci_dev { };
> >> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> >> index 95da0b7ab9cd..5b30135fda38 100644
> >> --- a/xen/include/asm-arm/setup.h
> >> +++ b/xen/include/asm-arm/setup.h
> >> @@ -2,6 +2,8 @@
> >>   #define __ARM_SETUP_H_
> >>   
> >>   #include <public/version.h>
> >> +#include <asm/p2m.h>
> >> +#include <xen/device_tree.h>
> >>   
> >>   #define MIN_FDT_ALIGN 8
> >>   #define MAX_FDT_SIZE SZ_2M
> >> @@ -77,6 +79,14 @@ struct bootinfo {
> >>   #endif
> >>   };
> >>   
> >> +struct map_range_data
> >> +{
> >> +    struct domain *d;
> >> +    p2m_type_t p2mt;
> >> +    /* Set if mappings for PCI host bridges must not be skipped. */
> > This comment still needs updating.
> Sorry, I missed this one
> >
> >
> >> +    bool skip_mapping;
> >> +};
> >> +
> >>   extern struct bootinfo bootinfo;
> >>   
> >>   extern domid_t max_init_domid;
> >> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
> >>   u32 device_tree_get_u32(const void *fdt, int node,
> >>                           const char *prop_name, u32 dflt);
> >>   
> >> +int map_range_to_domain(const struct dt_device_node *dev,
> >> +                        u64 addr, u64 len, void *data);
> >> +
> >>   #endif
> >>   /*
> >>    * Local variables:
> >> -- 
> >> 2.25.1
> >>
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-10-05 21:43       ` Stefano Stabellini
@ 2021-10-06  4:55         ` Oleksandr Andrushchenko
  2021-10-06 21:33           ` Stefano Stabellini
  0 siblings, 1 reply; 27+ messages in thread
From: Oleksandr Andrushchenko @ 2021-10-06  4:55 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
	Artem Mygaiev, roger.pau, jbeulich, andrew.cooper3,
	george.dunlap, paul, Bertrand Marquis, Rahul Singh



On 06.10.21 00:43, Stefano Stabellini wrote:
> On Tue, 5 Oct 2021, Oleksandr Andrushchenko wrote:
>> On 05.10.21 04:24, Stefano Stabellini wrote:
>>> On Mon, 4 Oct 2021, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> PCI host bridges are special devices in terms of implementing PCI
>>>> passthrough. According to [1] the current implementation depends on
>>>> Domain-0 to perform the initialization of the relevant PCI host
>>>> bridge hardware and perform PCI device enumeration. In order to
>>>> achieve that one of the required changes is to not map all the memory
>>>> ranges in map_range_to_domain as we traverse the device tree on startup
>>>> and perform some additional checks if the range needs to be mapped to
>>>> Domain-0.
>>>>
>>>> The generic PCI host controller device tree binding says [2]:
>>>> - ranges: As described in IEEE Std 1275-1994, but must provide
>>>>             at least a definition of non-prefetchable memory. One
>>>>             or both of prefetchable Memory and IO Space may also
>>>>             be provided.
>>>>
>>>> - reg   : The Configuration Space base address and size, as accessed
>>>>             from the parent bus.  The base address corresponds to
>>>>             the first bus in the "bus-range" property.  If no
>>>>             "bus-range" is specified, this will be bus 0 (the default).
>>>>
>>>> >From the above none of the memory ranges from the "ranges" property
>>>> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
>>>> be handled dynamically by vPCI as we assign PCI devices, e.g. each
>>>> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
>>>> as needed by Xen.
>>>>
>>>> The "reg" property covers not only ECAM space, but may also have other
>>>> then the configuration memory ranges described, for example [3]:
>>>> - reg: Should contain rc_dbi, config registers location and length.
>>>> - reg-names: Must include the following entries:
>>>>      "rc_dbi": controller configuration registers;
>>>>      "config": PCIe configuration space registers.
>>>>
>>>> This patch makes it possible to not map all the ranges from the
>>>> "ranges" property and also ECAM from the "reg". All the rest from the
>>>> "reg" property still needs to be mapped to Domain-0, so the PCI
>>>> host bridge remains functional in Domain-0.
>>>>
>>>> [1] https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html__;!!GF_29dbcQIUBPA!nW_LVc5NeS7QQII7CDO4h3ifHhxdPAvQpLSVaX1CRyXh3c1a4a9cBpQ9S7SiAsDdMbL5RkTJpw$ [lists[.]xenproject[.]org]
>>>> [2] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt__;!!GF_29dbcQIUBPA!nW_LVc5NeS7QQII7CDO4h3ifHhxdPAvQpLSVaX1CRyXh3c1a4a9cBpQ9S7SiAsDdMbL-vGunhg$ [kernel[.]org]
>>>> [3] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt__;!!GF_29dbcQIUBPA!nW_LVc5NeS7QQII7CDO4h3ifHhxdPAvQpLSVaX1CRyXh3c1a4a9cBpQ9S7SiAsDdMbKjL04tRA$ [kernel[.]org]
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>> ---
>>>> Since v3:
>>>>    - pass struct map_range_data to map_dt_irq_to_domain
>>>>    - remove redundant check from map_range_to_domain
>>>>    - fix handle_device's .skip_mapping
>>>> Since v2:
>>>>    - removed check in map_range_to_domain for PCI_DEV
>>>>      and moved it to handle_device, so the code is
>>>>      simpler
>>>>    - s/map_pci_bridge/skip_mapping
>>>>    - extended comment in pci_host_bridge_mappings
>>>>    - minor code restructure in construct_dom0
>>>>    - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
>>>>      callbacks
>>>>    - unsigned int i; in pci_host_bridge_mappings
>>>> Since v1:
>>>>    - Added better description of why and what needs to be mapped into
>>>>      Domain-0's p2m and what doesn't
>>>>    - Do not do any mappings for PCI devices while traversing the DT
>>>>    - Walk all the bridges and make required mappings in one go
>>>> ---
>>>>    xen/arch/arm/domain_build.c        | 50 ++++++++++++++++--------------
>>>>    xen/arch/arm/pci/ecam.c            | 14 +++++++++
>>>>    xen/arch/arm/pci/pci-host-common.c | 49 +++++++++++++++++++++++++++++
>>>>    xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
>>>>    xen/include/asm-arm/pci.h          | 10 ++++++
>>>>    xen/include/asm-arm/setup.h        | 13 ++++++++
>>>>    6 files changed, 114 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>> index 8739e3285ef4..2a3c641476bd 100644
>>>> --- a/xen/arch/arm/domain_build.c
>>>> +++ b/xen/arch/arm/domain_build.c
>>>> @@ -10,7 +10,6 @@
>>>>    #include <asm/regs.h>
>>>>    #include <xen/errno.h>
>>>>    #include <xen/err.h>
>>>> -#include <xen/device_tree.h>
>>>>    #include <xen/libfdt/libfdt.h>
>>>>    #include <xen/guest_access.h>
>>>>    #include <xen/iocap.h>
>>>> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
>>>>    }
>>>>    custom_param("dom0_mem", parse_dom0_mem);
>>>>    
>>>> -struct map_range_data
>>>> -{
>>>> -    struct domain *d;
>>>> -    p2m_type_t p2mt;
>>>> -};
>>>> -
>>>>    /* Override macros from asm/page.h to make them work with mfn_t */
>>>>    #undef virt_to_mfn
>>>>    #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
>>>> @@ -1641,10 +1634,11 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>>>                                           const struct dt_irq *dt_irq,
>>>>                                           void *data)
>>>>    {
>>>> -    struct domain *d = data;
>>>> +    struct map_range_data *mr_data = data;
>>>> +    struct domain *d = mr_data->d;
>>>>        unsigned int irq = dt_irq->irq;
>>>>        int res;
>>>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>>>> +    bool need_mapping = !mr_data->skip_mapping;
>>>>    
>>>>        if ( irq < NR_LOCAL_IRQS )
>>>>        {
>>>> @@ -1668,13 +1662,12 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
>>>>        return 0;
>>>>    }
>>>>    
>>>> -static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>> -                                      u64 addr, u64 len,
>>>> -                                      void *data)
>>>> +int __init map_range_to_domain(const struct dt_device_node *dev,
>>>> +                               u64 addr, u64 len, void *data)
>>>>    {
>>>>        struct map_range_data *mr_data = data;
>>>>        struct domain *d = mr_data->d;
>>>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>>>> +    bool need_mapping = !mr_data->skip_mapping;
>>>>        int res;
>>>>    
>>>>        /*
>>>> @@ -1726,23 +1719,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>>     * then we may need to perform additional mappings in order to make
>>>>     * the child resources available to domain 0.
>>>>     */
>>>> -static int __init map_device_children(struct domain *d,
>>>> -                                      const struct dt_device_node *dev,
>>>> -                                      p2m_type_t p2mt)
>>>> +static int __init map_device_children(const struct dt_device_node *dev,
>>>> +                                      struct map_range_data *mr_data)
>>>>    {
>>>> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>>>> -    int ret;
>>>> -
>>>>        if ( dt_device_type_is_equal(dev, "pci") )
>>>>        {
>>>> +        int ret;
>>>> +
>>>>            dt_dprintk("Mapping children of %s to guest\n",
>>>>                       dt_node_full_name(dev));
>>>>    
>>>> -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
>>>> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
>>>>            if ( ret < 0 )
>>>>                return ret;
>>>>    
>>>> -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
>>>> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
>>>>            if ( ret < 0 )
>>>>                return ret;
>>>>        }
>>>> @@ -1823,6 +1814,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>>>        int res;
>>>>        u64 addr, size;
>>>>        bool need_mapping = !dt_device_for_passthrough(dev);
>>>> +    struct map_range_data mr_data = {
>>>> +        .d = d,
>>>> +        .p2mt = p2mt,
>>>> +        .skip_mapping = !need_mapping ||
>>>> +                        (is_pci_passthrough_enabled() &&
>>>> +                         (device_get_class(dev) == DEVICE_PCI))
>>>> +    };
>>> I would prefer if we did this:
>>>
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index 2a3c641476..1e7cdd11c7 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -1813,13 +1813,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>>        unsigned int i;
>>>        int res;
>>>        u64 addr, size;
>>> -    bool need_mapping = !dt_device_for_passthrough(dev);
>>> +    bool need_mapping = !dt_device_for_passthrough(dev) &&
>>> +                        !(is_pci_passthrough_enabled() &&
>>> +                          (device_get_class(dev) == DEVICE_PCI))
>>>        struct map_range_data mr_data = {
>>>            .d = d,
>>>            .p2mt = p2mt,
>>> -        .skip_mapping = !need_mapping ||
>>> -                        (is_pci_passthrough_enabled() &&
>>> -                         (device_get_class(dev) == DEVICE_PCI))
>>> +        .skip_mapping = !need_mapping
>>>        };
>>>    
>>>        naddr = dt_number_of_address(dev);
>>>
>>>
>>> This actually makes a difference because otherwise
>>> handle_device_interrupts could still be called with need_mapping ==
>>> true for PCI devices.
>> And this is totally fine. Remember the name of the patch: we do not
>> want to map memory, but interrupts are still needed for the bridge to
>> function in Dom0. The same as it needs all, but "cfg" from "regs".
>> So, need_mapping == true for interrupts is what we want: legacy INTx
>> are not supported by design and only MSI/MSI-X are supported.
> Ah, makes sense, I momentarily lost sight of that. Could you please add
> an in-code comment to explain it, something like:
>
> /*
>   * For PCI passthrough we need to remap only interrupts to Dom0. MMIO
>   * regions of the PCIe Root Complex and PCIe aperture should not be
>   * mapped automatically to Dom0.
>   */
> struct map_range_data mr_data = {
>          .d = d,
>          .p2mt = p2mt,
>          .skip_mapping = !need_mapping ||
>                          (is_pci_passthrough_enabled() &&
>                           (device_get_class(dev) == DEVICE_PCI))
>      };
Sure will add this so we don't step on it in the future.
Other than this and the missed comment below do you think
that patch is good to go now?
Thank you,
Oleksandr
>
>>> What do you think?  One more comment below.
>>>
>>>
>>>>        naddr = dt_number_of_address(dev);
>>>>    
>>>> @@ -1862,7 +1860,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>>>        /* Give permission and map MMIOs */
>>>>        for ( i = 0; i < naddr; i++ )
>>>>        {
>>>> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
>>>>            res = dt_device_get_address(dev, i, &addr, &size);
>>>>            if ( res )
>>>>            {
>>>> @@ -1876,7 +1873,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>>>>                return res;
>>>>        }
>>>>    
>>>> -    res = map_device_children(d, dev, p2mt);
>>>> +    res = map_device_children(dev, &mr_data);
>>>>        if ( res )
>>>>            return res;
>>>>    
>>>> @@ -3034,7 +3031,14 @@ static int __init construct_dom0(struct domain *d)
>>>>            return rc;
>>>>    
>>>>        if ( acpi_disabled )
>>>> +    {
>>>>            rc = prepare_dtb_hwdom(d, &kinfo);
>>>> +        if ( rc < 0 )
>>>> +            return rc;
>>>> +#ifdef CONFIG_HAS_PCI
>>>> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
>>>> +#endif
>>>> +    }
>>>>        else
>>>>            rc = prepare_acpi(d, &kinfo);
>>>>    
>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>> index 602d00799c8d..b81620074a91 100644
>>>> --- a/xen/arch/arm/pci/ecam.c
>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>        return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
>>>>    }
>>>>    
>>>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
>>>> +                                     struct pci_host_bridge *bridge,
>>>> +                                     uint64_t addr)
>>>> +{
>>>> +    struct pci_config_window *cfg = bridge->cfg;
>>>> +
>>>> +    /*
>>>> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
>>>> +     * so we can trap access to it.
>>>> +     */
>>>> +    return cfg->phys_addr != addr;
>>>> +}
>>>> +
>>>>    /* ECAM ops */
>>>>    const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>>        .bus_shift  = 20,
>>>> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>>            .map_bus                = pci_ecam_map_bus,
>>>>            .read                   = pci_generic_config_read,
>>>>            .write                  = pci_generic_config_write,
>>>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>>>>        }
>>>>    };
>>>>    
>>>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>>> index 1eb4daa87365..085f08e23e0c 100644
>>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>>> @@ -18,6 +18,7 @@
>>>>    
>>>>    #include <xen/init.h>
>>>>    #include <xen/pci.h>
>>>> +#include <asm/setup.h>
>>>>    #include <xen/rwlock.h>
>>>>    #include <xen/sched.h>
>>>>    #include <xen/vmap.h>
>>>> @@ -320,6 +321,54 @@ int pci_host_get_num_bridges(void)
>>>>        return count;
>>>>    }
>>>>    
>>>> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
>>>> +{
>>>> +    struct pci_host_bridge *bridge;
>>>> +    struct map_range_data mr_data = {
>>>> +        .d = d,
>>>> +        .p2mt = p2mt,
>>>> +        .skip_mapping = false
>>>> +    };
>>>> +
>>>> +    /*
>>>> +     * For each PCI host bridge we need to only map those ranges
>>>> +     * which are used by Domain-0 to properly initialize the bridge,
>>>> +     * e.g. we do not want to map ECAM configuration space which lives in
>>>> +     * "reg" or "assigned-addresses" device tree property, but we want to
>>>> +     * map other regions of the host bridge. The PCI aperture defined by
>>>> +     * the "ranges" device tree property should also be skipped.
>>>> +     */
>>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
>>>> +    {
>>>> +        const struct dt_device_node *dev = bridge->dt_node;
>>>> +        unsigned int i;
>>>> +
>>>> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
>>>> +        {
>>>> +            uint64_t addr, size;
>>>> +            int err;
>>>> +
>>>> +            err = dt_device_get_address(dev, i, &addr, &size);
>>>> +            if ( err )
>>>> +            {
>>>> +                printk(XENLOG_ERR
>>>> +                       "Unable to retrieve address range index=%u for %s\n",
>>>> +                       i, dt_node_full_name(dev));
>>>> +                return err;
>>>> +            }
>>>> +
>>>> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
>>>> +            {
>>>> +                err = map_range_to_domain(dev, addr, size, &mr_data);
>>>> +                if ( err )
>>>> +                    return err;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>>    /*
>>>>     * Local variables:
>>>>     * mode: C
>>>> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
>>>> index 61a9807d3d58..6ad2b31e810d 100644
>>>> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
>>>> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
>>>> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
>>>>            .map_bus                = pci_ecam_map_bus,
>>>>            .read                   = pci_generic_config_read,
>>>>            .write                  = pci_generic_config_write,
>>>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
>>>>        }
>>>>    };
>>>>    
>>>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>>>> index a62d8bc60086..d9a3c2a4f3b3 100644
>>>> --- a/xen/include/asm-arm/pci.h
>>>> +++ b/xen/include/asm-arm/pci.h
>>>> @@ -17,6 +17,8 @@
>>>>    
>>>>    #ifdef CONFIG_HAS_PCI
>>>>    
>>>> +#include <asm/p2m.h>
>>>> +
>>>>    #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
>>>>    
>>>>    extern bool_t pci_passthrough_enabled;
>>>> @@ -73,6 +75,9 @@ struct pci_ops {
>>>>                    uint32_t reg, uint32_t len, uint32_t *value);
>>>>        int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>>>>                     uint32_t reg, uint32_t len, uint32_t value);
>>>> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
>>>> +                                   struct pci_host_bridge *bridge,
>>>> +                                   uint64_t addr);
>>>>    };
>>>>    
>>>>    /*
>>>> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
>>>>                                 uint32_t reg, uint32_t len, uint32_t value);
>>>>    void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>                                   pci_sbdf_t sbdf, uint32_t where);
>>>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
>>>> +                                     struct pci_host_bridge *bridge,
>>>> +                                     uint64_t addr);
>>>>    struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
>>>>    int pci_get_host_bridge_segment(const struct dt_device_node *node,
>>>>                                    uint16_t *segment);
>>>> @@ -113,6 +121,8 @@ int pci_host_iterate_bridges(struct domain *d,
>>>>                                            struct pci_host_bridge *bridge));
>>>>    int pci_host_get_num_bridges(void);
>>>>    
>>>> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
>>>> +
>>>>    #else   /*!CONFIG_HAS_PCI*/
>>>>    
>>>>    struct arch_pci_dev { };
>>>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>>>> index 95da0b7ab9cd..5b30135fda38 100644
>>>> --- a/xen/include/asm-arm/setup.h
>>>> +++ b/xen/include/asm-arm/setup.h
>>>> @@ -2,6 +2,8 @@
>>>>    #define __ARM_SETUP_H_
>>>>    
>>>>    #include <public/version.h>
>>>> +#include <asm/p2m.h>
>>>> +#include <xen/device_tree.h>
>>>>    
>>>>    #define MIN_FDT_ALIGN 8
>>>>    #define MAX_FDT_SIZE SZ_2M
>>>> @@ -77,6 +79,14 @@ struct bootinfo {
>>>>    #endif
>>>>    };
>>>>    
>>>> +struct map_range_data
>>>> +{
>>>> +    struct domain *d;
>>>> +    p2m_type_t p2mt;
>>>> +    /* Set if mappings for PCI host bridges must not be skipped. */
>>> This comment still needs updating.
>> Sorry, I missed this one
>>>
>>>> +    bool skip_mapping;
>>>> +};
>>>> +
>>>>    extern struct bootinfo bootinfo;
>>>>    
>>>>    extern domid_t max_init_domid;
>>>> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
>>>>    u32 device_tree_get_u32(const void *fdt, int node,
>>>>                            const char *prop_name, u32 dflt);
>>>>    
>>>> +int map_range_to_domain(const struct dt_device_node *dev,
>>>> +                        u64 addr, u64 len, void *data);
>>>> +
>>>>    #endif
>>>>    /*
>>>>     * Local variables:
>>>> -- 
>>>> 2.25.1
>>>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 02/11] xen/arm: Add new device type for PCI
  2021-10-04 14:11 ` [PATCH v4 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
@ 2021-10-06 10:15   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:15 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem Mygaiev,
	Roger Pau Monné,
	Jan Beulich, Andrew Cooper, George Dunlap, Paul Durrant,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Add new device type (DEV_PCI) to distinguish PCI devices from platform
> DT devices, so some drivers, like IOMMU, can handle PCI devices
> differently.
> 
> Also add a helper which is when given a struct device returns the
> corresponding struct pci_dev which this device is a part of.
> 
> Because of the header cross-dependencies, e.g. we need both
> struct pci_dev and struct arch_pci_dev at the same time, this cannot be
> done with an inline.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> 
> Since v2:
> - !! dropped Stefano's r-b because of the changes
> - simplified dev_to_pci to use a single
>   container_of(dev, struct pci_dev, arch.dev) (Jan)
> Since v1:
> - Folded new device type (DEV_PCI) into this patch.
> ---
> xen/arch/arm/pci/pci.c       | 7 +++++++
> xen/include/asm-arm/device.h | 4 ++--
> xen/include/asm-arm/pci.h    | 7 +++++++
> 3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
> index 73540045d187..138da19284ab 100644
> --- a/xen/arch/arm/pci/pci.c
> +++ b/xen/arch/arm/pci/pci.c
> @@ -27,6 +27,13 @@ int arch_pci_clean_pirqs(struct domain *d)
>     return 0;
> }
> 
> +struct pci_dev *dev_to_pci(struct device *dev)
> +{
> +    ASSERT(dev->type == DEV_PCI);
> +
> +    return container_of(dev, struct pci_dev, arch.dev);
> +}
> +
> static int __init dt_pci_init(void)
> {
>     struct dt_device_node *np;
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index ebe84ea853cd..7bf040560363 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -4,6 +4,7 @@
> enum device_type
> {
>     DEV_DT,
> +    DEV_PCI,
> };
> 
> struct dev_archdata {
> @@ -25,8 +26,7 @@ typedef struct device device_t;
> 
> #include <xen/device_tree.h>
> 
> -/* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
> -#define dev_is_pci(dev) ((void)(dev), 0)
> +#define dev_is_pci(dev) ((dev)->type == DEV_PCI)
> #define dev_is_dt(dev)  ((dev)->type == DEV_DT)
> 
> enum device_class
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 7cb2e2f1edaf..603a1fc072d1 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -26,6 +26,13 @@ struct arch_pci_dev {
>     struct device dev;
> };
> 
> +/*
> + * Because of the header cross-dependencies, e.g. we need both
> + * struct pci_dev and struct arch_pci_dev at the same time, this cannot be
> + * done with an inline here. Macro can be implemented, but looks scary.
> + */
> +struct pci_dev *dev_to_pci(struct device *dev);
> +
> /* Arch-specific MSI data for vPCI. */
> struct vpci_arch_msi {
> };
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 03/11] xen/arm: Introduce pci_find_host_bridge_node helper
  2021-10-04 14:11 ` [PATCH v4 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
@ 2021-10-06 10:24   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:24 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem Mygaiev,
	Roger Pau Monné,
	Jan Beulich, Andrew Cooper, George Dunlap, Paul Durrant,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Get host bridge node given a PCI device attached to it.
> 
> This helper will be re-used for adding PCI devices by the subsequent
> patches.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> ---
> Since v2:
> - !! dropped Stefano's r-b because of the changes
> - s/PRI_pci/%pp after rebase onto Arm series
> ---
> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
> xen/include/asm-arm/pci.h          |  1 +
> 2 files changed, 17 insertions(+)
> 
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 593beeb48ce4..592c01aae5bb 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -276,6 +276,22 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
>     return -EINVAL;
> }
> 
> +/*
> + * Get host bridge node given a device attached to it.
> + */
> +struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
> +{
> +    struct pci_host_bridge *bridge;
> +    struct pci_dev *pdev = dev_to_pci(dev);
> +
> +    bridge = pci_find_host_bridge(pdev->seg, pdev->bus);
> +    if ( unlikely(!bridge) )
> +    {
> +        printk(XENLOG_ERR "Unable to find PCI bridge for %pp\n", &pdev->sbdf);
> +        return NULL;
> +    }
> +    return bridge->dt_node;
> +}
> /*
>  * Local variables:
>  * mode: C
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 603a1fc072d1..e6d4000e2ac8 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -99,6 +99,7 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
> int pci_get_host_bridge_segment(const struct dt_device_node *node,
>                                 uint16_t *segment);
> +struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
> 
> static always_inline bool is_pci_passthrough_enabled(void)
> {
> -- 
> 2.25.1
> 
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 11/11] xen/arm: Process pending vPCI map/unmap operations
  2021-10-04 14:11 ` [PATCH v4 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
@ 2021-10-06 10:24   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:24 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> vPCI may map and unmap PCI device memory (BARs) being passed through which
> may take a lot of time. For this those operations may be deferred to be
> performed later, so that they can be safely preempted.
> 
> Currently this deferred processing is happening in common IOREQ code
> which doesn't seem to be the right place for x86 and is even more
> doubtful because IOREQ may not be enabled for Arm at all.
> So, for Arm the pending vPCI work may have no chance to be executed
> if the processing is left as is in the common IOREQ code only.
> For that reason make vPCI processing happen in arch specific code.
> 
> Please be aware that there are a few outstanding TODOs affecting this
> code path, see xen/drivers/vpci/header.c:map_range and
> xen/drivers/vpci/header.c:vpci_process_pending.
> 
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> [x86 changes]
> Acked-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> ---
> Since v2:
> - update commit message with more insight on x86, IOREQ and Arm
> - restored order of invocation for IOREQ and vPCI processing (Jan)
> Since v1:
> - Moved the check for pending vpci work from the common IOREQ code
>   to hvm_do_resume on x86
> - Re-worked the code for Arm to ensure we don't miss pending vPCI work
> ---
> xen/arch/arm/traps.c   | 13 +++++++++++++
> xen/arch/x86/hvm/hvm.c |  6 ++++++
> xen/common/ioreq.c     |  9 ---------
> 3 files changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 219ab3c3fbde..b246f51086e3 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -34,6 +34,7 @@
> #include <xen/symbols.h>
> #include <xen/version.h>
> #include <xen/virtual_region.h>
> +#include <xen/vpci.h>
> 
> #include <public/sched.h>
> #include <public/xen.h>
> @@ -2304,6 +2305,18 @@ static bool check_for_vcpu_work(void)
>     }
> #endif
> 
> +    if ( has_vpci(v->domain) )
> +    {
> +        bool pending;
> +
> +        local_irq_enable();
> +        pending = vpci_process_pending(v);
> +        local_irq_disable();
> +
> +        if ( pending )
> +            return true;
> +    }
> +
>     if ( likely(!v->arch.need_flush_to_ram) )
>         return false;
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index aa418a3ca1b7..c491242e4b8b 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -546,6 +546,12 @@ void hvm_do_resume(struct vcpu *v)
> 
>     pt_restore_timer(v);
> 
> +    if ( has_vpci(v->domain) && vpci_process_pending(v) )
> +    {
> +        raise_softirq(SCHEDULE_SOFTIRQ);
> +        return;
> +    }
> +
>     if ( !vcpu_ioreq_handle_completion(v) )
>         return;
> 
> diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c
> index d732dc045df9..689d256544c8 100644
> --- a/xen/common/ioreq.c
> +++ b/xen/common/ioreq.c
> @@ -25,9 +25,7 @@
> #include <xen/lib.h>
> #include <xen/paging.h>
> #include <xen/sched.h>
> -#include <xen/softirq.h>
> #include <xen/trace.h>
> -#include <xen/vpci.h>
> 
> #include <asm/guest_atomics.h>
> #include <asm/ioreq.h>
> @@ -212,19 +210,12 @@ static bool wait_for_io(struct ioreq_vcpu *sv, ioreq_t *p)
> 
> bool vcpu_ioreq_handle_completion(struct vcpu *v)
> {
> -    struct domain *d = v->domain;
>     struct vcpu_io *vio = &v->io;
>     struct ioreq_server *s;
>     struct ioreq_vcpu *sv;
>     enum vio_completion completion;
>     bool res = true;
> 
> -    if ( has_vpci(d) && vpci_process_pending(v) )
> -    {
> -        raise_softirq(SCHEDULE_SOFTIRQ);
> -        return false;
> -    }
> -
>     while ( (sv = get_pending_vcpu(v, &s)) != NULL )
>         if ( !wait_for_io(sv, get_ioreq(s, v)) )
>             return false;
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 07/11] libxl: Allow removing PCI devices for all types of domains
  2021-10-04 14:11 ` [PATCH v4 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
@ 2021-10-06 10:25   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:25 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem Mygaiev,
	Roger Pau Monné,
	Jan Beulich, Andrew Cooper, George Dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko, Ian Jackson,
	Juergen Gross

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> The PCI device remove path may now be used by PVH on ARM, so the
> assert is no longer valid.
> 
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Juergen Gross <jgross@suse.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> ---
> tools/libs/light/libxl_pci.c | 2 --
> 1 file changed, 2 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 1a1c2630803b..59f3686fc85e 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1947,8 +1947,6 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>             goto out_fail;
>         }
>     } else {
> -        assert(type == LIBXL_DOMAIN_TYPE_PV);
> -
>         char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
>                                      pci->bus, pci->dev, pci->func);
>         FILE *f = fopen(sysfs_path, "r");
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 04/11] xen/device-tree: Make dt_find_node_by_phandle global
  2021-10-04 14:11 ` [PATCH v4 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
@ 2021-10-06 10:32   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:32 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Make dt_find_node_by_phandle globally visible, so it can be re-used by
> other frameworks.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> ---
> xen/common/device_tree.c      | 2 +-
> xen/include/xen/device_tree.h | 2 ++
> 2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index ea93da1725f6..4aae281e89bf 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -1047,7 +1047,7 @@ int dt_for_each_range(const struct dt_device_node *dev,
>  *
>  * Returns a node pointer.
>  */
> -static struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
> +struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
> {
>     struct dt_device_node *np;
> 
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 2297c59ce66d..fd6cd00b433a 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -849,6 +849,8 @@ int dt_count_phandle_with_args(const struct dt_device_node *np,
>  */
> int dt_get_pci_domain_nr(struct dt_device_node *node);
> 
> +struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle);
> +
> #ifdef CONFIG_DEVICE_TREE_DEBUG
> #define dt_dprintk(fmt, args...)  \
>     printk(XENLOG_DEBUG fmt, ## args)
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 05/11] xen/arm: Mark device as PCI while creating one
  2021-10-04 14:11 ` [PATCH v4 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
@ 2021-10-06 10:33   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:33 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> While adding a PCI device mark it as such, so other frameworks
> can distinguish it from DT devices.
> For that introduce an architecture defined helper which may perform
> additional initialization of the newly created PCI device.
> 
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> [applicable parts]
> Acked-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul

> ---
> Since v3:
> - fixed Arm build without CONFIG_HAS_PCI
> Since v2:
> - !! dropped Stefano's r-b because of the changes
> - introduced arch_pci_init_pdev (Jan)
> Since v1:
> - moved the assignment from iommu_add_device to alloc_pdev
> ---
> xen/arch/arm/pci/pci.c        | 5 +++++
> xen/drivers/passthrough/pci.c | 2 ++
> xen/include/asm-arm/pci.h     | 7 +++++++
> xen/include/asm-x86/pci.h     | 2 ++
> 4 files changed, 16 insertions(+)
> 
> diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
> index 138da19284ab..082c14e127a8 100644
> --- a/xen/arch/arm/pci/pci.c
> +++ b/xen/arch/arm/pci/pci.c
> @@ -34,6 +34,11 @@ struct pci_dev *dev_to_pci(struct device *dev)
>     return container_of(dev, struct pci_dev, arch.dev);
> }
> 
> +void arch_pci_init_pdev(struct pci_dev *pdev)
> +{
> +    pci_to_dev(pdev)->type = DEV_PCI;
> +}
> +
> static int __init dt_pci_init(void)
> {
>     struct dt_device_node *np;
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index b36d5a481162..9e6246afeef5 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -329,6 +329,8 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>     *((u8*) &pdev->devfn) = devfn;
>     pdev->domain = NULL;
> 
> +    arch_pci_init_pdev(pdev);
> +
>     rc = pdev_msi_init(pdev);
>     if ( rc )
>     {
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index e6d4000e2ac8..ea87ec6006fc 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -105,6 +105,9 @@ static always_inline bool is_pci_passthrough_enabled(void)
> {
>     return pci_passthrough_enabled;
> }
> +
> +void arch_pci_init_pdev(struct pci_dev *pdev);
> +
> #else   /*!CONFIG_HAS_PCI*/
> 
> struct arch_pci_dev { };
> @@ -121,5 +124,9 @@ static inline int pci_get_host_bridge_segment(const struct dt_device_node *node,
>     return -EINVAL;
> }
> 
> +struct pci_dev;
> +
> +static inline void arch_pci_init_pdev(struct pci_dev *pdev) {}
> +
> #endif  /*!CONFIG_HAS_PCI*/
> #endif /* __ARM_PCI_H__ */
> diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
> index c4a4fdcbc239..443f25347d08 100644
> --- a/xen/include/asm-x86/pci.h
> +++ b/xen/include/asm-x86/pci.h
> @@ -36,4 +36,6 @@ static always_inline bool is_pci_passthrough_enabled(void)
>     return true;
> }
> 
> +static inline void arch_pci_init_pdev(struct pci_dev *pdev) {}
> +
> #endif /* __X86_PCI_H__ */
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 06/11] xen/domain: Call pci_release_devices() when releasing domain resources
  2021-10-04 14:11 ` [PATCH v4 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
@ 2021-10-06 10:35   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:35 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This is the very same that we already do for DT devices. Moreover, x86
> already calls pci_release_devices().
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> ---
> Since v1:
> - re-wording in the commit message
> ---
> xen/arch/arm/domain.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index fbb52f78f1a6..79012bf77757 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -985,7 +985,8 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list)
>  * function which may return -ERESTART.
>  */
> enum {
> -    PROG_tee = 1,
> +    PROG_pci = 1,
> +    PROG_tee,
>     PROG_xen,
>     PROG_page,
>     PROG_mapping,
> @@ -1022,6 +1023,12 @@ int domain_relinquish_resources(struct domain *d)
> #ifdef CONFIG_IOREQ_SERVER
>         ioreq_server_destroy_all(d);
> #endif
> +#ifdef CONFIG_HAS_PCI
> +    PROGRESS(pci):
> +        ret = pci_release_devices(d);
> +        if ( ret )
> +            return ret;
> +#endif
> 
>     PROGRESS(tee):
>         ret = tee_relinquish_resources(d);
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
  2021-10-04 14:11 ` [PATCH v4 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-10-06 10:36   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:36 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> In order for vPCI to work it needs to maintain guest and hardware
> domain's views of the configuration space. For example, BARs and
> COMMAND registers require emulation for guests and the guest view
> of the registers needs to be in sync with the real contents of the
> relevant registers. For that ECAM address space needs to also be
> trapped for the hardware domain, so we need to implement PCI host
> bridge specific callbacks to properly setup MMIO handlers for those
> ranges depending on particular host bridge implementation.
> 
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul

> ---
> Since v3:
> - fixed comment formatting
> Since v2:
> - removed unneeded assignment (count = 0)
> - removed unneeded header inclusion
> - update commit message
> Since v1:
> - Dynamically calculate the number of MMIO handlers required for vPCI
>   and update the total number accordingly
> - s/clb/cb
> - Do not introduce a new callback for MMIO handler setup
> ---
> xen/arch/arm/domain.c              |  2 ++
> xen/arch/arm/pci/pci-host-common.c | 28 ++++++++++++++++++++++++
> xen/arch/arm/vpci.c                | 34 ++++++++++++++++++++++++++++++
> xen/arch/arm/vpci.h                |  6 ++++++
> xen/include/asm-arm/pci.h          |  5 +++++
> 5 files changed, 75 insertions(+)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 79012bf77757..fa6fcc5e467c 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -733,6 +733,8 @@ int arch_domain_create(struct domain *d,
>     if ( (rc = domain_vgic_register(d, &count)) != 0 )
>         goto fail;
> 
> +    count += domain_vpci_get_num_mmio_handlers(d);
> +
>     if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>         goto fail;
> 
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index 592c01aae5bb..1eb4daa87365 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -292,6 +292,34 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
>     }
>     return bridge->dt_node;
> }
> +
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*cb)(struct domain *d,
> +                                       struct pci_host_bridge *bridge))
> +{
> +    struct pci_host_bridge *bridge;
> +    int err;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +    {
> +        err = cb(d, bridge);
> +        if ( err )
> +            return err;
> +    }
> +    return 0;
> +}
> +
> +int pci_host_get_num_bridges(void)
> +{
> +    struct pci_host_bridge *bridge;
> +    int count = 0;
> +
> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> +        count++;
> +
> +    return count;
> +}
> +
> /*
>  * Local variables:
>  * mode: C
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> index 76c12b92814f..6e179cd3010b 100644
> --- a/xen/arch/arm/vpci.c
> +++ b/xen/arch/arm/vpci.c
> @@ -80,17 +80,51 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
>     .write = vpci_mmio_write,
> };
> 
> +static int vpci_setup_mmio_handler(struct domain *d,
> +                                   struct pci_host_bridge *bridge)
> +{
> +    struct pci_config_window *cfg = bridge->cfg;
> +
> +    register_mmio_handler(d, &vpci_mmio_handler,
> +                          cfg->phys_addr, cfg->size, NULL);
> +    return 0;
> +}
> +
> int domain_vpci_init(struct domain *d)
> {
>     if ( !has_vpci(d) )
>         return 0;
> 
> +    if ( is_hardware_domain(d) )
> +        return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
> +
> +    /* Guest domains use what is programmed in their device tree. */
>     register_mmio_handler(d, &vpci_mmio_handler,
>                           GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
> 
>     return 0;
> }
> 
> +int domain_vpci_get_num_mmio_handlers(struct domain *d)
> +{
> +    int count;
> +
> +    if ( is_hardware_domain(d) )
> +        /* For each PCI host bridge's configuration space. */
> +        count = pci_host_get_num_bridges();
> +    else
> +        /*
> +         * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
> +         * being passed through. Maximum number of supported devices
> +         * is 32 as virtual bus topology emulates the devices as embedded
> +         * endpoints.
> +         * +1 for a single emulated host bridge's configuration space.
> +         */
> +        count = VPCI_MSIX_MEM_NUM * 32 + 1;
> +
> +    return count;
> +}
> +
> /*
>  * Local variables:
>  * mode: C
> diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
> index d8a7b0e3e802..27a2b069abd2 100644
> --- a/xen/arch/arm/vpci.h
> +++ b/xen/arch/arm/vpci.h
> @@ -17,11 +17,17 @@
> 
> #ifdef CONFIG_HAS_VPCI
> int domain_vpci_init(struct domain *d);
> +int domain_vpci_get_num_mmio_handlers(struct domain *d);
> #else
> static inline int domain_vpci_init(struct domain *d)
> {
>     return 0;
> }
> +
> +static inline int domain_vpci_get_num_mmio_handlers(struct domain *d)
> +{
> +    return 0;
> +}
> #endif
> 
> #endif /* __ARCH_ARM_VPCI_H__ */
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index ea87ec6006fc..a62d8bc60086 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -108,6 +108,11 @@ static always_inline bool is_pci_passthrough_enabled(void)
> 
> void arch_pci_init_pdev(struct pci_dev *pdev);
> 
> +int pci_host_iterate_bridges(struct domain *d,
> +                             int (*clb)(struct domain *d,
> +                                        struct pci_host_bridge *bridge));
> +int pci_host_get_num_bridges(void);
> +
> #else   /*!CONFIG_HAS_PCI*/
> 
> struct arch_pci_dev { };
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 08/11] libxl: Only map legacy PCI IRQs if they are supported
  2021-10-04 14:11 ` [PATCH v4 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
@ 2021-10-06 10:39   ` Rahul Singh
  0 siblings, 0 replies; 27+ messages in thread
From: Rahul Singh @ 2021-10-06 10:39 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: xen-devel, Julien Grall, Stefano Stabellini,
	oleksandr_tyshchenko, volodymyr_babchuk, Artem_Mygaiev,
	roger.pau, jbeulich, andrew.cooper3, george.dunlap, paul,
	Bertrand Marquis, Oleksandr Andrushchenko, Ian Jackson,
	Juergen Gross

Hi Oleksandr,

> On 4 Oct 2021, at 3:11 pm, Oleksandr Andrushchenko <andr2000@gmail.com> wrote:
> 
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> 
> Arm's PCI passthrough implementation doesn't support legacy interrupts,
> but MSI/MSI-X. This can be the case for other platforms too.
> For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
> it to the CFLAGS and compile the relevant code in the toolstack only if
> applicable.
> 
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Juergen Gross <jgross@suse.com>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul
> ---
> Since v1:
> - Minimized #idefery by introducing pci_supp_legacy_irq function
>   for relevant checks
> ---
> tools/libs/light/Makefile    |  4 ++++
> tools/libs/light/libxl_pci.c | 13 +++++++++++++
> 2 files changed, 17 insertions(+)
> 
> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
> index 7d8c51d49242..bd3f6be2a183 100644
> --- a/tools/libs/light/Makefile
> +++ b/tools/libs/light/Makefile
> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
> 	-Wno-declaration-after-statement -Wformat-nonliteral
> CFLAGS += -I.
> 
> +ifeq ($(CONFIG_X86),y)
> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
> +endif
> +
> SRCS-$(CONFIG_X86) += libxl_cpuid.c
> SRCS-$(CONFIG_X86) += libxl_x86.c
> SRCS-$(CONFIG_X86) += libxl_psr.c
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 59f3686fc85e..4c2d7aeefbb2 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1364,6 +1364,15 @@ static void pci_add_timeout(libxl__egc *egc, libxl__ev_time *ev,
>     pci_add_dm_done(egc, pas, rc);
> }
> 
> +static bool pci_supp_legacy_irq(void)
> +{
> +#ifdef CONFIG_PCI_SUPP_LEGACY_IRQ
> +    return true;
> +#else
> +    return false;
> +#endif
> +}
> +
> static void pci_add_dm_done(libxl__egc *egc,
>                             pci_add_state *pas,
>                             int rc)
> @@ -1434,6 +1443,8 @@ static void pci_add_dm_done(libxl__egc *egc,
>         }
>     }
>     fclose(f);
> +    if (!pci_supp_legacy_irq())
> +        goto out_no_irq;
>     sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>                                 pci->bus, pci->dev, pci->func);
>     f = fopen(sysfs_path, "r");
> @@ -1983,6 +1994,8 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>         }
>         fclose(f);
> skip1:
> +        if (!pci_supp_legacy_irq())
> +            goto skip_irq;
>         sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>                                pci->bus, pci->dev, pci->func);
>         f = fopen(sysfs_path, "r");
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m
  2021-10-06  4:55         ` Oleksandr Andrushchenko
@ 2021-10-06 21:33           ` Stefano Stabellini
  0 siblings, 0 replies; 27+ messages in thread
From: Stefano Stabellini @ 2021-10-06 21:33 UTC (permalink / raw)
  To: Oleksandr Andrushchenko
  Cc: Stefano Stabellini, xen-devel, julien, Oleksandr Tyshchenko,
	Volodymyr Babchuk, Artem Mygaiev, roger.pau, jbeulich,
	andrew.cooper3, george.dunlap, paul, Bertrand Marquis,
	Rahul Singh

On Wed, 6 Oct 2021, Oleksandr Andrushchenko wrote:
> On 06.10.21 00:43, Stefano Stabellini wrote:
> > On Tue, 5 Oct 2021, Oleksandr Andrushchenko wrote:
> >> On 05.10.21 04:24, Stefano Stabellini wrote:
> >>> On Mon, 4 Oct 2021, Oleksandr Andrushchenko wrote:
> >>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>>>
> >>>> PCI host bridges are special devices in terms of implementing PCI
> >>>> passthrough. According to [1] the current implementation depends on
> >>>> Domain-0 to perform the initialization of the relevant PCI host
> >>>> bridge hardware and perform PCI device enumeration. In order to
> >>>> achieve that one of the required changes is to not map all the memory
> >>>> ranges in map_range_to_domain as we traverse the device tree on startup
> >>>> and perform some additional checks if the range needs to be mapped to
> >>>> Domain-0.
> >>>>
> >>>> The generic PCI host controller device tree binding says [2]:
> >>>> - ranges: As described in IEEE Std 1275-1994, but must provide
> >>>>             at least a definition of non-prefetchable memory. One
> >>>>             or both of prefetchable Memory and IO Space may also
> >>>>             be provided.
> >>>>
> >>>> - reg   : The Configuration Space base address and size, as accessed
> >>>>             from the parent bus.  The base address corresponds to
> >>>>             the first bus in the "bus-range" property.  If no
> >>>>             "bus-range" is specified, this will be bus 0 (the default).
> >>>>
> >>>> >From the above none of the memory ranges from the "ranges" property
> >>>> needs to be mapped to Domain-0 at startup as MMIO mapping is going to
> >>>> be handled dynamically by vPCI as we assign PCI devices, e.g. each
> >>>> device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
> >>>> as needed by Xen.
> >>>>
> >>>> The "reg" property covers not only ECAM space, but may also have other
> >>>> then the configuration memory ranges described, for example [3]:
> >>>> - reg: Should contain rc_dbi, config registers location and length.
> >>>> - reg-names: Must include the following entries:
> >>>>      "rc_dbi": controller configuration registers;
> >>>>      "config": PCIe configuration space registers.
> >>>>
> >>>> This patch makes it possible to not map all the ranges from the
> >>>> "ranges" property and also ECAM from the "reg". All the rest from the
> >>>> "reg" property still needs to be mapped to Domain-0, so the PCI
> >>>> host bridge remains functional in Domain-0.
> >>>>
> >>>> [1] https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html__;!!GF_29dbcQIUBPA!nW_LVc5NeS7QQII7CDO4h3ifHhxdPAvQpLSVaX1CRyXh3c1a4a9cBpQ9S7SiAsDdMbL5RkTJpw$ [lists[.]xenproject[.]org]
> >>>> [2] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt__;!!GF_29dbcQIUBPA!nW_LVc5NeS7QQII7CDO4h3ifHhxdPAvQpLSVaX1CRyXh3c1a4a9cBpQ9S7SiAsDdMbL-vGunhg$ [kernel[.]org]
> >>>> [3] https://urldefense.com/v3/__https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt__;!!GF_29dbcQIUBPA!nW_LVc5NeS7QQII7CDO4h3ifHhxdPAvQpLSVaX1CRyXh3c1a4a9cBpQ9S7SiAsDdMbKjL04tRA$ [kernel[.]org]
> >>>>
> >>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> >>>> ---
> >>>> Since v3:
> >>>>    - pass struct map_range_data to map_dt_irq_to_domain
> >>>>    - remove redundant check from map_range_to_domain
> >>>>    - fix handle_device's .skip_mapping
> >>>> Since v2:
> >>>>    - removed check in map_range_to_domain for PCI_DEV
> >>>>      and moved it to handle_device, so the code is
> >>>>      simpler
> >>>>    - s/map_pci_bridge/skip_mapping
> >>>>    - extended comment in pci_host_bridge_mappings
> >>>>    - minor code restructure in construct_dom0
> >>>>    - s/.need_p2m_mapping/.need_p2m_hwdom_mapping and related
> >>>>      callbacks
> >>>>    - unsigned int i; in pci_host_bridge_mappings
> >>>> Since v1:
> >>>>    - Added better description of why and what needs to be mapped into
> >>>>      Domain-0's p2m and what doesn't
> >>>>    - Do not do any mappings for PCI devices while traversing the DT
> >>>>    - Walk all the bridges and make required mappings in one go
> >>>> ---
> >>>>    xen/arch/arm/domain_build.c        | 50 ++++++++++++++++--------------
> >>>>    xen/arch/arm/pci/ecam.c            | 14 +++++++++
> >>>>    xen/arch/arm/pci/pci-host-common.c | 49 +++++++++++++++++++++++++++++
> >>>>    xen/arch/arm/pci/pci-host-zynqmp.c |  1 +
> >>>>    xen/include/asm-arm/pci.h          | 10 ++++++
> >>>>    xen/include/asm-arm/setup.h        | 13 ++++++++
> >>>>    6 files changed, 114 insertions(+), 23 deletions(-)
> >>>>
> >>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>>> index 8739e3285ef4..2a3c641476bd 100644
> >>>> --- a/xen/arch/arm/domain_build.c
> >>>> +++ b/xen/arch/arm/domain_build.c
> >>>> @@ -10,7 +10,6 @@
> >>>>    #include <asm/regs.h>
> >>>>    #include <xen/errno.h>
> >>>>    #include <xen/err.h>
> >>>> -#include <xen/device_tree.h>
> >>>>    #include <xen/libfdt/libfdt.h>
> >>>>    #include <xen/guest_access.h>
> >>>>    #include <xen/iocap.h>
> >>>> @@ -51,12 +50,6 @@ static int __init parse_dom0_mem(const char *s)
> >>>>    }
> >>>>    custom_param("dom0_mem", parse_dom0_mem);
> >>>>    
> >>>> -struct map_range_data
> >>>> -{
> >>>> -    struct domain *d;
> >>>> -    p2m_type_t p2mt;
> >>>> -};
> >>>> -
> >>>>    /* Override macros from asm/page.h to make them work with mfn_t */
> >>>>    #undef virt_to_mfn
> >>>>    #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> >>>> @@ -1641,10 +1634,11 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
> >>>>                                           const struct dt_irq *dt_irq,
> >>>>                                           void *data)
> >>>>    {
> >>>> -    struct domain *d = data;
> >>>> +    struct map_range_data *mr_data = data;
> >>>> +    struct domain *d = mr_data->d;
> >>>>        unsigned int irq = dt_irq->irq;
> >>>>        int res;
> >>>> -    bool need_mapping = !dt_device_for_passthrough(dev);
> >>>> +    bool need_mapping = !mr_data->skip_mapping;
> >>>>    
> >>>>        if ( irq < NR_LOCAL_IRQS )
> >>>>        {
> >>>> @@ -1668,13 +1662,12 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
> >>>>        return 0;
> >>>>    }
> >>>>    
> >>>> -static int __init map_range_to_domain(const struct dt_device_node *dev,
> >>>> -                                      u64 addr, u64 len,
> >>>> -                                      void *data)
> >>>> +int __init map_range_to_domain(const struct dt_device_node *dev,
> >>>> +                               u64 addr, u64 len, void *data)
> >>>>    {
> >>>>        struct map_range_data *mr_data = data;
> >>>>        struct domain *d = mr_data->d;
> >>>> -    bool need_mapping = !dt_device_for_passthrough(dev);
> >>>> +    bool need_mapping = !mr_data->skip_mapping;
> >>>>        int res;
> >>>>    
> >>>>        /*
> >>>> @@ -1726,23 +1719,21 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
> >>>>     * then we may need to perform additional mappings in order to make
> >>>>     * the child resources available to domain 0.
> >>>>     */
> >>>> -static int __init map_device_children(struct domain *d,
> >>>> -                                      const struct dt_device_node *dev,
> >>>> -                                      p2m_type_t p2mt)
> >>>> +static int __init map_device_children(const struct dt_device_node *dev,
> >>>> +                                      struct map_range_data *mr_data)
> >>>>    {
> >>>> -    struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >>>> -    int ret;
> >>>> -
> >>>>        if ( dt_device_type_is_equal(dev, "pci") )
> >>>>        {
> >>>> +        int ret;
> >>>> +
> >>>>            dt_dprintk("Mapping children of %s to guest\n",
> >>>>                       dt_node_full_name(dev));
> >>>>    
> >>>> -        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, d);
> >>>> +        ret = dt_for_each_irq_map(dev, &map_dt_irq_to_domain, mr_data);
> >>>>            if ( ret < 0 )
> >>>>                return ret;
> >>>>    
> >>>> -        ret = dt_for_each_range(dev, &map_range_to_domain, &mr_data);
> >>>> +        ret = dt_for_each_range(dev, &map_range_to_domain, mr_data);
> >>>>            if ( ret < 0 )
> >>>>                return ret;
> >>>>        }
> >>>> @@ -1823,6 +1814,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>>>        int res;
> >>>>        u64 addr, size;
> >>>>        bool need_mapping = !dt_device_for_passthrough(dev);
> >>>> +    struct map_range_data mr_data = {
> >>>> +        .d = d,
> >>>> +        .p2mt = p2mt,
> >>>> +        .skip_mapping = !need_mapping ||
> >>>> +                        (is_pci_passthrough_enabled() &&
> >>>> +                         (device_get_class(dev) == DEVICE_PCI))
> >>>> +    };
> >>> I would prefer if we did this:
> >>>
> >>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>> index 2a3c641476..1e7cdd11c7 100644
> >>> --- a/xen/arch/arm/domain_build.c
> >>> +++ b/xen/arch/arm/domain_build.c
> >>> @@ -1813,13 +1813,13 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>>        unsigned int i;
> >>>        int res;
> >>>        u64 addr, size;
> >>> -    bool need_mapping = !dt_device_for_passthrough(dev);
> >>> +    bool need_mapping = !dt_device_for_passthrough(dev) &&
> >>> +                        !(is_pci_passthrough_enabled() &&
> >>> +                          (device_get_class(dev) == DEVICE_PCI))
> >>>        struct map_range_data mr_data = {
> >>>            .d = d,
> >>>            .p2mt = p2mt,
> >>> -        .skip_mapping = !need_mapping ||
> >>> -                        (is_pci_passthrough_enabled() &&
> >>> -                         (device_get_class(dev) == DEVICE_PCI))
> >>> +        .skip_mapping = !need_mapping
> >>>        };
> >>>    
> >>>        naddr = dt_number_of_address(dev);
> >>>
> >>>
> >>> This actually makes a difference because otherwise
> >>> handle_device_interrupts could still be called with need_mapping ==
> >>> true for PCI devices.
> >> And this is totally fine. Remember the name of the patch: we do not
> >> want to map memory, but interrupts are still needed for the bridge to
> >> function in Dom0. The same as it needs all, but "cfg" from "regs".
> >> So, need_mapping == true for interrupts is what we want: legacy INTx
> >> are not supported by design and only MSI/MSI-X are supported.
> > Ah, makes sense, I momentarily lost sight of that. Could you please add
> > an in-code comment to explain it, something like:
> >
> > /*
> >   * For PCI passthrough we need to remap only interrupts to Dom0. MMIO
> >   * regions of the PCIe Root Complex and PCIe aperture should not be
> >   * mapped automatically to Dom0.
> >   */
> > struct map_range_data mr_data = {
> >          .d = d,
> >          .p2mt = p2mt,
> >          .skip_mapping = !need_mapping ||
> >                          (is_pci_passthrough_enabled() &&
> >                           (device_get_class(dev) == DEVICE_PCI))
> >      };
> Sure will add this so we don't step on it in the future.
> Other than this and the missed comment below do you think
> that patch is good to go now?

Yes, I think it is good to go. I plan to add my reviewed-by next time
just to double-check that everything is as expected.


> >>> What do you think?  One more comment below.
> >>>
> >>>
> >>>>        naddr = dt_number_of_address(dev);
> >>>>    
> >>>> @@ -1862,7 +1860,6 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>>>        /* Give permission and map MMIOs */
> >>>>        for ( i = 0; i < naddr; i++ )
> >>>>        {
> >>>> -        struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
> >>>>            res = dt_device_get_address(dev, i, &addr, &size);
> >>>>            if ( res )
> >>>>            {
> >>>> @@ -1876,7 +1873,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
> >>>>                return res;
> >>>>        }
> >>>>    
> >>>> -    res = map_device_children(d, dev, p2mt);
> >>>> +    res = map_device_children(dev, &mr_data);
> >>>>        if ( res )
> >>>>            return res;
> >>>>    
> >>>> @@ -3034,7 +3031,14 @@ static int __init construct_dom0(struct domain *d)
> >>>>            return rc;
> >>>>    
> >>>>        if ( acpi_disabled )
> >>>> +    {
> >>>>            rc = prepare_dtb_hwdom(d, &kinfo);
> >>>> +        if ( rc < 0 )
> >>>> +            return rc;
> >>>> +#ifdef CONFIG_HAS_PCI
> >>>> +        rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
> >>>> +#endif
> >>>> +    }
> >>>>        else
> >>>>            rc = prepare_acpi(d, &kinfo);
> >>>>    
> >>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> >>>> index 602d00799c8d..b81620074a91 100644
> >>>> --- a/xen/arch/arm/pci/ecam.c
> >>>> +++ b/xen/arch/arm/pci/ecam.c
> >>>> @@ -40,6 +40,19 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> >>>>        return base + (PCI_DEVFN2(sbdf.bdf) << devfn_shift) + where;
> >>>>    }
> >>>>    
> >>>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> >>>> +                                     struct pci_host_bridge *bridge,
> >>>> +                                     uint64_t addr)
> >>>> +{
> >>>> +    struct pci_config_window *cfg = bridge->cfg;
> >>>> +
> >>>> +    /*
> >>>> +     * We do not want ECAM address space to be mapped in Domain-0's p2m,
> >>>> +     * so we can trap access to it.
> >>>> +     */
> >>>> +    return cfg->phys_addr != addr;
> >>>> +}
> >>>> +
> >>>>    /* ECAM ops */
> >>>>    const struct pci_ecam_ops pci_generic_ecam_ops = {
> >>>>        .bus_shift  = 20,
> >>>> @@ -47,6 +60,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
> >>>>            .map_bus                = pci_ecam_map_bus,
> >>>>            .read                   = pci_generic_config_read,
> >>>>            .write                  = pci_generic_config_write,
> >>>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
> >>>>        }
> >>>>    };
> >>>>    
> >>>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> >>>> index 1eb4daa87365..085f08e23e0c 100644
> >>>> --- a/xen/arch/arm/pci/pci-host-common.c
> >>>> +++ b/xen/arch/arm/pci/pci-host-common.c
> >>>> @@ -18,6 +18,7 @@
> >>>>    
> >>>>    #include <xen/init.h>
> >>>>    #include <xen/pci.h>
> >>>> +#include <asm/setup.h>
> >>>>    #include <xen/rwlock.h>
> >>>>    #include <xen/sched.h>
> >>>>    #include <xen/vmap.h>
> >>>> @@ -320,6 +321,54 @@ int pci_host_get_num_bridges(void)
> >>>>        return count;
> >>>>    }
> >>>>    
> >>>> +int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
> >>>> +{
> >>>> +    struct pci_host_bridge *bridge;
> >>>> +    struct map_range_data mr_data = {
> >>>> +        .d = d,
> >>>> +        .p2mt = p2mt,
> >>>> +        .skip_mapping = false
> >>>> +    };
> >>>> +
> >>>> +    /*
> >>>> +     * For each PCI host bridge we need to only map those ranges
> >>>> +     * which are used by Domain-0 to properly initialize the bridge,
> >>>> +     * e.g. we do not want to map ECAM configuration space which lives in
> >>>> +     * "reg" or "assigned-addresses" device tree property, but we want to
> >>>> +     * map other regions of the host bridge. The PCI aperture defined by
> >>>> +     * the "ranges" device tree property should also be skipped.
> >>>> +     */
> >>>> +    list_for_each_entry( bridge, &pci_host_bridges, node )
> >>>> +    {
> >>>> +        const struct dt_device_node *dev = bridge->dt_node;
> >>>> +        unsigned int i;
> >>>> +
> >>>> +        for ( i = 0; i < dt_number_of_address(dev); i++ )
> >>>> +        {
> >>>> +            uint64_t addr, size;
> >>>> +            int err;
> >>>> +
> >>>> +            err = dt_device_get_address(dev, i, &addr, &size);
> >>>> +            if ( err )
> >>>> +            {
> >>>> +                printk(XENLOG_ERR
> >>>> +                       "Unable to retrieve address range index=%u for %s\n",
> >>>> +                       i, dt_node_full_name(dev));
> >>>> +                return err;
> >>>> +            }
> >>>> +
> >>>> +            if ( bridge->ops->need_p2m_hwdom_mapping(d, bridge, addr) )
> >>>> +            {
> >>>> +                err = map_range_to_domain(dev, addr, size, &mr_data);
> >>>> +                if ( err )
> >>>> +                    return err;
> >>>> +            }
> >>>> +        }
> >>>> +    }
> >>>> +
> >>>> +    return 0;
> >>>> +}
> >>>> +
> >>>>    /*
> >>>>     * Local variables:
> >>>>     * mode: C
> >>>> diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
> >>>> index 61a9807d3d58..6ad2b31e810d 100644
> >>>> --- a/xen/arch/arm/pci/pci-host-zynqmp.c
> >>>> +++ b/xen/arch/arm/pci/pci-host-zynqmp.c
> >>>> @@ -34,6 +34,7 @@ const struct pci_ecam_ops nwl_pcie_ops = {
> >>>>            .map_bus                = pci_ecam_map_bus,
> >>>>            .read                   = pci_generic_config_read,
> >>>>            .write                  = pci_generic_config_write,
> >>>> +        .need_p2m_hwdom_mapping = pci_ecam_need_p2m_hwdom_mapping,
> >>>>        }
> >>>>    };
> >>>>    
> >>>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> >>>> index a62d8bc60086..d9a3c2a4f3b3 100644
> >>>> --- a/xen/include/asm-arm/pci.h
> >>>> +++ b/xen/include/asm-arm/pci.h
> >>>> @@ -17,6 +17,8 @@
> >>>>    
> >>>>    #ifdef CONFIG_HAS_PCI
> >>>>    
> >>>> +#include <asm/p2m.h>
> >>>> +
> >>>>    #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
> >>>>    
> >>>>    extern bool_t pci_passthrough_enabled;
> >>>> @@ -73,6 +75,9 @@ struct pci_ops {
> >>>>                    uint32_t reg, uint32_t len, uint32_t *value);
> >>>>        int (*write)(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
> >>>>                     uint32_t reg, uint32_t len, uint32_t value);
> >>>> +    bool (*need_p2m_hwdom_mapping)(struct domain *d,
> >>>> +                                   struct pci_host_bridge *bridge,
> >>>> +                                   uint64_t addr);
> >>>>    };
> >>>>    
> >>>>    /*
> >>>> @@ -96,6 +101,9 @@ int pci_generic_config_write(struct pci_host_bridge *bridge, pci_sbdf_t sbdf,
> >>>>                                 uint32_t reg, uint32_t len, uint32_t value);
> >>>>    void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> >>>>                                   pci_sbdf_t sbdf, uint32_t where);
> >>>> +bool pci_ecam_need_p2m_hwdom_mapping(struct domain *d,
> >>>> +                                     struct pci_host_bridge *bridge,
> >>>> +                                     uint64_t addr);
> >>>>    struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
> >>>>    int pci_get_host_bridge_segment(const struct dt_device_node *node,
> >>>>                                    uint16_t *segment);
> >>>> @@ -113,6 +121,8 @@ int pci_host_iterate_bridges(struct domain *d,
> >>>>                                            struct pci_host_bridge *bridge));
> >>>>    int pci_host_get_num_bridges(void);
> >>>>    
> >>>> +int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
> >>>> +
> >>>>    #else   /*!CONFIG_HAS_PCI*/
> >>>>    
> >>>>    struct arch_pci_dev { };
> >>>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
> >>>> index 95da0b7ab9cd..5b30135fda38 100644
> >>>> --- a/xen/include/asm-arm/setup.h
> >>>> +++ b/xen/include/asm-arm/setup.h
> >>>> @@ -2,6 +2,8 @@
> >>>>    #define __ARM_SETUP_H_
> >>>>    
> >>>>    #include <public/version.h>
> >>>> +#include <asm/p2m.h>
> >>>> +#include <xen/device_tree.h>
> >>>>    
> >>>>    #define MIN_FDT_ALIGN 8
> >>>>    #define MAX_FDT_SIZE SZ_2M
> >>>> @@ -77,6 +79,14 @@ struct bootinfo {
> >>>>    #endif
> >>>>    };
> >>>>    
> >>>> +struct map_range_data
> >>>> +{
> >>>> +    struct domain *d;
> >>>> +    p2m_type_t p2mt;
> >>>> +    /* Set if mappings for PCI host bridges must not be skipped. */
> >>> This comment still needs updating.
> >> Sorry, I missed this one
> >>>
> >>>> +    bool skip_mapping;
> >>>> +};
> >>>> +
> >>>>    extern struct bootinfo bootinfo;
> >>>>    
> >>>>    extern domid_t max_init_domid;
> >>>> @@ -124,6 +134,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
> >>>>    u32 device_tree_get_u32(const void *fdt, int node,
> >>>>                            const char *prop_name, u32 dflt);
> >>>>    
> >>>> +int map_range_to_domain(const struct dt_device_node *dev,
> >>>> +                        u64 addr, u64 len, void *data);
> >>>> +
> >>>>    #endif
> >>>>    /*
> >>>>     * Local variables:
> >>>> -- 
> >>>> 2.25.1
> >>>>
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-10-06 21:34 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-04 14:11 [PATCH v4 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
2021-10-04 14:11 ` [PATCH v4 01/11] xen/arm: Fix dev_is_dt macro definition Oleksandr Andrushchenko
2021-10-05  0:57   ` Stefano Stabellini
2021-10-04 14:11 ` [PATCH v4 02/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
2021-10-06 10:15   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
2021-10-06 10:24   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
2021-10-06 10:32   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
2021-10-06 10:33   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
2021-10-06 10:35   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
2021-10-06 10:25   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
2021-10-06 10:39   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
2021-10-06 10:36   ` Rahul Singh
2021-10-04 14:11 ` [PATCH v4 10/11] xen/arm: Do not map PCI ECAM and MMIO space to Domain-0's p2m Oleksandr Andrushchenko
2021-10-05  1:24   ` Stefano Stabellini
2021-10-05  4:32     ` Oleksandr Andrushchenko
2021-10-05 21:43       ` Stefano Stabellini
2021-10-06  4:55         ` Oleksandr Andrushchenko
2021-10-06 21:33           ` Stefano Stabellini
2021-10-04 14:11 ` [PATCH v4 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
2021-10-06 10:24   ` Rahul Singh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.