* [PATCH 00/11] PCI devices passthrough on Arm, part 2
@ 2021-09-03 8:33 Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 01/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
` (10 more replies)
0 siblings, 11 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Hi, all!
This is an assorted series of patches whose aim is to lay some further
groundwork for PCI passthrough support on Arm. The series continues the
work published earlier by Arm [1] and adds new helpers, clearing the way
for the vPCI changes which will follow.
Thank you,
Oleksandr
[1] https://patchwork.kernel.org/project/xen-devel/cover/cover.1629366665.git.rahul.singh@arm.com/
Oleksandr Andrushchenko (10):
xen/arm: Add new device type for PCI
xen/arm: Add dev_to_pci helper
xen/arm: Introduce pci_find_host_bridge_node helper
xen/device-tree: Make dt_find_node_by_phandle global
xen/arm: Mark device as PCI while creating one
libxl: Allow removing PCI devices for all types of domains
libxl: Only map legacy PCI IRQs if they are supported
xen/arm: Setup MMIO range trap handlers for hardware domain
xen/arm: Do not map PCI ECAM space to Domain-0's p2m
xen/arm: Process pending vPCI map/unmap operations
Oleksandr Tyshchenko (1):
xen/domain: Call pci_release_devices() when releasing domain resources
tools/libs/light/Makefile | 4 +++
tools/libs/light/libxl_pci.c | 15 ++++++--
xen/arch/arm/domain.c | 9 ++++-
xen/arch/arm/domain_build.c | 3 ++
xen/arch/arm/pci/ecam.c | 28 +++++++++++++++
xen/arch/arm/pci/pci-host-common.c | 55 ++++++++++++++++++++++++++++++
xen/arch/arm/pci/pci.c | 10 ++++++
xen/arch/arm/traps.c | 6 ++++
xen/arch/arm/vpci.c | 13 +++++++
xen/common/device_tree.c | 2 +-
xen/drivers/passthrough/pci.c | 3 ++
xen/include/asm-arm/device.h | 6 ++--
xen/include/asm-arm/pci.h | 30 +++++++++++++++-
xen/include/xen/device_tree.h | 2 ++
14 files changed, 178 insertions(+), 8 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH 01/11] xen/arm: Add new device type for PCI
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-09 17:19 ` Julien Grall
2021-09-03 8:33 ` [PATCH 02/11] xen/arm: Add dev_to_pci helper Oleksandr Andrushchenko
` (9 subsequent siblings)
10 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Add a new device type (DEV_PCI) to distinguish PCI devices from platform
DT devices, so that some drivers, like the IOMMU, can handle PCI devices
differently.
While at it, fix the dev_is_dt macro, which had unbalanced parentheses.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/include/asm-arm/device.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 5ecd5e7bd15e..7bf040560363 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -4,6 +4,7 @@
enum device_type
{
DEV_DT,
+ DEV_PCI,
};
struct dev_archdata {
@@ -25,9 +26,8 @@ typedef struct device device_t;
#include <xen/device_tree.h>
-/* TODO: Correctly implement dev_is_pci when PCI is supported on ARM */
-#define dev_is_pci(dev) ((void)(dev), 0)
-#define dev_is_dt(dev) ((dev->type == DEV_DT)
+#define dev_is_pci(dev) ((dev)->type == DEV_PCI)
+#define dev_is_dt(dev) ((dev)->type == DEV_DT)
enum device_class
{
--
2.25.1
* [PATCH 02/11] xen/arm: Add dev_to_pci helper
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 01/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
` (8 subsequent siblings)
10 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Add a helper which, given a struct device, returns the corresponding
struct pci_dev which this device is a part of.
Because of the header cross-dependencies, e.g. we need both
struct pci_dev and struct arch_pci_dev at the same time, this cannot be
done with an inline function. A macro could be implemented, but looks scary:
#define dev_to_pci_dev(dev) container_of(container_of((dev), \
struct arch_pci_dev, dev), struct pci_dev, arch)
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/arch/arm/pci/pci.c | 10 ++++++++++
xen/include/asm-arm/pci.h | 7 +++++++
2 files changed, 17 insertions(+)
diff --git a/xen/arch/arm/pci/pci.c b/xen/arch/arm/pci/pci.c
index dc63bbc2a2c1..6573f179af55 100644
--- a/xen/arch/arm/pci/pci.c
+++ b/xen/arch/arm/pci/pci.c
@@ -26,6 +26,16 @@ int arch_pci_clean_pirqs(struct domain *d)
return 0;
}
+struct pci_dev *dev_to_pci(struct device *dev)
+{
+ struct arch_pci_dev *arch_dev;
+
+ ASSERT(dev->type == DEV_PCI);
+
+ arch_dev = container_of((dev), struct arch_pci_dev, dev);
+ return container_of(arch_dev, struct pci_dev, arch);
+}
+
static int __init dt_pci_init(void)
{
struct dt_device_node *np;
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 2d4610a23a25..e1aa05190bda 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -27,6 +27,13 @@ struct arch_pci_dev {
struct device dev;
};
+/*
+ * Because of the header cross-dependencies, e.g. we need both
+ * struct pci_dev and struct arch_pci_dev at the same time, this cannot be
+ * done with an inline here. Macro can be implemented, but looks scary.
+ */
+struct pci_dev *dev_to_pci(struct device *dev);
+
/* Arch-specific MSI data for vPCI. */
struct vpci_arch_msi {
};
--
2.25.1
* [PATCH 03/11] xen/arm: Introduce pci_find_host_bridge_node helper
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 01/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 02/11] xen/arm: Add dev_to_pci helper Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
` (7 subsequent siblings)
10 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Get host bridge node given a PCI device attached to it.
This helper will be re-used for adding PCI devices by the subsequent
patches.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
xen/arch/arm/pci/pci-host-common.c | 17 +++++++++++++++++
xen/include/asm-arm/pci.h | 1 +
2 files changed, 18 insertions(+)
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index 5e34252deb9d..d2fef5476b8e 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -301,6 +301,23 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
return -EINVAL;
}
+/*
+ * Get host bridge node given a device attached to it.
+ */
+struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
+{
+ struct pci_host_bridge *bridge;
+ struct pci_dev *pdev = dev_to_pci(dev);
+
+ bridge = pci_find_host_bridge(pdev->seg, pdev->bus);
+ if ( unlikely(!bridge) )
+ {
+ printk(XENLOG_ERR "Unable to find PCI bridge for "PRI_pci"\n",
+ pdev->seg, pdev->bus, pdev->sbdf.dev, pdev->sbdf.fn);
+ return NULL;
+ }
+ return bridge->dt_node;
+}
/*
* Local variables:
* mode: C
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index e1aa05190bda..7dc4c8dc9026 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -105,6 +105,7 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
struct pci_host_bridge *pci_find_host_bridge(uint16_t segment, uint8_t bus);
int pci_get_host_bridge_segment(const struct dt_device_node *node,
uint16_t *segment);
+struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
#else /*!CONFIG_HAS_PCI*/
--
2.25.1
* [PATCH 04/11] xen/device-tree: Make dt_find_node_by_phandle global
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (2 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
` (6 subsequent siblings)
10 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Make dt_find_node_by_phandle globally visible, so it can be re-used by
other frameworks.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/common/device_tree.c | 2 +-
xen/include/xen/device_tree.h | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 03d25a81cea8..c2e33b99832f 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -986,7 +986,7 @@ int dt_for_each_range(const struct dt_device_node *dev,
*
* Returns a node pointer.
*/
-static struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
+struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle)
{
struct dt_device_node *np;
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index b02696be9416..07393da1df90 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -776,6 +776,8 @@ int dt_count_phandle_with_args(const struct dt_device_node *np,
const char *list_name,
const char *cells_name);
+struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle);
+
#ifdef CONFIG_DEVICE_TREE_DEBUG
#define dt_dprintk(fmt, args...) \
printk(XENLOG_DEBUG fmt, ## args)
--
2.25.1
* [PATCH 05/11] xen/arm: Mark device as PCI while creating one
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (3 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 12:41 ` Jan Beulich
2021-09-03 8:33 ` [PATCH 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
` (5 subsequent siblings)
10 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
While adding a PCI device mark it as such, so other frameworks
can distinguish it from DT devices.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/drivers/passthrough/pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 56e261e9bd08..25304dbe9956 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1301,6 +1301,9 @@ static int iommu_add_device(struct pci_dev *pdev)
if ( !is_iommu_enabled(pdev->domain) )
return 0;
+#ifdef CONFIG_ARM
+ pci_to_dev(pdev)->type = DEV_PCI;
+#endif
rc = hd->platform_ops->add_device(pdev->devfn, pci_to_dev(pdev));
if ( rc || !pdev->phantom_stride )
return rc;
--
2.25.1
* [PATCH 06/11] xen/domain: Call pci_release_devices() when releasing domain resources
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (4 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-10 18:45 ` Stefano Stabellini
2021-09-03 8:33 ` [PATCH 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
` (4 subsequent siblings)
10 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
This is the very same as what we do for DT devices. What is more,
x86 already calls pci_release_devices().
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
xen/arch/arm/domain.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index d99c653626e4..4e40c4098280 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -985,7 +985,8 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list)
* function which may return -ERESTART.
*/
enum {
- PROG_tee = 1,
+ PROG_pci = 1,
+ PROG_tee,
PROG_xen,
PROG_page,
PROG_mapping,
@@ -1022,6 +1023,12 @@ int domain_relinquish_resources(struct domain *d)
#ifdef CONFIG_IOREQ_SERVER
ioreq_server_destroy_all(d);
#endif
+#ifdef CONFIG_HAS_PCI
+ PROGRESS(pci):
+ ret = pci_release_devices(d);
+ if ( ret )
+ return ret;
+#endif
PROGRESS(tee):
ret = tee_relinquish_resources(d);
--
2.25.1
* [PATCH 07/11] libxl: Allow removing PCI devices for all types of domains
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (5 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
` (3 subsequent siblings)
10 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko, Ian Jackson, Juergen Gross
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
The PCI device remove path may now be used by PVH on ARM, so the
assert is no longer valid.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Juergen Gross <jgross@suse.com>
---
tools/libs/light/libxl_pci.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 1a1c2630803b..59f3686fc85e 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1947,8 +1947,6 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
goto out_fail;
}
} else {
- assert(type == LIBXL_DOMAIN_TYPE_PV);
-
char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
pci->bus, pci->dev, pci->func);
FILE *f = fopen(sysfs_path, "r");
--
2.25.1
* [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (6 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 10:26 ` Juergen Gross
2021-09-10 19:06 ` Stefano Stabellini
2021-09-03 8:33 ` [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
` (2 subsequent siblings)
10 siblings, 2 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko, Ian Jackson, Juergen Gross
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Arm's PCI passthrough implementation doesn't support legacy interrupts,
only MSI/MSI-X. This can be the case for other platforms too.
For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ option, add
it to the CFLAGS, and compile the relevant code in the toolstack only
where applicable.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Juergen Gross <jgross@suse.com>
---
tools/libs/light/Makefile | 4 ++++
tools/libs/light/libxl_pci.c | 13 +++++++++++++
2 files changed, 17 insertions(+)
diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 7d8c51d49242..bd3f6be2a183 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
-Wno-declaration-after-statement -Wformat-nonliteral
CFLAGS += -I.
+ifeq ($(CONFIG_X86),y)
+CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
+endif
+
SRCS-$(CONFIG_X86) += libxl_cpuid.c
SRCS-$(CONFIG_X86) += libxl_x86.c
SRCS-$(CONFIG_X86) += libxl_psr.c
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 59f3686fc85e..cd4fea46c3f7 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1434,6 +1434,7 @@ static void pci_add_dm_done(libxl__egc *egc,
}
}
fclose(f);
+#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
pci->bus, pci->dev, pci->func);
f = fopen(sysfs_path, "r");
@@ -1460,6 +1461,7 @@ static void pci_add_dm_done(libxl__egc *egc,
}
}
fclose(f);
+#endif
/* Don't restrict writes to the PCI config space from this VM */
if (pci->permissive) {
@@ -1471,7 +1473,9 @@ static void pci_add_dm_done(libxl__egc *egc,
}
}
+#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
out_no_irq:
+#endif
if (!isstubdom) {
if (pci->rdm_policy == LIBXL_RDM_RESERVE_POLICY_STRICT) {
flag &= ~XEN_DOMCTL_DEV_RDM_RELAXED;
@@ -1951,7 +1955,9 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
pci->bus, pci->dev, pci->func);
FILE *f = fopen(sysfs_path, "r");
unsigned int start = 0, end = 0, flags = 0, size = 0;
+#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
int irq = 0;
+#endif
int i;
if (f == NULL) {
@@ -1983,6 +1989,7 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
}
fclose(f);
skip1:
+#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
pci->bus, pci->dev, pci->func);
f = fopen(sysfs_path, "r");
@@ -2001,8 +2008,14 @@ skip1:
}
}
fclose(f);
+#else
+ /* Silence error: label at end of compound statement */
+ ;
+#endif
}
+#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
skip_irq:
+#endif
rc = 0;
out_fail:
pci_remove_detached(egc, prs, rc); /* must be last */
--
2.25.1
* [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (7 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-09 17:43 ` Julien Grall
2021-09-10 20:12 ` Stefano Stabellini
2021-09-03 8:33 ` [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
10 siblings, 2 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
In order for vPCI to work, all accesses to the PCI configuration space
(ECAM) need to be synchronized among all entities, e.g. the hardware
domain and guests. For that, implement PCI host bridge specific
callbacks to properly set up those ranges depending on the particular
host bridge implementation.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/arch/arm/pci/ecam.c | 11 +++++++++++
xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
xen/arch/arm/vpci.c | 13 +++++++++++++
xen/include/asm-arm/pci.h | 8 ++++++++
4 files changed, 48 insertions(+)
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index 91c691b41fdf..92ecb2e0762b 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
}
+static int pci_ecam_register_mmio_handler(struct domain *d,
+ struct pci_host_bridge *bridge,
+ const struct mmio_handler_ops *ops)
+{
+ struct pci_config_window *cfg = bridge->sysdata;
+
+ register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
+ return 0;
+}
+
/* ECAM ops */
const struct pci_ecam_ops pci_generic_ecam_ops = {
.bus_shift = 20,
@@ -49,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
.map_bus = pci_ecam_map_bus,
.read = pci_generic_config_read,
.write = pci_generic_config_write,
+ .register_mmio_handler = pci_ecam_register_mmio_handler,
}
};
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index d2fef5476b8e..a89112bfbb7c 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -318,6 +318,22 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
}
return bridge->dt_node;
}
+
+int pci_host_iterate_bridges(struct domain *d,
+ int (*clb)(struct domain *d,
+ struct pci_host_bridge *bridge))
+{
+ struct pci_host_bridge *bridge;
+ int err;
+
+ list_for_each_entry( bridge, &pci_host_bridges, node )
+ {
+ err = clb(d, bridge);
+ if ( err )
+ return err;
+ }
+ return 0;
+}
/*
* Local variables:
* mode: C
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index da8b1ca13c07..258134292458 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -74,11 +74,24 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
.write = vpci_mmio_write,
};
+static int vpci_setup_mmio_handler(struct domain *d,
+ struct pci_host_bridge *bridge)
+{
+ if ( bridge->ops->register_mmio_handler )
+ return bridge->ops->register_mmio_handler(d, bridge,
+ &vpci_mmio_handler);
+ return 0;
+}
+
int domain_vpci_init(struct domain *d)
{
if ( !has_vpci(d) )
return 0;
+ if ( is_hardware_domain(d) )
+ return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
+
+ /* Guest domains use what is programmed in their device tree. */
register_mmio_handler(d, &vpci_mmio_handler,
GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 7dc4c8dc9026..2c7c7649e00f 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -17,6 +17,8 @@
#ifndef __ARM_PCI_H__
#define __ARM_PCI_H__
+#include <asm/mmio.h>
+
#ifdef CONFIG_HAS_PCI
#define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
@@ -77,6 +79,9 @@ struct pci_ops {
uint32_t reg, uint32_t len, uint32_t *value);
int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
uint32_t reg, uint32_t len, uint32_t value);
+ int (*register_mmio_handler)(struct domain *d,
+ struct pci_host_bridge *bridge,
+ const struct mmio_handler_ops *ops);
};
/*
@@ -107,6 +112,9 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
uint16_t *segment);
struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
+int pci_host_iterate_bridges(struct domain *d,
+ int (*clb)(struct domain *d,
+ struct pci_host_bridge *bridge));
#else /*!CONFIG_HAS_PCI*/
struct arch_pci_dev { };
--
2.25.1
* [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (8 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-09 17:58 ` Julien Grall
2021-09-03 8:33 ` [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
10 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
The host bridge controller's ECAM space is mapped into Domain-0's p2m,
thus it is not possible to trap accesses to it for vPCI via MMIO
handlers. For this to work we need to skip mapping those ranges while
constructing the domain.
Note that during Domain-0 creation there is no pci_dev allocated yet for
the host bridges, thus we cannot match a PCI host and its associated
bridge by SBDF. Use the dt_device_node field for the checks instead.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/arch/arm/domain_build.c | 3 +++
xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
xen/include/asm-arm/pci.h | 12 ++++++++++++
4 files changed, 54 insertions(+)
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index da427f399711..76f5b513280c 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
}
}
+ if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
+ need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
+
if ( need_mapping )
{
res = map_regions_p2mt(d,
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index 92ecb2e0762b..d32efb7fcbd0 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -52,6 +52,22 @@ static int pci_ecam_register_mmio_handler(struct domain *d,
return 0;
}
+static int pci_ecam_need_p2m_mapping(struct domain *d,
+ struct pci_host_bridge *bridge,
+ uint64_t addr, uint64_t len)
+{
+ struct pci_config_window *cfg = bridge->sysdata;
+
+ if ( !is_hardware_domain(d) )
+ return true;
+
+ /*
+ * We do not want ECAM address space to be mapped in domain's p2m,
+ * so we can trap access to it.
+ */
+ return cfg->phys_addr != addr;
+}
+
/* ECAM ops */
const struct pci_ecam_ops pci_generic_ecam_ops = {
.bus_shift = 20,
@@ -60,6 +76,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
.read = pci_generic_config_read,
.write = pci_generic_config_write,
.register_mmio_handler = pci_ecam_register_mmio_handler,
+ .need_p2m_mapping = pci_ecam_need_p2m_mapping,
}
};
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index a89112bfbb7c..c04be636452d 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -334,6 +334,28 @@ int pci_host_iterate_bridges(struct domain *d,
}
return 0;
}
+
+bool pci_host_bridge_need_p2m_mapping(struct domain *d,
+ const struct dt_device_node *node,
+ uint64_t addr, uint64_t len)
+{
+ struct pci_host_bridge *bridge;
+
+ list_for_each_entry( bridge, &pci_host_bridges, node )
+ {
+ if ( bridge->dt_node != node )
+ continue;
+
+ if ( !bridge->ops->need_p2m_mapping )
+ return true;
+
+ return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
+ }
+ printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
+ node->full_name, bridge->segment, addr);
+ return true;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 2c7c7649e00f..9c28a4bdc4b7 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -82,6 +82,8 @@ struct pci_ops {
int (*register_mmio_handler)(struct domain *d,
struct pci_host_bridge *bridge,
const struct mmio_handler_ops *ops);
+ int (*need_p2m_mapping)(struct domain *d, struct pci_host_bridge *bridge,
+ uint64_t addr, uint64_t len);
};
/*
@@ -115,9 +117,19 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
int pci_host_iterate_bridges(struct domain *d,
int (*clb)(struct domain *d,
struct pci_host_bridge *bridge));
+bool pci_host_bridge_need_p2m_mapping(struct domain *d,
+ const struct dt_device_node *node,
+ uint64_t addr, uint64_t len);
#else /*!CONFIG_HAS_PCI*/
struct arch_pci_dev { };
+static inline bool
+pci_host_bridge_need_p2m_mapping(struct domain *d,
+ const struct dt_device_node *node,
+ uint64_t addr, uint64_t len)
+{
+ return true;
+}
#endif /*!CONFIG_HAS_PCI*/
#endif /* __ARM_PCI_H__ */
--
2.25.1
* [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
` (9 preceding siblings ...)
2021-09-03 8:33 ` [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-09-03 8:33 ` Oleksandr Andrushchenko
2021-09-03 9:04 ` Julien Grall
10 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 8:33 UTC (permalink / raw)
To: xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
vPCI may map and unmap PCI device memory (BARs) being passed through,
which may take a lot of time. For this reason those operations may be
deferred to be performed later, so that they can be safely preempted.
Run the corresponding vPCI code while switching a vCPU.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/arch/arm/traps.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 219ab3c3fbde..1571fb8afd03 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -34,6 +34,7 @@
#include <xen/symbols.h>
#include <xen/version.h>
#include <xen/virtual_region.h>
+#include <xen/vpci.h>
#include <public/sched.h>
#include <public/xen.h>
@@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
}
#endif
+ local_irq_enable();
+ if ( has_vpci(v->domain) && vpci_process_pending(v) )
+ raise_softirq(SCHEDULE_SOFTIRQ);
+ local_irq_disable();
+
if ( likely(!v->arch.need_flush_to_ram) )
return false;
--
2.25.1
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-03 8:33 ` [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
@ 2021-09-03 9:04 ` Julien Grall
2021-09-06 7:02 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-03 9:04 UTC (permalink / raw)
To: Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
Hi Oleksandr,
On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> vPCI may map and unmap PCI device memory (BARs) being passed through which
> may take a lot of time. For this those operations may be deferred to be
> performed later, so that they can be safely preempted.
> Run the corresponding vPCI code while switching a vCPU.
IIUC, you are talking about the function map_range() in
xen/drivers/vpci/header. The function has the following todo for Arm:
/*
* ARM TODOs:
* - On ARM whether the memory is prefetchable or not should be
passed
* to map_mmio_regions in order to decide which memory attributes
* should be used.
*
* - {un}map_mmio_regions doesn't support preemption.
*/
This doesn't seem to be addressed in the two series for PCI passthrough
sent so far. Do you have any plan to handle it?
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> xen/arch/arm/traps.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 219ab3c3fbde..1571fb8afd03 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -34,6 +34,7 @@
> #include <xen/symbols.h>
> #include <xen/version.h>
> #include <xen/virtual_region.h>
> +#include <xen/vpci.h>
>
> #include <public/sched.h>
> #include <public/xen.h>
> @@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
> }
> #endif
>
> + local_irq_enable();
> + if ( has_vpci(v->domain) && vpci_process_pending(v) )
Looking at the code of vpci_process_pending(), it looks like there is
some rework to do for guests. Do you plan to handle it as part of the
vPCI series?
> + raise_softirq(SCHEDULE_SOFTIRQ);
> + local_irq_disable();
> +
From my understanding of vpci_process_pending(), the function will
return true if there is more work to schedule. However, if
check_for_vcpu_work() returns false, then we will return to the guest
before any work for vPCI has finished. This is because
check_for_vcpu_work() will not be called again.
In this case, I think you want to return as soon as you know we need to
reschedule.
However, looking at the rest of the code, we already have a check for
vPCI in the common IOREQ code, so we would end up calling
vpci_process_pending() twice. Maybe we should move the call from the
IOREQ code to arch code.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported
2021-09-03 8:33 ` [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
@ 2021-09-03 10:26 ` Juergen Gross
2021-09-03 10:30 ` Oleksandr Andrushchenko
2021-09-10 19:06 ` Stefano Stabellini
1 sibling, 1 reply; 69+ messages in thread
From: Juergen Gross @ 2021-09-03 10:26 UTC (permalink / raw)
To: Oleksandr Andrushchenko, xen-devel
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko, Ian Jackson
On 03.09.21 10:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Arm's PCI passthrough implementation supports MSI/MSI-X, but not legacy
> interrupts. This can be the case for other platforms too.
> For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ define, add
> it to the CFLAGS, and compile the relevant code in the toolstack only
> where applicable.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Juergen Gross <jgross@suse.com>
> ---
> tools/libs/light/Makefile | 4 ++++
> tools/libs/light/libxl_pci.c | 13 +++++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
> index 7d8c51d49242..bd3f6be2a183 100644
> --- a/tools/libs/light/Makefile
> +++ b/tools/libs/light/Makefile
> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
> -Wno-declaration-after-statement -Wformat-nonliteral
> CFLAGS += -I.
>
> +ifeq ($(CONFIG_X86),y)
> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
> +endif
> +
> SRCS-$(CONFIG_X86) += libxl_cpuid.c
> SRCS-$(CONFIG_X86) += libxl_x86.c
> SRCS-$(CONFIG_X86) += libxl_psr.c
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 59f3686fc85e..cd4fea46c3f7 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1434,6 +1434,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> }
> }
> fclose(f);
> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
Why #ifndef? Shouldn't this be #ifdef (same below multiple times)?
Juergen
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported
2021-09-03 10:26 ` Juergen Gross
@ 2021-09-03 10:30 ` Oleksandr Andrushchenko
0 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 10:30 UTC (permalink / raw)
To: Juergen Gross, Oleksandr Andrushchenko, xen-devel
Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
Ian Jackson
Hello, Juergen!
On 03.09.21 13:26, Juergen Gross wrote:
> On 03.09.21 10:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> Arm's PCI passthrough implementation doesn't support legacy interrupts,
>> but MSI/MSI-X. This can be the case for other platforms too.
>> For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
>> it to the CFLAGS and compile the relevant code in the toolstack only if
>> applicable.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> Cc: Ian Jackson <iwj@xenproject.org>
>> Cc: Juergen Gross <jgross@suse.com>
>> ---
>> tools/libs/light/Makefile | 4 ++++
>> tools/libs/light/libxl_pci.c | 13 +++++++++++++
>> 2 files changed, 17 insertions(+)
>>
>> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
>> index 7d8c51d49242..bd3f6be2a183 100644
>> --- a/tools/libs/light/Makefile
>> +++ b/tools/libs/light/Makefile
>> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
>> -Wno-declaration-after-statement -Wformat-nonliteral
>> CFLAGS += -I.
>> +ifeq ($(CONFIG_X86),y)
>> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
>> +endif
>> +
>> SRCS-$(CONFIG_X86) += libxl_cpuid.c
>> SRCS-$(CONFIG_X86) += libxl_x86.c
>> SRCS-$(CONFIG_X86) += libxl_psr.c
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 59f3686fc85e..cd4fea46c3f7 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1434,6 +1434,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>> }
>> }
>> fclose(f);
>> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
>
> Why #ifndef? Shouldn't this be #ifdef (same below multiple times)?
Yes, you are right. I have to invert the logic, i.e. s/ifndef/ifdef.
Other than that, are you ok with the name CONFIG_PCI_SUPP_LEGACY_IRQ?
Thank you and sorry for the noise,
Oleksandr
>
>
> Juergen
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/11] xen/arm: Mark device as PCI while creating one
2021-09-03 8:33 ` [PATCH 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
@ 2021-09-03 12:41 ` Jan Beulich
2021-09-03 13:26 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Jan Beulich @ 2021-09-03 12:41 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: julien, sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko, xen-devel
On 03.09.2021 10:33, Oleksandr Andrushchenko wrote:
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1301,6 +1301,9 @@ static int iommu_add_device(struct pci_dev *pdev)
> if ( !is_iommu_enabled(pdev->domain) )
> return 0;
>
> +#ifdef CONFIG_ARM
> + pci_to_dev(pdev)->type = DEV_PCI;
> +#endif
Why here instead of in alloc_pdev()? The field should be valid by the time
the new item gets inserted into the segment's list of devices, imo.
Jan
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 05/11] xen/arm: Mark device as PCI while creating one
2021-09-03 12:41 ` Jan Beulich
@ 2021-09-03 13:26 ` Oleksandr Andrushchenko
0 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-03 13:26 UTC (permalink / raw)
To: Jan Beulich, Oleksandr Andrushchenko
Cc: julien, sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
xen-devel
On 03.09.21 15:41, Jan Beulich wrote:
> On 03.09.2021 10:33, Oleksandr Andrushchenko wrote:
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -1301,6 +1301,9 @@ static int iommu_add_device(struct pci_dev *pdev)
>> if ( !is_iommu_enabled(pdev->domain) )
>> return 0;
>>
>> +#ifdef CONFIG_ARM
>> + pci_to_dev(pdev)->type = DEV_PCI;
>> +#endif
> Why here instead of in alloc_pdev()? The field should be valid by the time
> the new item gets inserted into the segment's list of devices, imo.
Yes, makes sense.
Thank you,
Oleksandr
>
> Jan
>
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-03 9:04 ` Julien Grall
@ 2021-09-06 7:02 ` Oleksandr Andrushchenko
2021-09-06 8:48 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-06 7:02 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi, Julien!
On 03.09.21 12:04, Julien Grall wrote:
> Hi Oleksandr,
>
> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> vPCI may map and unmap PCI device memory (BARs) being passed through which
>> may take a lot of time. For this those operations may be deferred to be
>> performed later, so that they can be safely preempted.
>> Run the corresponding vPCI code while switching a vCPU.
>
> IIUC, you are talking about the function map_range() in xen/drivers/vpci/header. The function has the following todo for Arm:
>
> /*
> * ARM TODOs:
> * - On ARM whether the memory is prefetchable or not should be passed
> * to map_mmio_regions in order to decide which memory attributes
> * should be used.
> *
> * - {un}map_mmio_regions doesn't support preemption.
> */
>
> This doesn't seem to be addressed in the two series for PCI passthrough sent so far. Do you have any plan to handle it?
No plan yet.
All the mappings are happening with p2m_mmio_direct_dev as of now.
>
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> ---
>> xen/arch/arm/traps.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 219ab3c3fbde..1571fb8afd03 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -34,6 +34,7 @@
>> #include <xen/symbols.h>
>> #include <xen/version.h>
>> #include <xen/virtual_region.h>
>> +#include <xen/vpci.h>
>> #include <public/sched.h>
>> #include <public/xen.h>
>> @@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
>> }
>> #endif
>> + local_irq_enable();
>> + if ( has_vpci(v->domain) && vpci_process_pending(v) )
>
> Looking at the code of vpci_process_pending(), it looks like there are some rework to do for guest. Do you plan to handle it as part of the vPCI series?
Yes, the vPCI code is heavily touched to support guest non-identity mappings.
>
>> + raise_softirq(SCHEDULE_SOFTIRQ);
>> + local_irq_disable();
>> +
>
> From my understanding of vcpi_process_pending(). The function will return true if there are more work to schedule.
Yes
> However, if check_for_vcpu_for_work() return false, then we will return to the guest before any work for vCPI has finished. This is because check_for_vcpu_work() will not be called again.
Correct
>
> In this case, I think you want to return as soon as you know we need to reschedule.
Not sure I understand this
>
> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
Which may not be enabled, as it depends on CONFIG_IOREQ_SERVER.
My understanding is that for x86 it is always enabled, but this might not be the case for Arm.
> So we would end up to call twice vpci_process_pending().
So, if CONFIG_IOREQ_SERVER is not enabled, then on Arm we end up only calling it from traps.c.
> Maybe we should move the call from the IOREQ to arch-code.
Hm. I would rather think of moving it from IOREQ to some other common code: for x86 (if
my understanding about CONFIG_IOREQ_SERVER is correct) it is by coincidence that we call
vPCI code from there, as IOREQ is always enabled.
>
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-06 7:02 ` Oleksandr Andrushchenko
@ 2021-09-06 8:48 ` Julien Grall
2021-09-06 9:14 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-06 8:48 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 06/09/2021 08:02, Oleksandr Andrushchenko wrote:
> Hi, Julien!
Hi Oleksandr,
> On 03.09.21 12:04, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> vPCI may map and unmap PCI device memory (BARs) being passed through which
>>> may take a lot of time. For this those operations may be deferred to be
>>> performed later, so that they can be safely preempted.
>>> Run the corresponding vPCI code while switching a vCPU.
>>
>> IIUC, you are talking about the function map_range() in xen/drivers/vpci/header. The function has the following todo for Arm:
>>
>> /*
>> * ARM TODOs:
>> * - On ARM whether the memory is prefetchable or not should be passed
>> * to map_mmio_regions in order to decide which memory attributes
>> * should be used.
>> *
>> * - {un}map_mmio_regions doesn't support preemption.
>> */
>>
>> This doesn't seem to be addressed in the two series for PCI passthrough sent so far. Do you have any plan to handle it?
>
> No plan yet.
>
> All the mappings are happening with p2m_mmio_direct_dev as of now.
So this addresses the first TODO, but how about the second one? It
refers to the lack of preemption on Arm, but in this patch you suggest
there is some and hence we need to call vpci_process_pending().
For a tech preview, the lack of preemption would be OK. However, the
commit message should be updated to make clear there is no such
preemption yet, or avoid mentioning it.
>
>>
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>> ---
>>> xen/arch/arm/traps.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>>> index 219ab3c3fbde..1571fb8afd03 100644
>>> --- a/xen/arch/arm/traps.c
>>> +++ b/xen/arch/arm/traps.c
>>> @@ -34,6 +34,7 @@
>>> #include <xen/symbols.h>
>>> #include <xen/version.h>
>>> #include <xen/virtual_region.h>
>>> +#include <xen/vpci.h>
>>> #include <public/sched.h>
>>> #include <public/xen.h>
>>> @@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
>>> }
>>> #endif
>>> + local_irq_enable();
>>> + if ( has_vpci(v->domain) && vpci_process_pending(v) )
>>
>> Looking at the code of vpci_process_pending(), it looks like there are some rework to do for guest. Do you plan to handle it as part of the vPCI series?
> Yes, vPCI code is heavily touched to support guest non-identity mappings
I wasn't referring to the non-identity mappings here. I was referring to
TODOs such as:
/*
* FIXME: in case of failure remove the device from the domain.
* Note that there might still be leftover mappings. While
this is
* safe for Dom0, for DomUs the domain will likely need to be
* killed in order to avoid leaking stale p2m mappings on
* failure.
*/
You still have them after the series reworking vPCI. As for the
preemption, it is OK to ignore it for a tech preview, although we want
to at least track them.
>>
>>> + raise_softirq(SCHEDULE_SOFTIRQ);
>>> + local_irq_disable();
>>> +
>>
>> From my understanding of vcpi_process_pending(). The function will return true if there are more work to schedule.
> Yes
>> However, if check_for_vcpu_for_work() return false, then we will return to the guest before any work for vCPI has finished. This is because check_for_vcpu_work() will not be called again.
> Correct
>>
>> In this case, I think you want to return as soon as you know we need to reschedule.
> Not sure I understand this
The return value of check_for_vcpu_work() indicates whether we have
more work to do before returning to the guest.
When vpci_process_pending() returns true, it tells us we need to call
the function at least one more time before returning to the guest.
In your current implementation, you leave that decision to whoever is
next in the function.
It is not safe to return to the guest as long as vpci_process_pending()
returns true. So you want to write something like:
if ( vpci_process_pending() )
return true;
>>
>> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
>
> Which may not be enabled as it depends on CONFIG_IOREQ_SERVER.
Right. My point is that when CONFIG_IOREQ_SERVER is set you would end
up calling vpci_process_pending() twice. This will have an impact on
how long your vCPU is going to run, because you are doubling the work.
>
> My understanding is that for x86 it is always enabled, but this might not be the case for Arm
>
>> So we would end up to call twice vpci_process_pending().
> So, if CONFIG_IOREQ_SERVER is not enabled then we end up with only calling it from traps.c on Arm
>> Maybe we should move the call from the IOREQ to arch-code.
>
> Hm. I would better think of moving it from IOREQ to some other common code: for x86 (if
>
> my understanding correct about CONFIG_IOREQ_SERVER) it is by coincidence that we call vPCI
>
> code from there and IOREQ is always enabled.
I am not aware of another suitable common helper that would be called
on the return-to-guest path. Hence why I suggest possibly duplicating
the code in each arch path.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-06 8:48 ` Julien Grall
@ 2021-09-06 9:14 ` Oleksandr Andrushchenko
2021-09-06 9:53 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-06 9:14 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 06.09.21 11:48, Julien Grall wrote:
> On 06/09/2021 08:02, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>
> Hi Oleksandr,
>
>> On 03.09.21 12:04, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> vPCI may map and unmap PCI device memory (BARs) being passed through which
>>>> may take a lot of time. For this those operations may be deferred to be
>>>> performed later, so that they can be safely preempted.
>>>> Run the corresponding vPCI code while switching a vCPU.
>>>
>>> IIUC, you are talking about the function map_range() in xen/drivers/vpci/header. The function has the following todo for Arm:
>>>
>>> /*
>>> * ARM TODOs:
>>> * - On ARM whether the memory is prefetchable or not should be passed
>>> * to map_mmio_regions in order to decide which memory attributes
>>> * should be used.
>>> *
>>> * - {un}map_mmio_regions doesn't support preemption.
>>> */
>>>
>>> This doesn't seem to be addressed in the two series for PCI passthrough sent so far. Do you have any plan to handle it?
>>
>> No plan yet.
>>
>> All the mappings are happening with p2m_mmio_direct_dev as of now.
>
> So this address the first TODO but how about the second TODO? It refers to the lack of preemption on Arm but in this patch you suggest there are some and hence we need to call vpci_process_pending().
>
> For a tech preview, the lack of preemption would be OK. However, the commit message should be updated to make clear there are no such preemption yet or avoid to mention it.
Well, the comment was not added by me (by Roger, I guess), I just kept it.
As to the preemption, both map and unmap happen via vpci_process_pending(), so
what is true for map is also true for unmap in this respect.
>
>>
>>>
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>> ---
>>>> xen/arch/arm/traps.c | 6 ++++++
>>>> 1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>>>> index 219ab3c3fbde..1571fb8afd03 100644
>>>> --- a/xen/arch/arm/traps.c
>>>> +++ b/xen/arch/arm/traps.c
>>>> @@ -34,6 +34,7 @@
>>>> #include <xen/symbols.h>
>>>> #include <xen/version.h>
>>>> #include <xen/virtual_region.h>
>>>> +#include <xen/vpci.h>
>>>> #include <public/sched.h>
>>>> #include <public/xen.h>
>>>> @@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
>>>> }
>>>> #endif
>>>> + local_irq_enable();
>>>> + if ( has_vpci(v->domain) && vpci_process_pending(v) )
>>>
>>> Looking at the code of vpci_process_pending(), it looks like there are some rework to do for guest. Do you plan to handle it as part of the vPCI series?
>> Yes, vPCI code is heavily touched to support guest non-identity mappings
>
> I wasn't referring to the non-identity mappings here. I was referring to TODOs such as:
>
> /*
> * FIXME: in case of failure remove the device from the domain.
> * Note that there might still be leftover mappings. While this is
> * safe for Dom0, for DomUs the domain will likely need to be
> * killed in order to avoid leaking stale p2m mappings on
> * failure.
> */
>
> You still have them after the series reworking the vPCI. As for the preemption this is OK to ignore it for a tech preview. Although, we want to at least track them.
Please see above: both map and unmap are happening via vpci_process_pending
>
>>>
>>>> + raise_softirq(SCHEDULE_SOFTIRQ);
>>>> + local_irq_disable();
>>>> +
>>>
>>> From my understanding of vcpi_process_pending(). The function will return true if there are more work to schedule.
>> Yes
>>> However, if check_for_vcpu_for_work() return false, then we will return to the guest before any work for vCPI has finished. This is because check_for_vcpu_work() will not be called again.
>> Correct
>>>
>>> In this case, I think you want to return as soon as you know we need to reschedule.
>> Not sure I understand this
>
I was more referring to "I think you want to return as soon as you know we need to reschedule."
> The return value of check_for_vcpu_for_work() indicates whether we have more work to do before returning to return to the guest.
>
> When vpci_process_pending() returns true, it tells us we need to call the function at least one more time before returning to the guest.
>
> In your current implementation, you leave that decision to whoeever is next in the function.
>
> It is not safe to return to the guest as long as vpci_process_pending() returns true. So you want to write something like:
>
> if ( vpci_process_pending() )
> return true;
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2291,6 +2291,9 @@ static bool check_for_vcpu_work(void)
{
struct vcpu *v = current;
+ if ( vpci_process_pending() )
+ return true;
+
#ifdef CONFIG_IOREQ_SERVER
if ( domain_has_ioreq_server(v->domain) )
{
Do you mean something like this?
>
>>>
>>> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
>>
>> Which may not be enabled as it depends on CONFIG_IOREQ_SERVER.
>
> Right. My point is when CONFIG_IOREQ_SERVER is set then you would end up to call twice vpci_process_pending(). This will have an impact how on long your vCPU is going to running because you are doubling the work.
So, you suggest that we have in the common IOREQ code something called, say,
arch_vpci_process_pending? In the case of x86 it would have the code currently
found in the common IOREQ sources, and for Arm it would be a nop?
Any better suggestion for the name?
>
>>
>> My understanding is that for x86 it is always enabled, but this might not be the case for Arm
>>
>>> So we would end up to call twice vpci_process_pending().
>> So, if CONFIG_IOREQ_SERVER is not enabled then we end up with only calling it from traps.c on Arm
>>> Maybe we should move the call from the IOREQ to arch-code.
>>
>> Hm. I would better think of moving it from IOREQ to some other common code: for x86 (if
>>
>> my understanding correct about CONFIG_IOREQ_SERVER) it is by coincidence that we call vPCI
>>
>> code from there and IOREQ is always enabled.
>
> I am not aware of another suitable common helper that would be called on the return to the guest path. Hence why I suggest to possibly duplicated the code in each arch path.
I see
>
> Cheers,
>
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-06 9:14 ` Oleksandr Andrushchenko
@ 2021-09-06 9:53 ` Julien Grall
2021-09-06 10:06 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-06 9:53 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi Oleksandr,
On 06/09/2021 10:14, Oleksandr Andrushchenko wrote:
>
> On 06.09.21 11:48, Julien Grall wrote:
>> On 06/09/2021 08:02, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>
>> Hi Oleksandr,
>>
>>> On 03.09.21 12:04, Julien Grall wrote:
>>>> Hi Oleksandr,
>>>>
>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> vPCI may map and unmap PCI device memory (BARs) being passed through which
>>>>> may take a lot of time. For this those operations may be deferred to be
>>>>> performed later, so that they can be safely preempted.
>>>>> Run the corresponding vPCI code while switching a vCPU.
>>>>
>>>> IIUC, you are talking about the function map_range() in xen/drivers/vpci/header. The function has the following todo for Arm:
>>>>
>>>> /*
>>>> * ARM TODOs:
>>>> * - On ARM whether the memory is prefetchable or not should be passed
>>>> * to map_mmio_regions in order to decide which memory attributes
>>>> * should be used.
>>>> *
>>>> * - {un}map_mmio_regions doesn't support preemption.
>>>> */
>>>>
>>>> This doesn't seem to be addressed in the two series for PCI passthrough sent so far. Do you have any plan to handle it?
>>>
>>> No plan yet.
>>>
>>> All the mappings are happening with p2m_mmio_direct_dev as of now.
>>
>> So this address the first TODO but how about the second TODO? It refers to the lack of preemption on Arm but in this patch you suggest there are some and hence we need to call vpci_process_pending().
>>
>> For a tech preview, the lack of preemption would be OK. However, the commit message should be updated to make clear there are no such preemption yet or avoid to mention it.
>
> Well, the comment was not added by me (by Roger I guess), I just keep it.
I don't think it matters to know who added it. What matters is when
those comments are going to be handled. If they are already handled,
then they should be dropped.
If they are not, the two TODOs listed above are probably OK to defer, as
you only plan a tech preview. But they would need to be handled before
vPCI is selected by default and used in production.
Note that I specifically wrote "the two TODOs listed above" because I
haven't looked at the other TODOs/FIXMEs and figured out whether they
are fine to defer.
>
> As to the preemption both map and unmap are happening via vpci_process_pending, so
Right... this doesn't mean preemption is actually supported on Arm.
vpci_process_pending() doesn't do the preemption itself; it relies on
map_range() to do it.
But even map_range() relies on the arch-specific helpers
{,un}map_mmio_regions() to do it. If you look at the x86 implementation,
they add at most MAX_MMIO_MAX_ITER entries per call. On Arm, there is
no such limit. Therefore the function will always do the full
{,un}mapping before returning. IOW, no preemption is supported.
>
> what is true for map is also true for unmap with this respect
>
>>
>>>
>>>>
>>>>>
>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>> ---
>>>>> xen/arch/arm/traps.c | 6 ++++++
>>>>> 1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>>>>> index 219ab3c3fbde..1571fb8afd03 100644
>>>>> --- a/xen/arch/arm/traps.c
>>>>> +++ b/xen/arch/arm/traps.c
>>>>> @@ -34,6 +34,7 @@
>>>>> #include <xen/symbols.h>
>>>>> #include <xen/version.h>
>>>>> #include <xen/virtual_region.h>
>>>>> +#include <xen/vpci.h>
>>>>> #include <public/sched.h>
>>>>> #include <public/xen.h>
>>>>> @@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
>>>>> }
>>>>> #endif
>>>>> + local_irq_enable();
>>>>> + if ( has_vpci(v->domain) && vpci_process_pending(v) )
>>>>
>>>> Looking at the code of vpci_process_pending(), it looks like there are some rework to do for guest. Do you plan to handle it as part of the vPCI series?
>>> Yes, vPCI code is heavily touched to support guest non-identity mappings
>>
>> I wasn't referring to the non-identity mappings here. I was referring to TODOs such as:
>>
>> /*
>> * FIXME: in case of failure remove the device from the domain.
>> * Note that there might still be leftover mappings. While this is
>> * safe for Dom0, for DomUs the domain will likely need to be
>> * killed in order to avoid leaking stale p2m mappings on
>> * failure.
>> */
>>
>> You still have them after the series reworking the vPCI. As for the preemption this is OK to ignore it for a tech preview. Although, we want to at least track them.
> Please see above: both map and unmap are happening via vpci_process_pending
I am not sure how this is relevant to what I just mentioned.
>>
>>>>
>>>>> + raise_softirq(SCHEDULE_SOFTIRQ);
>>>>> + local_irq_disable();
>>>>> +
>>>>
>>>> From my understanding of vcpi_process_pending(). The function will return true if there are more work to schedule.
>>> Yes
>>>> However, if check_for_vcpu_for_work() return false, then we will return to the guest before any work for vCPI has finished. This is because check_for_vcpu_work() will not be called again.
>>> Correct
>>>>
>>>> In this case, I think you want to return as soon as you know we need to reschedule.
>>> Not sure I understand this
>>
> I was more referring to "I think you want to return as soon as you know we need to reschedule."
>> The return value of check_for_vcpu_for_work() indicates whether we have more work to do before returning to return to the guest.
>>
>> When vpci_process_pending() returns true, it tells us we need to call the function at least one more time before returning to the guest.
>>
>> In your current implementation, you leave that decision to whoever is next in the function.
>>
>> It is not safe to return to the guest as long as vpci_process_pending() returns true. So you want to write something like:
>>
>> if ( vpci_process_pending() )
>> return true;
> --- a/xen/arch/arm/traps.c
>
> +++ b/xen/arch/arm/traps.c
> @@ -2291,6 +2291,9 @@ static bool check_for_vcpu_work(void)
> {
> struct vcpu *v = current;
>
> + if ( vpci_process_pending() )
> + return true;
> +
> #ifdef CONFIG_IOREQ_SERVER
> if ( domain_has_ioreq_server(v->domain) )
> {
> Do you mean something like this?
Yes.
>>>> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
>>>
>>> Which may not be enabled as it depends on CONFIG_IOREQ_SERVER.
>>
>> Right. My point is when CONFIG_IOREQ_SERVER is set then you would end up calling vpci_process_pending() twice. This will have an impact on how long your vCPU is going to run because you are doubling the work.
>
> So, you suggest that we have in the common IOREQ code something called
>
> arch_vpci_process_pending? In the case of x86 it would have the code currently found in the
>
> common IOREQ sources, and for Arm it would be a nop?
No I am suggesting to move the call of the IOREQ code to hvm_do_resume()
(on x86) and check_for_vcpu_work() (on Arm).
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-06 9:53 ` Julien Grall
@ 2021-09-06 10:06 ` Oleksandr Andrushchenko
2021-09-06 10:38 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-06 10:06 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 06.09.21 12:53, Julien Grall wrote:
> Hi Oleksandr,
>
> On 06/09/2021 10:14, Oleksandr Andrushchenko wrote:
>>
>> On 06.09.21 11:48, Julien Grall wrote:
>>> On 06/09/2021 08:02, Oleksandr Andrushchenko wrote:
>>>> Hi, Julien!
>>>
>>> Hi Oleksandr,
>>>
>>>> On 03.09.21 12:04, Julien Grall wrote:
>>>>> Hi Oleksandr,
>>>>>
>>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>
>>>>>> vPCI may map and unmap PCI device memory (BARs) being passed through, which
>>>>>> may take a lot of time. For this reason those operations may be deferred to be
>>>>>> performed later, so that they can be safely preempted.
>>>>>> Run the corresponding vPCI code while switching a vCPU.
>>>>>
>>>>> IIUC, you are talking about the function map_range() in xen/drivers/vpci/header. The function has the following todo for Arm:
>>>>>
>>>>> /*
>>>>> * ARM TODOs:
>>>>> * - On ARM whether the memory is prefetchable or not should be passed
>>>>> * to map_mmio_regions in order to decide which memory attributes
>>>>> * should be used.
>>>>> *
>>>>> * - {un}map_mmio_regions doesn't support preemption.
>>>>> */
>>>>>
>>>>> This doesn't seem to be addressed in the two series for PCI passthrough sent so far. Do you have any plan to handle it?
>>>>
>>>> No plan yet.
>>>>
>>>> All the mappings are happening with p2m_mmio_direct_dev as of now.
>>>
>>> So this addresses the first TODO, but what about the second one? It refers to the lack of preemption on Arm, but in this patch you suggest there is some and hence we need to call vpci_process_pending().
>>>
>>> For a tech preview, the lack of preemption would be OK. However, the commit message should be updated to make clear there is no such preemption yet, or avoid mentioning it.
>>
>> Well, the comment was not added by me (by Roger I guess), I just keep it.
>
> I don't think it matters to know who added it. What matters is when those comments are going to be handled. If they are already handled, then they should be dropped.
>
> If they are not, the two TODOs listed above are probably OK to defer as you only plan a tech preview. But they would need to be handled before vPCI is selected by default and used in production.
>
> Note that I specifically wrote "the two TODOs listed above" because I haven't looked at the other TODOs/FIXMEs and figured out whether they are fine to defer.
Ok, then I leave the TODOs as they are
>
>>
>> As to the preemption both map and unmap are happening via vpci_process_pending, so
>
> Right... this doesn't mean preemption is actually supported on Arm. vpci_process_pending() doesn't do the preemption itself. It relies on map_range() to do it.
>
> But even map_range() relies on the arch-specific helper {,un}map_mmio_regions() to do it. If you look at the x86 implementation, it adds at most MAX_MMIO_MAX_ITER entries per call. On Arm, there is no such limit. Therefore the function will always do the full {,un}mapping before returning. IOW, no preemption is supported.
Ok
>
>>
>> what is true for map is also true for unmap with this respect
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>> ---
>>>>>> xen/arch/arm/traps.c | 6 ++++++
>>>>>> 1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>>>>>> index 219ab3c3fbde..1571fb8afd03 100644
>>>>>> --- a/xen/arch/arm/traps.c
>>>>>> +++ b/xen/arch/arm/traps.c
>>>>>> @@ -34,6 +34,7 @@
>>>>>> #include <xen/symbols.h>
>>>>>> #include <xen/version.h>
>>>>>> #include <xen/virtual_region.h>
>>>>>> +#include <xen/vpci.h>
>>>>>> #include <public/sched.h>
>>>>>> #include <public/xen.h>
>>>>>> @@ -2304,6 +2305,11 @@ static bool check_for_vcpu_work(void)
>>>>>> }
>>>>>> #endif
>>>>>> + local_irq_enable();
>>>>>> + if ( has_vpci(v->domain) && vpci_process_pending(v) )
>>>>>
>>>>> Looking at the code of vpci_process_pending(), it looks like there is some rework to do for guests. Do you plan to handle it as part of the vPCI series?
>>>> Yes, vPCI code is heavily touched to support guest non-identity mappings
>>>
>>> I wasn't referring to the non-identity mappings here. I was referring to TODOs such as:
>>>
>>> /*
>>> * FIXME: in case of failure remove the device from the domain.
>>> * Note that there might still be leftover mappings. While this is
>>> * safe for Dom0, for DomUs the domain will likely need to be
>>> * killed in order to avoid leaking stale p2m mappings on
>>> * failure.
>>> */
>>>
>>> You still have them after the series reworking the vPCI. As for the preemption, this is OK to ignore for a tech preview, although we want to at least track them.
>> Please see above: both map and unmap are happening via vpci_process_pending
>
> I am not sure how this is relevant to what I just mentioned.
>
>>>
>>>>>
>>>>>> + raise_softirq(SCHEDULE_SOFTIRQ);
>>>>>> + local_irq_disable();
>>>>>> +
>>>>>
>>>>> From my understanding of vpci_process_pending(), the function will return true if there is more work to schedule.
>>>> Yes
>>>>> However, if check_for_vcpu_work() returns false, then we will return to the guest before any work for vPCI has finished. This is because check_for_vcpu_work() will not be called again.
>>>> Correct
>>>>>
>>>>> In this case, I think you want to return as soon as you know we need to reschedule.
>>>> Not sure I understand this
>>>
>> I was more referring to "I think you want to return as soon as you know we need to reschedule."
>>>> The return value of check_for_vcpu_work() indicates whether we have more work to do before returning to the guest.
>>>
>>> When vpci_process_pending() returns true, it tells us we need to call the function at least one more time before returning to the guest.
>>>
>>>> In your current implementation, you leave that decision to whoever is next in the function.
>>>
>>> It is not safe to return to the guest as long as vpci_process_pending() returns true. So you want to write something like:
>>>
>>> if ( vpci_process_pending() )
>>> return true;
>> --- a/xen/arch/arm/traps.c
>>
>> +++ b/xen/arch/arm/traps.c
>> @@ -2291,6 +2291,9 @@ static bool check_for_vcpu_work(void)
>> {
>> struct vcpu *v = current;
>>
>> + if ( vpci_process_pending() )
>> + return true;
>> +
>> #ifdef CONFIG_IOREQ_SERVER
>> if ( domain_has_ioreq_server(v->domain) )
>> {
>> Do you mean something like this?
>
> Yes.
Ok, I'll add this check
>
>>>>> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
>>>>
>>>> Which may not be enabled as it depends on CONFIG_IOREQ_SERVER.
>>>
>>> Right. My point is when CONFIG_IOREQ_SERVER is set then you would end up calling vpci_process_pending() twice. This will have an impact on how long your vCPU is going to run because you are doubling the work.
>>
>> So, you suggest that we have in the common IOREQ code something called
>>
>> arch_vpci_process_pending? In the case of x86 it would have the code currently found in the
>>
>> common IOREQ sources, and for Arm it would be a nop?
>
> No I am suggesting to move the call of the IOREQ code to hvm_do_resume() (on x86) and check_for_vcpu_work() (on Arm).
Ok, I can move vPCI code to hvm_do_resume, but vPCI is only used for x86 PVH Dom0.
Do you still think hvm_do_resume is the right place?
>
> Cheers,
>
Thanks,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-06 10:06 ` Oleksandr Andrushchenko
@ 2021-09-06 10:38 ` Julien Grall
2021-09-07 6:34 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-06 10:38 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi Oleksandr,
On 06/09/2021 11:06, Oleksandr Andrushchenko wrote:
> On 06.09.21 12:53, Julien Grall wrote:
>>>>>> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
>>>>>
>>>>> Which may not be enabled as it depends on CONFIG_IOREQ_SERVER.
>>>>
>>>> Right. My point is when CONFIG_IOREQ_SERVER is set then you would end up calling vpci_process_pending() twice. This will have an impact on how long your vCPU is going to run because you are doubling the work.
>>>
>>> So, you suggest that we have in the common IOREQ code something called
>>>
>>> arch_vpci_process_pending? In the case of x86 it would have the code currently found in the
>>>
>>> common IOREQ sources, and for Arm it would be a nop?
>>
>> No I am suggesting to move the call of the IOREQ code to hvm_do_resume() (on x86) and check_for_vcpu_work() (on Arm).
>
> Ok, I can move vPCI code to hvm_do_resume, but vPCI is only used for x86 PVH Dom0.
AFAIK, Roger is planning to use it for x86 PVH guests.
>
> Do you still think hvm_do_resume is the right place?
I think so. AFAICT, on x86, the only caller of
vcpu_ioreq_handle_completion() is hvm_do_resume(). So it makes sense to
push one layer up.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations
2021-09-06 10:38 ` Julien Grall
@ 2021-09-07 6:34 ` Oleksandr Andrushchenko
0 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-07 6:34 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 06.09.21 13:38, Julien Grall wrote:
> Hi Oleksandr,
>
> On 06/09/2021 11:06, Oleksandr Andrushchenko wrote:
>> On 06.09.21 12:53, Julien Grall wrote:
>>>>>>> However, looking at the rest of the code, we already have a check for vpci in the common IOREQ code.
>>>>>>
>>>>>> Which may not be enabled as it depends on CONFIG_IOREQ_SERVER.
>>>>>
>>>>> Right. My point is when CONFIG_IOREQ_SERVER is set then you would end up calling vpci_process_pending() twice. This will have an impact on how long your vCPU is going to run because you are doubling the work.
>>>>
>>>> So, you suggest that we have in the common IOREQ code something called
>>>>
>>>> arch_vpci_process_pending? In the case of x86 it would have the code currently found in the
>>>>
>>>> common IOREQ sources, and for Arm it would be a nop?
>>>
>>> No I am suggesting to move the call of the IOREQ code to hvm_do_resume() (on x86) and check_for_vcpu_work() (on Arm).
>>
>> Ok, I can move vPCI code to hvm_do_resume, but vPCI is only used for x86 PVH Dom0.
>
> AFAIK, Roger is planning to use it for x86 PVH guests.
>
>>
>> Do you still think hvm_do_resume is the right place?
> I think so. AFAICT, on x86, the only caller of vcpu_ioreq_handle_completion() is hvm_do_resume(). So it makes sense to push one layer up.
Ok, so I ended up with:
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2305,10 +2305,17 @@ static bool check_for_vcpu_work(void)
}
#endif
- local_irq_enable();
- if ( has_vpci(v->domain) && vpci_process_pending(v) )
- raise_softirq(SCHEDULE_SOFTIRQ);
- local_irq_disable();
+ if ( has_vpci(v->domain) )
+ {
+ bool pending;
+
+ local_irq_enable();
+ pending = vpci_process_pending(v);
+ local_irq_disable();
+
+ if ( pending )
+ return true;
+ }
This is how it is done for IOREQ. It seems there is no need to raise softirq.
I also moved vPCI from common code to hvm_do_resume for x86
>
> Cheers,
>
Thanks,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 01/11] xen/arm: Add new device type for PCI
2021-09-03 8:33 ` [PATCH 01/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
@ 2021-09-09 17:19 ` Julien Grall
2021-09-10 7:40 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-09 17:19 UTC (permalink / raw)
To: Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
Hi Oleksandr,
On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Add new device type (DEV_PCI) to distinguish PCI devices from platform
> DT devices, so some drivers, like IOMMU, can handle PCI devices
> differently.
I think it would be better to fold this change in the next patch as this
is where you add all the helpers for converting dev to PCI.
>
> While at it fix dev_is_dt macro.
I would keep this change separate. It also needs an explanation of
what the problem is, and should mention it is a latent bug because
no-one uses it (so we know this doesn't require a backport).
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-03 8:33 ` [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
@ 2021-09-09 17:43 ` Julien Grall
2021-09-10 11:43 ` Oleksandr Andrushchenko
2021-09-10 20:12 ` Stefano Stabellini
1 sibling, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-09 17:43 UTC (permalink / raw)
To: Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
Hi Oleksandr,
On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> In order for vPCI to work, it needs all accesses to the PCI configuration space
> (ECAM) to be synchronized among all entities, e.g. the hardware domain and
> guests.
I am not entirely sure what you mean by "synchronized" here. Are you
refer to the content of the configuration space?
> For that implement PCI host bridge specific callbacks to
> properly setup those ranges depending on particular host bridge
> implementation.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> xen/arch/arm/pci/ecam.c | 11 +++++++++++
> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
> xen/arch/arm/vpci.c | 13 +++++++++++++
> xen/include/asm-arm/pci.h | 8 ++++++++
> 4 files changed, 48 insertions(+)
>
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 91c691b41fdf..92ecb2e0762b 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
> }
>
> +static int pci_ecam_register_mmio_handler(struct domain *d,
> + struct pci_host_bridge *bridge,
> + const struct mmio_handler_ops *ops)
> +{
> + struct pci_config_window *cfg = bridge->sysdata;
> +
> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
We have a fixed array for handling the MMIO handlers. So you need to
make sure we have enough space in it to store one handler per host bridge.
This is quite similar to the problem we had with the re-distributor on
GICv3. Have a look there to see how we dealt with it.
> + return 0;
> +}
> +
> /* ECAM ops */
> const struct pci_ecam_ops pci_generic_ecam_ops = {
> .bus_shift = 20,
> @@ -49,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
> .map_bus = pci_ecam_map_bus,
> .read = pci_generic_config_read,
> .write = pci_generic_config_write,
> + .register_mmio_handler = pci_ecam_register_mmio_handler,
> }
> };
>
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index d2fef5476b8e..a89112bfbb7c 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -318,6 +318,22 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
> }
> return bridge->dt_node;
> }
> +
> +int pci_host_iterate_bridges(struct domain *d,
> + int (*clb)(struct domain *d,
NIT: We tend to name callback variables 'cb'.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-03 8:33 ` [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m Oleksandr Andrushchenko
@ 2021-09-09 17:58 ` Julien Grall
2021-09-10 12:37 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-09 17:58 UTC (permalink / raw)
To: Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, oleksandr_tyshchenko, volodymyr_babchuk,
Artem_Mygaiev, roger.pau, bertrand.marquis, rahul.singh,
Oleksandr Andrushchenko
Hi Oleksandr,
On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Host bridge controller's ECAM space is mapped into Domain-0's p2m,
> thus it is not possible to trap the same for vPCI via MMIO handlers.
> For this to work we need to not map those while constructing the domain.
>
> Note, that during Domain-0 creation there is no pci_dev yet allocated for
> host bridges, thus we cannot match PCI host and its associated
> bridge by SBDF. Use dt_device_node field for checks instead.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> xen/arch/arm/domain_build.c | 3 +++
> xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
> xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
> xen/include/asm-arm/pci.h | 12 ++++++++++++
> 4 files changed, 54 insertions(+)
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index da427f399711..76f5b513280c 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
> }
> }
>
> + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
> + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
AFAICT, with device_get_class(dev), you know whether the hostbridge is
used by Xen. Therefore, I would expect that we don't want to map all the
regions of the hostbridges in dom0 (including the BARs).
Can you clarify it?
> +
> if ( need_mapping )
> {
> res = map_regions_p2mt(d,
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 92ecb2e0762b..d32efb7fcbd0 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -52,6 +52,22 @@ static int pci_ecam_register_mmio_handler(struct domain *d,
> return 0;
> }
>
> +static int pci_ecam_need_p2m_mapping(struct domain *d,
> + struct pci_host_bridge *bridge,
> + uint64_t addr, uint64_t len)
> +{
> + struct pci_config_window *cfg = bridge->sysdata;
> +
> + if ( !is_hardware_domain(d) )
> + return true;
I am a bit puzzled with this check. If the ECAM has been initialized by
Xen, then I believe we cannot expose it to any domain at all.
> +
> + /*
> + * We do not want ECAM address space to be mapped in domain's p2m,
> + * so we can trap access to it.
> + */
> + return cfg->phys_addr != addr;
> +}
> +
> /* ECAM ops */
> const struct pci_ecam_ops pci_generic_ecam_ops = {
> .bus_shift = 20,
> @@ -60,6 +76,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
> .read = pci_generic_config_read,
> .write = pci_generic_config_write,
> .register_mmio_handler = pci_ecam_register_mmio_handler,
> + .need_p2m_mapping = pci_ecam_need_p2m_mapping,
> }
> };
>
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index a89112bfbb7c..c04be636452d 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -334,6 +334,28 @@ int pci_host_iterate_bridges(struct domain *d,
> }
> return 0;
> }
> +
> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
> + const struct dt_device_node *node,
> + uint64_t addr, uint64_t len)
> +{
> + struct pci_host_bridge *bridge;
> +
> + list_for_each_entry( bridge, &pci_host_bridges, node )
> + {
> + if ( bridge->dt_node != node )
> + continue;
> +
> + if ( !bridge->ops->need_p2m_mapping )
> + return true;
> +
> + return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
> + }
> + printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
> + node->full_name, bridge->segment, addr);
> + return true;
> +}
If you really need to map the hostbridges, then I would suggest to defer
> the P2M mappings for all of them and then walk all the bridges in one go
to do the mappings.
This would avoid going through the bridges every time during setup.
> +
> /*
> * Local variables:
> * mode: C
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 2c7c7649e00f..9c28a4bdc4b7 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -82,6 +82,8 @@ struct pci_ops {
> int (*register_mmio_handler)(struct domain *d,
> struct pci_host_bridge *bridge,
> const struct mmio_handler_ops *ops);
> + int (*need_p2m_mapping)(struct domain *d, struct pci_host_bridge *bridge,
> + uint64_t addr, uint64_t len);
> };
>
> /*
> @@ -115,9 +117,19 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
> int pci_host_iterate_bridges(struct domain *d,
> int (*clb)(struct domain *d,
> struct pci_host_bridge *bridge));
> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
> + const struct dt_device_node *node,
> + uint64_t addr, uint64_t len);
> #else /*!CONFIG_HAS_PCI*/
>
> struct arch_pci_dev { };
>
> +static inline bool
> +pci_host_bridge_need_p2m_mapping(struct domain *d,
> + const struct dt_device_node *node,
> + uint64_t addr, uint64_t len)
> +{
> + return true;
> +}
> #endif /*!CONFIG_HAS_PCI*/
> #endif /* __ARM_PCI_H__ */
>
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 01/11] xen/arm: Add new device type for PCI
2021-09-09 17:19 ` Julien Grall
@ 2021-09-10 7:40 ` Oleksandr Andrushchenko
0 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 7:40 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi, Julien!
On 09.09.21 20:19, Julien Grall wrote:
> Hi Oleksandr,
>
> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> Add new device type (DEV_PCI) to distinguish PCI devices from platform
>> DT devices, so some drivers, like IOMMU, can handle PCI devices
>> differently.
>
> I think it would be better to fold this change in the next patch as this is where you add all the helpers for converting dev to PCI.
Ok, I will
>
>>
>> While at it fix dev_is_dt macro.
>
> I would keep this change separate. It also needs an explanation of what the problem is, and should mention it is a latent bug because no-one uses it (so we know this doesn't require a backport).
Sure, will create a dedicated patch for that
>
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-09 17:43 ` Julien Grall
@ 2021-09-10 11:43 ` Oleksandr Andrushchenko
2021-09-10 13:04 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 11:43 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi, Julien!
On 09.09.21 20:43, Julien Grall wrote:
> Hi Oleksandr,
>
> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> In order for vPCI to work, it needs all accesses to the PCI configuration space
>> (ECAM) to be synchronized among all entities, e.g. the hardware domain and
>> guests.
>
> I am not entirely sure what you mean by "synchronized" here. Are you refer to the content of the configuration space?
We maintain hwdom's and the guests' view of the registers we are interested in.
So, to have hwdom's view we also need to trap its access to the configuration space.
Probably "synchronized" is not the right wording here.
>
>> For that implement PCI host bridge specific callbacks to
>> properly setup those ranges depending on particular host bridge
>> implementation.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> ---
>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>> xen/arch/arm/vpci.c | 13 +++++++++++++
>> xen/include/asm-arm/pci.h | 8 ++++++++
>> 4 files changed, 48 insertions(+)
>>
>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>> index 91c691b41fdf..92ecb2e0762b 100644
>> --- a/xen/arch/arm/pci/ecam.c
>> +++ b/xen/arch/arm/pci/ecam.c
>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>> }
>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>> + struct pci_host_bridge *bridge,
>> + const struct mmio_handler_ops *ops)
>> +{
>> + struct pci_config_window *cfg = bridge->sysdata;
>> +
>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>
> We have a fixed array for handling the MMIO handlers.
Didn't know that:
xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
> So you need to make sure we have enough space in it to store one handler per host bridge.
>
> This is quite similar to the problem we had with the re-distributor on GICv3. Have a look there to see how we dealt with it.
Could you please point me to that solution? I can only see
/* Register mmio handle for the Distributor */
register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
SZ_64K, NULL);
/*
* Register mmio handler per contiguous region occupied by the
* redistributors. The handler will take care to choose which
* redistributor is targeted.
*/
for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
{
struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
register_mmio_handler(d, &vgic_rdistr_mmio_handler,
region->base, region->size, region);
}
which IMO doesn't care about the number of MMIOs we can handle
>
>> + return 0;
>> +}
>> +
>> /* ECAM ops */
>> const struct pci_ecam_ops pci_generic_ecam_ops = {
>> .bus_shift = 20,
>> @@ -49,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>> .map_bus = pci_ecam_map_bus,
>> .read = pci_generic_config_read,
>> .write = pci_generic_config_write,
>> + .register_mmio_handler = pci_ecam_register_mmio_handler,
>> }
>> };
>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index d2fef5476b8e..a89112bfbb7c 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -318,6 +318,22 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
>> }
>> return bridge->dt_node;
>> }
>> +
>> +int pci_host_iterate_bridges(struct domain *d,
>> + int (*clb)(struct domain *d,
>
> NIT: We tend to name callback variables 'cb'.
Sure
>
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-09 17:58 ` Julien Grall
@ 2021-09-10 12:37 ` Oleksandr Andrushchenko
2021-09-10 13:18 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 12:37 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi, Julien!
On 09.09.21 20:58, Julien Grall wrote:
> Hi Oleksandr,
>
> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> Host bridge controller's ECAM space is mapped into Domain-0's p2m,
>> thus it is not possible to trap the same for vPCI via MMIO handlers.
>> For this to work we need to not map those while constructing the domain.
>>
>> Note, that during Domain-0 creation there is no pci_dev yet allocated for
>> host bridges, thus we cannot match PCI host and its associated
>> bridge by SBDF. Use dt_device_node field for checks instead.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> ---
>> xen/arch/arm/domain_build.c | 3 +++
>> xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
>> xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
>> xen/include/asm-arm/pci.h | 12 ++++++++++++
>> 4 files changed, 54 insertions(+)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index da427f399711..76f5b513280c 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>> }
>> }
>> + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
>> + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
>
> AFAICT, with device_get_class(dev), you know whether the hostbridge is used by Xen. Therefore, I would expect that we don't want to map all the regions of the hostbridges in dom0 (including the BARs).
>
> Can you clarify it?
We only want to trap ECAM, not MMIOs or any other memory regions, as the bridge is
initialized and used by Domain-0 entirely. But at the same time we want to trap all
configuration space accesses, so we can see what is happening to them.
>
>> +
>> if ( need_mapping )
>> {
>> res = map_regions_p2mt(d,
>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>> index 92ecb2e0762b..d32efb7fcbd0 100644
>> --- a/xen/arch/arm/pci/ecam.c
>> +++ b/xen/arch/arm/pci/ecam.c
>> @@ -52,6 +52,22 @@ static int pci_ecam_register_mmio_handler(struct domain *d,
>> return 0;
>> }
>> +static int pci_ecam_need_p2m_mapping(struct domain *d,
>> + struct pci_host_bridge *bridge,
>> + uint64_t addr, uint64_t len)
>> +{
>> + struct pci_config_window *cfg = bridge->sysdata;
>> +
>> + if ( !is_hardware_domain(d) )
>> + return true;
>
> I am a bit puzzled with this check. If the ECAM has been initialized by Xen, then I believe we cannot expose it to any domain at all.
You are right, this check needs to be removed, as this function is only called for Domain-0.
>
>> +
>> + /*
>> + * We do not want ECAM address space to be mapped in domain's p2m,
>> + * so we can trap access to it.
>> + */
>> + return cfg->phys_addr != addr;
>> +}
>> +
>> /* ECAM ops */
>> const struct pci_ecam_ops pci_generic_ecam_ops = {
>> .bus_shift = 20,
>> @@ -60,6 +76,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>> .read = pci_generic_config_read,
>> .write = pci_generic_config_write,
>> .register_mmio_handler = pci_ecam_register_mmio_handler,
>> + .need_p2m_mapping = pci_ecam_need_p2m_mapping,
>> }
>> };
>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>> index a89112bfbb7c..c04be636452d 100644
>> --- a/xen/arch/arm/pci/pci-host-common.c
>> +++ b/xen/arch/arm/pci/pci-host-common.c
>> @@ -334,6 +334,28 @@ int pci_host_iterate_bridges(struct domain *d,
>> }
>> return 0;
>> }
>> +
>> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
>> + const struct dt_device_node *node,
>> + uint64_t addr, uint64_t len)
>> +{
>> + struct pci_host_bridge *bridge;
>> +
>> + list_for_each_entry( bridge, &pci_host_bridges, node )
>> + {
>> + if ( bridge->dt_node != node )
>> + continue;
>> +
>> + if ( !bridge->ops->need_p2m_mapping )
>> + return true;
>> +
>> + return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
>> + }
>> + printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
>> + node->full_name, bridge->segment, addr);
>> + return true;
>> +}
>
> If you really need to map the hostbridges, then I would suggest to defer the P2M mappings for all of them and then walk all the bridge in one go to do the mappings.
>
> This would avoid going through the bridges every time during setup.
Well, this can be done, but my implementation prevents the mappings in the
first place, while what you suggest would require unmapping. So the cost in my
solution is that we need to go over all the bridges (until we find the one we
need), whereas with your approach we would need to unmap what we have just mapped.
I think preventing unneeded mappings is cheaper than unmapping them afterwards.
>
>> +
>> /*
>> * Local variables:
>> * mode: C
>> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
>> index 2c7c7649e00f..9c28a4bdc4b7 100644
>> --- a/xen/include/asm-arm/pci.h
>> +++ b/xen/include/asm-arm/pci.h
>> @@ -82,6 +82,8 @@ struct pci_ops {
>> int (*register_mmio_handler)(struct domain *d,
>> struct pci_host_bridge *bridge,
>> const struct mmio_handler_ops *ops);
>> + int (*need_p2m_mapping)(struct domain *d, struct pci_host_bridge *bridge,
>> + uint64_t addr, uint64_t len);
>> };
>> /*
>> @@ -115,9 +117,19 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
>> int pci_host_iterate_bridges(struct domain *d,
>> int (*clb)(struct domain *d,
>> struct pci_host_bridge *bridge));
>> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
>> + const struct dt_device_node *node,
>> + uint64_t addr, uint64_t len);
>> #else /*!CONFIG_HAS_PCI*/
>> struct arch_pci_dev { };
>> +static inline bool
>> +pci_host_bridge_need_p2m_mapping(struct domain *d,
>> + const struct dt_device_node *node,
>> + uint64_t addr, uint64_t len)
>> +{
>> + return true;
>> +}
>> #endif /*!CONFIG_HAS_PCI*/
>> #endif /* __ARM_PCI_H__ */
>>
>
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 11:43 ` Oleksandr Andrushchenko
@ 2021-09-10 13:04 ` Julien Grall
2021-09-10 13:15 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 13:04 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10/09/2021 12:43, Oleksandr Andrushchenko wrote:
> Hi, Julien!
Hi Oleksandr,
> On 09.09.21 20:43, Julien Grall wrote:
>> Hi Oleksandr,
>>
>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> In order vPCI to work it needs all access to PCI configuration space
>>> (ECAM) to be synchronized among all entities, e.g. hardware domain and
>>> guests.
>>
>> I am not entirely sure what you mean by "synchronized" here. Are you refer to the content of the configuration space?
>
> We maintain hwdom's and guest's view of the registers we are interested in
>
> So, to have hwdom's view we also need to trap its access to the configuration space.
>
> Probably "synchronized" is not the right wording here.
I would simply say that we want to expose an emulated hostbridge to the
dom0 so we need to unmap the configuration space.
>
>>
>>> For that implement PCI host bridge specific callbacks to
>>> properly setup those ranges depending on particular host bridge
>>> implementation.
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>> ---
>>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>>> xen/arch/arm/vpci.c | 13 +++++++++++++
>>> xen/include/asm-arm/pci.h | 8 ++++++++
>>> 4 files changed, 48 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>> index 91c691b41fdf..92ecb2e0762b 100644
>>> --- a/xen/arch/arm/pci/ecam.c
>>> +++ b/xen/arch/arm/pci/ecam.c
>>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>>> }
>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>> + struct pci_host_bridge *bridge,
>>> + const struct mmio_handler_ops *ops)
>>> +{
>>> + struct pci_config_window *cfg = bridge->sysdata;
>>> +
>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>
>> We have a fixed array for handling the MMIO handlers.
>
> Didn't know that:
>
> xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
>
>> So you need to make sure we have enough space in it to store one handler per handler.
>>
>> This is quite similar to the problem we had with the re-distribuor on GICv3. Have a look there to see how we dealt with it.
>
> Could you please point me to that solution? I can only see
>
> /* Register mmio handle for the Distributor */
> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
> SZ_64K, NULL);
>
> /*
> * Register mmio handler per contiguous region occupied by the
> * redistributors. The handler will take care to choose which
> * redistributor is targeted.
> */
> for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
> {
> struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
>
> register_mmio_handler(d, &vgic_rdistr_mmio_handler,
> region->base, region->size, region);
> }
> which IMO doesn't care about the number of MMIOs we can handle
Please see vgic_v3_init(). We update mmio_count, which is then used as
the second argument for domain_io_init().
Cheers,
--
Julien Grall
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 13:04 ` Julien Grall
@ 2021-09-10 13:15 ` Oleksandr Andrushchenko
2021-09-10 13:20 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 13:15 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi, Julien!
On 10.09.21 16:04, Julien Grall wrote:
>
>
> On 10/09/2021 12:43, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>
> Hi Oleksandr,
>
>> On 09.09.21 20:43, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> In order vPCI to work it needs all access to PCI configuration space
>>>> (ECAM) to be synchronized among all entities, e.g. hardware domain and
>>>> guests.
>>>
>>> I am not entirely sure what you mean by "synchronized" here. Are you refer to the content of the configuration space?
>>
>> We maintain hwdom's and guest's view of the registers we are interested in
>>
>> So, to have hwdom's view we also need to trap its access to the configuration space.
>>
>> Probably "synchronized" is not the right wording here.
> I would simply say that we want to expose an emulated hostbridge to the dom0 so we need to unmap the configuration space.
Sounds good
>
>>
>>>
>>>> For that implement PCI host bridge specific callbacks to
>>>> properly setup those ranges depending on particular host bridge
>>>> implementation.
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>> ---
>>>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>>>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>>>> xen/arch/arm/vpci.c | 13 +++++++++++++
>>>> xen/include/asm-arm/pci.h | 8 ++++++++
>>>> 4 files changed, 48 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>> index 91c691b41fdf..92ecb2e0762b 100644
>>>> --- a/xen/arch/arm/pci/ecam.c
>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>>>> }
>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>> + struct pci_host_bridge *bridge,
>>>> + const struct mmio_handler_ops *ops)
>>>> +{
>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>> +
>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>
>>> We have a fixed array for handling the MMIO handlers.
>>
>> Didn't know that:
>>
>> xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
>>
>>> So you need to make sure we have enough space in it to store one handler per handler.
>>>
>>> This is quite similar to the problem we had with the re-distribuor on GICv3. Have a look there to see how we dealt with it.
>>
>> Could you please point me to that solution? I can only see
>>
>> /* Register mmio handle for the Distributor */
>> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
>> SZ_64K, NULL);
>>
>> /*
>> * Register mmio handler per contiguous region occupied by the
>> * redistributors. The handler will take care to choose which
>> * redistributor is targeted.
>> */
>> for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
>> {
>> struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
>>
>> register_mmio_handler(d, &vgic_rdistr_mmio_handler,
>> region->base, region->size, region);
>> }
>> which IMO doesn't care about the number of MMIOs we can handle
>
> Please see vgic_v3_init(). We update mmio_count that is then used as a the second argument for domain_io_init().
Ah, so
1) This needs to be done before the array for the handlers is allocated
2) How do I know at the time of 1) how many bridges we have?
>
> Cheers,
>
Thank you,
Oleksandr
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 12:37 ` Oleksandr Andrushchenko
@ 2021-09-10 13:18 ` Julien Grall
2021-09-10 14:01 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 13:18 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
> Hi, Julien!
Hi Oleksandr,
> On 09.09.21 20:58, Julien Grall wrote:
>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>
>>> Host bridge controller's ECAM space is mapped into Domain-0's p2m,
>>> thus it is not possible to trap the same for vPCI via MMIO handlers.
>>> For this to work we need to not map those while constructing the domain.
>>>
>>> Note, that during Domain-0 creation there is no pci_dev yet allocated for
>>> host bridges, thus we cannot match PCI host and its associated
>>> bridge by SBDF. Use dt_device_node field for checks instead.
>>>
>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>> ---
>>> xen/arch/arm/domain_build.c | 3 +++
>>> xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
>>> xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
>>> xen/include/asm-arm/pci.h | 12 ++++++++++++
>>> 4 files changed, 54 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index da427f399711..76f5b513280c 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>> }
>>> }
>>> +    if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
>>> +        need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
>>
>> AFAICT, with device_get_class(dev), you know whether the hostbridge is used by Xen. Therefore, I would expect that we don't want to map all the regions of the hostbridges in dom0 (including the BARs).
>>
>> Can you clarify it?
> We only want to trap ECAM, not MMIOs and any other memory regions as the bridge is
>
> initialized and used by Domain-0 completely.
What do you mean by "used by Domain-0 completely"? The hostbridge is
owned by Xen so I don't think we can let dom0 access any MMIO regions by
default.
In particular, we may want to hide a device from dom0 for security
reasons. This is not going to be possible if you map by default
everything to dom0.
Instead, the BARs should be mapped on demand when we trap dom0's
access to the configuration space.
For other regions, could you provide an example of what you are
referring to?
>>> +
>>> + /*
>>> + * We do not want ECAM address space to be mapped in domain's p2m,
>>> + * so we can trap access to it.
>>> + */
>>> + return cfg->phys_addr != addr;
>>> +}
>>> +
>>> /* ECAM ops */
>>> const struct pci_ecam_ops pci_generic_ecam_ops = {
>>> .bus_shift = 20,
>>> @@ -60,6 +76,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>> .read = pci_generic_config_read,
>>> .write = pci_generic_config_write,
>>> .register_mmio_handler = pci_ecam_register_mmio_handler,
>>> + .need_p2m_mapping = pci_ecam_need_p2m_mapping,
>>> }
>>> };
>>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>> index a89112bfbb7c..c04be636452d 100644
>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>> @@ -334,6 +334,28 @@ int pci_host_iterate_bridges(struct domain *d,
>>> }
>>> return 0;
>>> }
>>> +
>>> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
>>> + const struct dt_device_node *node,
>>> + uint64_t addr, uint64_t len)
>>> +{
>>> + struct pci_host_bridge *bridge;
>>> +
>>> + list_for_each_entry( bridge, &pci_host_bridges, node )
>>> + {
>>> + if ( bridge->dt_node != node )
>>> + continue;
>>> +
>>> + if ( !bridge->ops->need_p2m_mapping )
>>> + return true;
>>> +
>>> + return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
>>> + }
>>> + printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
>>> + node->full_name, bridge->segment, addr);
>>> + return true;
>>> +}
>>
>> If you really need to map the hostbridges, then I would suggest to defer the P2M mappings for all of them and then walk all the bridge in one go to do the mappings.
>>
>> This would avoid going through the bridges every time during setup.
>
> Well, this can be done, but: my implementation prevents mappings and what
>
> you suggest will require unmapping. So, the cost in my solution is we need
>
> to go over all the bridges (until we find the one we need) and in what you
>
> suggest we'll need to unmap what we have just mapped.
>
> I think preventing unneeded mappings is cheaper than unmapping afterwards.
I think you misunderstood what I am suggesting. What I said is you could
defer the mappings (IOW not do the mapping) for the hostbridges until
later. Then you can walk all the hostbridges to decide how to map them.
The regions will only be mapped once and never be unmapped.
Cheers,
--
Julien Grall
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 13:15 ` Oleksandr Andrushchenko
@ 2021-09-10 13:20 ` Julien Grall
2021-09-10 13:27 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 13:20 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10/09/2021 14:15, Oleksandr Andrushchenko wrote:
> Hi, Julien!
Hi,
> On 10.09.21 16:04, Julien Grall wrote:
>>
>>
>> On 10/09/2021 12:43, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>
>> Hi Oleksandr,
>>
>>> On 09.09.21 20:43, Julien Grall wrote:
>>>> Hi Oleksandr,
>>>>
>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> In order vPCI to work it needs all access to PCI configuration space
>>>>> (ECAM) to be synchronized among all entities, e.g. hardware domain and
>>>>> guests.
>>>>
>>>> I am not entirely sure what you mean by "synchronized" here. Are you refer to the content of the configuration space?
>>>
>>> We maintain hwdom's and guest's view of the registers we are interested in
>>>
>>> So, to have hwdom's view we also need to trap its access to the configuration space.
>>>
>>> Probably "synchronized" is not the right wording here.
>> I would simply say that we want to expose an emulated hostbridge to the dom0 so we need to unmap the configuration space.
> Sounds good
>>
>>>
>>>>
>>>>> For that implement PCI host bridge specific callbacks to
>>>>> properly setup those ranges depending on particular host bridge
>>>>> implementation.
>>>>>
>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>> ---
>>>>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>>>>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>>>>> xen/arch/arm/vpci.c | 13 +++++++++++++
>>>>> xen/include/asm-arm/pci.h | 8 ++++++++
>>>>> 4 files changed, 48 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>>> index 91c691b41fdf..92ecb2e0762b 100644
>>>>> --- a/xen/arch/arm/pci/ecam.c
>>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>>>>> }
>>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>>> + struct pci_host_bridge *bridge,
>>>>> + const struct mmio_handler_ops *ops)
>>>>> +{
>>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>>> +
>>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>>
>>>> We have a fixed array for handling the MMIO handlers.
>>>
>>> Didn't know that:
>>>
>>> xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
>>>
>>>> So you need to make sure we have enough space in it to store one handler per handler.
>>>>
>>>> This is quite similar to the problem we had with the re-distribuor on GICv3. Have a look there to see how we dealt with it.
>>>
>>> Could you please point me to that solution? I can only see
>>>
>>> /* Register mmio handle for the Distributor */
>>> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
>>> SZ_64K, NULL);
>>>
>>> /*
>>> * Register mmio handler per contiguous region occupied by the
>>> * redistributors. The handler will take care to choose which
>>> * redistributor is targeted.
>>> */
>>> for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
>>> {
>>> struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
>>>
>>> register_mmio_handler(d, &vgic_rdistr_mmio_handler,
>>> region->base, region->size, region);
>>> }
>>> which IMO doesn't care about the number of MMIOs we can handle
>>
>> Please see vgic_v3_init(). We update mmio_count that is then used as a the second argument for domain_io_init().
>
> Ah, so
>
> 1) This needs to be done before the array for the handlers is allocated
>
> 2) How do I know at the time of 1) how many bridges we have?
By counting the number of bridges you want to expose to dom0? I am not
entirely sure what else you expect me to say.
Cheers,
--
Julien Grall
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 13:20 ` Julien Grall
@ 2021-09-10 13:27 ` Oleksandr Andrushchenko
2021-09-10 13:33 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 13:27 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi,
On 10.09.21 16:20, Julien Grall wrote:
>
>
> On 10/09/2021 14:15, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>
> Hi,
>
>> On 10.09.21 16:04, Julien Grall wrote:
>>>
>>>
>>> On 10/09/2021 12:43, Oleksandr Andrushchenko wrote:
>>>> Hi, Julien!
>>>
>>> Hi Oleksandr,
>>>
>>>> On 09.09.21 20:43, Julien Grall wrote:
>>>>> Hi Oleksandr,
>>>>>
>>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>
>>>>>> In order vPCI to work it needs all access to PCI configuration space
>>>>>> (ECAM) to be synchronized among all entities, e.g. hardware domain and
>>>>>> guests.
>>>>>
>>>>> I am not entirely sure what you mean by "synchronized" here. Are you refer to the content of the configuration space?
>>>>
>>>> We maintain hwdom's and guest's view of the registers we are interested in
>>>>
>>>> So, to have hwdom's view we also need to trap its access to the configuration space.
>>>>
>>>> Probably "synchronized" is not the right wording here.
>>> I would simply say that we want to expose an emulated hostbridge to the dom0 so we need to unmap the configuration space.
>> Sounds good
>>>
>>>>
>>>>>
>>>>>> For that implement PCI host bridge specific callbacks to
>>>>>> properly setup those ranges depending on particular host bridge
>>>>>> implementation.
>>>>>>
>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>> ---
>>>>>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>>>>>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>>>>>> xen/arch/arm/vpci.c | 13 +++++++++++++
>>>>>> xen/include/asm-arm/pci.h | 8 ++++++++
>>>>>> 4 files changed, 48 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>>>> index 91c691b41fdf..92ecb2e0762b 100644
>>>>>> --- a/xen/arch/arm/pci/ecam.c
>>>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>>>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>>>>>> }
>>>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>>>> + struct pci_host_bridge *bridge,
>>>>>> + const struct mmio_handler_ops *ops)
>>>>>> +{
>>>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>>>> +
>>>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>>>
>>>>> We have a fixed array for handling the MMIO handlers.
>>>>
>>>> Didn't know that:
>>>>
>>>> xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
>>>>
>>>>> So you need to make sure we have enough space in it to store one handler per handler.
>>>>>
>>>>> This is quite similar to the problem we had with the re-distribuor on GICv3. Have a look there to see how we dealt with it.
>>>>
>>>> Could you please point me to that solution? I can only see
>>>>
>>>> /* Register mmio handle for the Distributor */
>>>> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
>>>> SZ_64K, NULL);
>>>>
>>>> /*
>>>> * Register mmio handler per contiguous region occupied by the
>>>> * redistributors. The handler will take care to choose which
>>>> * redistributor is targeted.
>>>> */
>>>> for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
>>>> {
>>>> struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
>>>>
>>>> register_mmio_handler(d, &vgic_rdistr_mmio_handler,
>>>> region->base, region->size, region);
>>>> }
>>>> which IMO doesn't care about the number of MMIOs we can handle
>>>
>>> Please see vgic_v3_init(). We update mmio_count that is then used as a the second argument for domain_io_init().
>>
>> Ah, so
>>
>> 1) This needs to be done before the array for the handlers is allocated
>>
>> 2) How do I know at the time of 1) how many bridges we have?
>
> By counting the number of bridge you want to expose to dom0? I am not entirely sure what else you expect me to say.
Ok, so I'll go over the device tree and find all the bridges, i.e. devices with DEVICE_PCI type.
Then I'll also need to exclude those being passed through (xen,passthrough), and the rest are the bridges for Domain-0?
Is this what you mean?
>
> Cheers,
>
Thank you,
Oleksandr
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 13:27 ` Oleksandr Andrushchenko
@ 2021-09-10 13:33 ` Julien Grall
2021-09-10 13:40 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 13:33 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10/09/2021 14:27, Oleksandr Andrushchenko wrote:
> Hi,
>
> On 10.09.21 16:20, Julien Grall wrote:
>>
>>
>> On 10/09/2021 14:15, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>
>> Hi,
>>
>>> On 10.09.21 16:04, Julien Grall wrote:
>>>>
>>>>
>>>> On 10/09/2021 12:43, Oleksandr Andrushchenko wrote:
>>>>> Hi, Julien!
>>>>
>>>> Hi Oleksandr,
>>>>
>>>>> On 09.09.21 20:43, Julien Grall wrote:
>>>>>> Hi Oleksandr,
>>>>>>
>>>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>>
>>>>>>> In order vPCI to work it needs all access to PCI configuration space
>>>>>>> (ECAM) to be synchronized among all entities, e.g. hardware domain and
>>>>>>> guests.
>>>>>>
>>>>>> I am not entirely sure what you mean by "synchronized" here. Are you refer to the content of the configuration space?
>>>>>
>>>>> We maintain hwdom's and guest's view of the registers we are interested in
>>>>>
>>>>> So, to have hwdom's view we also need to trap its access to the configuration space.
>>>>>
>>>>> Probably "synchronized" is not the right wording here.
>>>> I would simply say that we want to expose an emulated hostbridge to the dom0 so we need to unmap the configuration space.
>>> Sounds good
>>>>
>>>>>
>>>>>>
>>>>>>> For that implement PCI host bridge specific callbacks to
>>>>>>> properly setup those ranges depending on particular host bridge
>>>>>>> implementation.
>>>>>>>
>>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>> ---
>>>>>>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>>>>>>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>>>>>>> xen/arch/arm/vpci.c | 13 +++++++++++++
>>>>>>> xen/include/asm-arm/pci.h | 8 ++++++++
>>>>>>> 4 files changed, 48 insertions(+)
>>>>>>>
>>>>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>>>>> index 91c691b41fdf..92ecb2e0762b 100644
>>>>>>> --- a/xen/arch/arm/pci/ecam.c
>>>>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>>>>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>>>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>>>>>>> }
>>>>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>>>>> + struct pci_host_bridge *bridge,
>>>>>>> + const struct mmio_handler_ops *ops)
>>>>>>> +{
>>>>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>>>>> +
>>>>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>>>>
>>>>>> We have a fixed array for handling the MMIO handlers.
>>>>>
>>>>> Didn't know that:
>>>>>
>>>>> xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
>>>>>
>>>>>> So you need to make sure we have enough space in it to store one handler per handler.
>>>>>>
>>>>>> This is quite similar to the problem we had with the re-distribuor on GICv3. Have a look there to see how we dealt with it.
>>>>>
>>>>> Could you please point me to that solution? I can only see
>>>>>
>>>>> /* Register mmio handle for the Distributor */
>>>>> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
>>>>> SZ_64K, NULL);
>>>>>
>>>>> /*
>>>>> * Register mmio handler per contiguous region occupied by the
>>>>> * redistributors. The handler will take care to choose which
>>>>> * redistributor is targeted.
>>>>> */
>>>>> for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
>>>>> {
>>>>> struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
>>>>>
>>>>> register_mmio_handler(d, &vgic_rdistr_mmio_handler,
>>>>> region->base, region->size, region);
>>>>> }
>>>>> which IMO doesn't care about the number of MMIOs we can handle
>>>>
>>>> Please see vgic_v3_init(). We update mmio_count that is then used as a the second argument for domain_io_init().
>>>
>>> Ah, so
>>>
>>> 1) This needs to be done before the array for the handlers is allocated
>>>
>>> 2) How do I know at the time of 1) how many bridges we have?
>>
>> By counting the number of bridge you want to expose to dom0? I am not entirely sure what else you expect me to say.
>
> Ok, so I'll go over the device tree and find out all the bridges, e.g. devices with DEVICE_PCI type.
>
> Then I'll also need to exclude those being passed through (xen,passthrough) and the rest are the bridges for Domain-0?
What you want to know is how many times register_mmio_handler() will be
called from domain_vpci_init().
You introduced a function pci_host_iterate_bridges() that walks over
the bridges and then calls the callback vpci_setup_mmio_handler(). So you
could introduce a new callback that returns 1 if
bridge->ops->register_mmio_handler is not NULL and 0 otherwise.
Cheers,
--
Julien Grall
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 13:33 ` Julien Grall
@ 2021-09-10 13:40 ` Oleksandr Andrushchenko
2021-09-14 13:47 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 13:40 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10.09.21 16:33, Julien Grall wrote:
>
>
> On 10/09/2021 14:27, Oleksandr Andrushchenko wrote:
>> Hi,
>>
>> On 10.09.21 16:20, Julien Grall wrote:
>>>
>>>
>>> On 10/09/2021 14:15, Oleksandr Andrushchenko wrote:
>>>> Hi, Julien!
>>>
>>> Hi,
>>>
>>>> On 10.09.21 16:04, Julien Grall wrote:
>>>>>
>>>>>
>>>>> On 10/09/2021 12:43, Oleksandr Andrushchenko wrote:
>>>>>> Hi, Julien!
>>>>>
>>>>> Hi Oleksandr,
>>>>>
>>>>>> On 09.09.21 20:43, Julien Grall wrote:
>>>>>>> Hi Oleksandr,
>>>>>>>
>>>>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>>>
>>>>>>>> In order vPCI to work it needs all access to PCI configuration space
>>>>>>>> (ECAM) to be synchronized among all entities, e.g. hardware domain and
>>>>>>>> guests.
>>>>>>>
>>>>>>> I am not entirely sure what you mean by "synchronized" here. Are you referring to the content of the configuration space?
>>>>>>
>>>>>> We maintain hwdom's and guest's view of the registers we are interested in
>>>>>>
>>>>>> So, to have hwdom's view we also need to trap its access to the configuration space.
>>>>>>
>>>>>> Probably "synchronized" is not the right wording here.
>>>>> I would simply say that we want to expose an emulated hostbridge to the dom0 so we need to unmap the configuration space.
>>>> Sounds good
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> For that implement PCI host bridge specific callbacks to
>>>>>>>> properly setup those ranges depending on particular host bridge
>>>>>>>> implementation.
>>>>>>>>
>>>>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>>>> ---
>>>>>>>> xen/arch/arm/pci/ecam.c | 11 +++++++++++
>>>>>>>> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
>>>>>>>> xen/arch/arm/vpci.c | 13 +++++++++++++
>>>>>>>> xen/include/asm-arm/pci.h | 8 ++++++++
>>>>>>>> 4 files changed, 48 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
>>>>>>>> index 91c691b41fdf..92ecb2e0762b 100644
>>>>>>>> --- a/xen/arch/arm/pci/ecam.c
>>>>>>>> +++ b/xen/arch/arm/pci/ecam.c
>>>>>>>> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
>>>>>>>> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
>>>>>>>> }
>>>>>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>>>>>> + struct pci_host_bridge *bridge,
>>>>>>>> + const struct mmio_handler_ops *ops)
>>>>>>>> +{
>>>>>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>>>>>> +
>>>>>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>>>>>
>>>>>>> We have a fixed array for handling the MMIO handlers.
>>>>>>
>>>>>> Didn't know that:
>>>>>>
>>>>>> xen/include/asm-arm/mmio.h:27:#define MAX_IO_HANDLER 16
>>>>>>
>>>>>>> So you need to make sure we have enough space in it to store one handler per bridge.
>>>>>>>
>>>>>>> This is quite similar to the problem we had with the re-distribuor on GICv3. Have a look there to see how we dealt with it.
>>>>>>
>>>>>> Could you please point me to that solution? I can only see
>>>>>>
>>>>>> /* Register mmio handle for the Distributor */
>>>>>> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
>>>>>> SZ_64K, NULL);
>>>>>>
>>>>>> /*
>>>>>> * Register mmio handler per contiguous region occupied by the
>>>>>> * redistributors. The handler will take care to choose which
>>>>>> * redistributor is targeted.
>>>>>> */
>>>>>> for ( i = 0; i < d->arch.vgic.nr_regions; i++ )
>>>>>> {
>>>>>> struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i];
>>>>>>
>>>>>> register_mmio_handler(d, &vgic_rdistr_mmio_handler,
>>>>>> region->base, region->size, region);
>>>>>> }
>>>>>> which IMO doesn't care about the number of MMIOs we can handle
>>>>>
>>>>> Please see vgic_v3_init(). We update mmio_count that is then used as the second argument for domain_io_init().
>>>>
>>>> Ah, so
>>>>
>>>> 1) This needs to be done before the array for the handlers is allocated
>>>>
>>>> 2) How do I know at the time of 1) how many bridges we have?
>>>
>>> By counting the number of bridges you want to expose to dom0? I am not entirely sure what else you expect me to say.
>>
>> Ok, so I'll go over the device tree and find out all the bridges, e.g. devices with DEVICE_PCI type.
>>
>> Then I'll also need to exclude those being passed through (xen,passthrough) and the rest are the bridges for Domain-0?
>
> What you want to know is how many times register_mmio_handler() will be called from domain_vpci_init().
>
> You introduced a function pci_host_iterate_bridges() that will walk over the bridges and then call the callback vpci_setup_mmio_handler(). So you could introduce a new callback that returns 1 if bridge->ops->register_mmio_handler is not NULL and 0 otherwise.
Ok, clear. Something like:
    if ( (rc = domain_vgic_register(d, &count)) != 0 )
        goto fail;

    /* find out how many bridges and update count */

    if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
        goto fail;
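For illustration, the counting callback discussed above could look roughly like the sketch below. The types here are simplified stand-ins for Xen's real struct pci_host_bridge and its ops table (the real ones live in asm-arm/pci.h and use list_head linkage), so treat this as a model of the idea, not the actual implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's types; not the real definitions. */
struct pci_host_bridge;

struct pci_host_bridge_ops {
    /* Placeholder signature; the real callback takes domain/ops arguments. */
    int (*register_mmio_handler)(void);
};

struct pci_host_bridge {
    const struct pci_host_bridge_ops *ops;
    struct pci_host_bridge *next;   /* stand-in for the list_head linkage */
};

/*
 * Callback in the spirit of the suggestion: report 1 for a bridge that
 * will register an MMIO handler, 0 for one that won't.
 */
static int bridge_counts_one_handler(const struct pci_host_bridge *bridge)
{
    return bridge->ops->register_mmio_handler != NULL;
}

/*
 * Mimics pci_host_iterate_bridges(): sum the callback's result over all
 * bridges.  The total can then be added to the count handed to
 * domain_io_init(), the same way vgic_v3_init() updates mmio_count.
 */
static unsigned int count_mmio_handlers(const struct pci_host_bridge *head)
{
    unsigned int count = 0;

    for ( ; head != NULL; head = head->next )
        count += bridge_counts_one_handler(head);

    return count;
}
```

With such a helper, the MMIO handler array can be sized before any register_mmio_handler() call is made, avoiding overflow of the fixed MAX_IO_HANDLER budget.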
>
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 13:18 ` Julien Grall
@ 2021-09-10 14:01 ` Oleksandr Andrushchenko
2021-09-10 14:18 ` Julien Grall
2021-09-10 15:04 ` Julien Grall
0 siblings, 2 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 14:01 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10.09.21 16:18, Julien Grall wrote:
>
>
> On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>
> Hi Oleksandr,
>
>> On 09.09.21 20:58, Julien Grall wrote:
>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>
>>>> Host bridge controller's ECAM space is mapped into Domain-0's p2m,
>>>> thus it is not possible to trap the same for vPCI via MMIO handlers.
>>>> For this to work we need to not map those while constructing the domain.
>>>>
>>>> Note, that during Domain-0 creation there is no pci_dev yet allocated for
>>>> host bridges, thus we cannot match PCI host and its associated
>>>> bridge by SBDF. Use dt_device_node field for checks instead.
>>>>
>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>> ---
>>>> xen/arch/arm/domain_build.c | 3 +++
>>>> xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
>>>> xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
>>>> xen/include/asm-arm/pci.h | 12 ++++++++++++
>>>> 4 files changed, 54 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>> index da427f399711..76f5b513280c 100644
>>>> --- a/xen/arch/arm/domain_build.c
>>>> +++ b/xen/arch/arm/domain_build.c
>>>> @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>> }
>>>> }
>>>>>>>> + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
>>>>>>>> + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
>>>
>>> AFAICT, with device_get_class(dev), you know whether the hostbridge is used by Xen. Therefore, I would expect that we don't want to map all the regions of the hostbridges in dom0 (including the BARs).
>>>
>>> Can you clarify it?
>> We only want to trap ECAM, not MMIOs and any other memory regions as the bridge is
>>
>> initialized and used by Domain-0 completely.
>
> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
> default.
Now it's my turn to ask: why do you think Xen owns the bridge?
All the bridges are passed to Domain-0, are fully visible to Domain-0, and are initialized in Domain-0.
Enumeration etc. is done in Domain-0. So how come Xen owns the bridge? In what way does it?
Xen just accesses the ECAM when it needs it, but that's it. Xen traps ECAM access - that is true.
But it in no way uses the MMIOs etc. of the bridge - those are under direct control of Domain-0
>
> In particular, we may want to hide a device from dom0 for security reasons. This is not going to be possible if you map by default everything to dom0.
Then the bridge will most probably become unusable, as we do not have the relevant drivers in Xen.
At best we may rely on the firmware doing the initialization for us; then yes, we can control an ECAM bridge in Xen.
But this is not the case now: we rely on Domain-0 to initialize and set up the bridge
>
> Instead, the BARs should be mapped on demand for dom0 when we trap access to the configuration space.
>
> For other regions, could you provide an example of what you are referring too?
Registers of the bridge, for example: everything listed in the "reg" property
>
>>>> +
>>>> + /*
>>>> + * We do not want ECAM address space to be mapped in domain's p2m,
>>>> + * so we can trap access to it.
>>>> + */
>>>> + return cfg->phys_addr != addr;
>>>> +}
>>>> +
>>>> /* ECAM ops */
>>>> const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>> .bus_shift = 20,
>>>> @@ -60,6 +76,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>> .read = pci_generic_config_read,
>>>> .write = pci_generic_config_write,
>>>> .register_mmio_handler = pci_ecam_register_mmio_handler,
>>>> + .need_p2m_mapping = pci_ecam_need_p2m_mapping,
>>>> }
>>>> };
>>>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>>> index a89112bfbb7c..c04be636452d 100644
>>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>>> @@ -334,6 +334,28 @@ int pci_host_iterate_bridges(struct domain *d,
>>>> }
>>>> return 0;
>>>> }
>>>> +
>>>> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
>>>> + const struct dt_device_node *node,
>>>> + uint64_t addr, uint64_t len)
>>>> +{
>>>> + struct pci_host_bridge *bridge;
>>>> +
>>>> + list_for_each_entry( bridge, &pci_host_bridges, node )
>>>> + {
>>>> + if ( bridge->dt_node != node )
>>>> + continue;
>>>> +
>>>> + if ( !bridge->ops->need_p2m_mapping )
>>>> + return true;
>>>> +
>>>> + return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
>>>> + }
>>>> + printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
>>>> + node->full_name, bridge->segment, addr);
>>>> + return true;
>>>> +}
>>>
>>> If you really need to map the hostbridges, then I would suggest to defer the P2M mappings for all of them and then walk all the bridge in one go to do the mappings.
>>>
>>> This would avoid going through the bridges every time during setup.
>>
>> Well, this can be done, but: my implementation prevents mappings and what
>>
>> you suggest will require unmapping. So, the cost in my solution is we need
>>
>> to go over all the bridges (until we find the one we need) and in what you
>>
>> suggest we'll need to unmap what we have just mapped.
>>
>> I think preventing unneeded mappings is cheaper than unmapping afterwards.
>
> I think you misunderstood what I am suggesting. What I said is you could defer the mappings (IOW not do the mapping) until later for the hostbridges.
For each device tree device we call

    static int __init map_range_to_domain(const struct dt_device_node *dev,
                                          u64 addr, u64 len,
                                          void *data)

which will call

    res = map_regions_p2mt(d,

So the ECAM will be mapped and then we'll need to unmap it
> And then you can walk all the hostbridges to decide how to map them.
We don't want to map ECAM, we want to trap it
>
> The regions will only mapped once and never be unmapped.
>
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 14:01 ` Oleksandr Andrushchenko
@ 2021-09-10 14:18 ` Julien Grall
2021-09-10 14:38 ` Oleksandr Andrushchenko
2021-09-10 15:04 ` Julien Grall
1 sibling, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 14:18 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10/09/2021 15:01, Oleksandr Andrushchenko wrote:
>
> On 10.09.21 16:18, Julien Grall wrote:
>>
>>
>> On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>
>> Hi Oleksandr,
>>
>>> On 09.09.21 20:58, Julien Grall wrote:
>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> Host bridge controller's ECAM space is mapped into Domain-0's p2m,
>>>>> thus it is not possible to trap the same for vPCI via MMIO handlers.
>>>>> For this to work we need to not map those while constructing the domain.
>>>>>
>>>>> Note, that during Domain-0 creation there is no pci_dev yet allocated for
>>>>> host bridges, thus we cannot match PCI host and its associated
>>>>> bridge by SBDF. Use dt_device_node field for checks instead.
>>>>>
>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>> ---
>>>>> xen/arch/arm/domain_build.c | 3 +++
>>>>> xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
>>>>> xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
>>>>> xen/include/asm-arm/pci.h | 12 ++++++++++++
>>>>> 4 files changed, 54 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>> index da427f399711..76f5b513280c 100644
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>>> }
>>>>> }
>>>>>>>>> + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
>>>>>>>>> + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
>>>>
>>>> AFAICT, with device_get_class(dev), you know whether the hostbridge is used by Xen. Therefore, I would expect that we don't want to map all the regions of the hostbridges in dom0 (including the BARs).
>>>>
>>>> Can you clarify it?
>>> We only want to trap ECAM, not MMIOs and any other memory regions as the bridge is
>>>
>>> initialized and used by Domain-0 completely.
>>
>> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
>> default.
>
> Now it's my time to ask why do you think Xen owns the bridge?
Because nothing in this series indicates either way. So I assumed this
should be owned by Xen because it will drive it.
From what you wrote below, it sounds like this is not the case. I think
you want to have a design document sent along with the series so we can
easily know the expected design and validate that the code matches
the agreement.
There was already a design document sent a few months ago. So you may
want to refresh it (if needed) and send it again with this series.
>
> All the bridges are passed to Domain-0, are fully visible to Domain-0, initialized in Domain-0.
>
> Enumeration etc. is done in Domain-0. So how comes that Xen owns the bridge? In what way it does?
>
> Xen just accesses the ECAM when it needs it, but that's it. Xen traps ECAM access - that is true.
>
> But it in no way uses MMIOs etc. of the bridge - those under direct control of Domain-0
>
>>
>> In particular, we may want to hide a device from dom0 for security reasons. This is not going to be possible if you map by default everything to dom0.
>
> Then the bridge most probably will become unusable as we do not have relevant drivers in Xen.
>
> At best we may rely on the firmware doing the initialization for us, then yes, we can control an ECAM bridge in Xen.
>
> But this is not the case now: we rely on Domain-0 to initialize and setup the bridge
>
>>
>> Instead, the BARs should be mapped on demand for dom0 when we trap access to the configuration space.
>>
>> For other regions, could you provide an example of what you are referring too?
> Registers of the bridge for example, all what is listed in "reg" property
>>
>>>>> +
>>>>> + /*
>>>>> + * We do not want ECAM address space to be mapped in domain's p2m,
>>>>> + * so we can trap access to it.
>>>>> + */
>>>>> + return cfg->phys_addr != addr;
>>>>> +}
>>>>> +
>>>>> /* ECAM ops */
>>>>> const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>>> .bus_shift = 20,
>>>>> @@ -60,6 +76,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
>>>>> .read = pci_generic_config_read,
>>>>> .write = pci_generic_config_write,
>>>>> .register_mmio_handler = pci_ecam_register_mmio_handler,
>>>>> + .need_p2m_mapping = pci_ecam_need_p2m_mapping,
>>>>> }
>>>>> };
>>>>> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
>>>>> index a89112bfbb7c..c04be636452d 100644
>>>>> --- a/xen/arch/arm/pci/pci-host-common.c
>>>>> +++ b/xen/arch/arm/pci/pci-host-common.c
>>>>> @@ -334,6 +334,28 @@ int pci_host_iterate_bridges(struct domain *d,
>>>>> }
>>>>> return 0;
>>>>> }
>>>>> +
>>>>> +bool pci_host_bridge_need_p2m_mapping(struct domain *d,
>>>>> + const struct dt_device_node *node,
>>>>> + uint64_t addr, uint64_t len)
>>>>> +{
>>>>> + struct pci_host_bridge *bridge;
>>>>> +
>>>>> + list_for_each_entry( bridge, &pci_host_bridges, node )
>>>>> + {
>>>>> + if ( bridge->dt_node != node )
>>>>> + continue;
>>>>> +
>>>>> + if ( !bridge->ops->need_p2m_mapping )
>>>>> + return true;
>>>>> +
>>>>> + return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
>>>>> + }
>>>>> + printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
>>>>> + node->full_name, bridge->segment, addr);
>>>>> + return true;
>>>>> +}
>>>>
>>>> If you really need to map the hostbridges, then I would suggest to defer the P2M mappings for all of them and then walk all the bridge in one go to do the mappings.
>>>>
>>>> This would avoid going through the bridges every time during setup.
>>>
>>> Well, this can be done, but: my implementation prevents mappings and what
>>>
>>> you suggest will require unmapping. So, the cost in my solution is we need
>>>
>>> to go over all the bridges (until we find the one we need) and in what you
>>>
>>> suggest we'll need to unmap what we have just mapped.
>>>
>>> I think preventing unneeded mappings is cheaper than unmapping afterwards.
>>
>> I think you misunderstood what I am suggesting. What I said is you could defer the mappings (IOW not do the mapping) until later for the hostbridges.
>
> For each device tree device we call
>
> static int __init map_range_to_domain(const struct dt_device_node *dev,
> u64 addr, u64 len,
> void *data)
> which will call
>
> res = map_regions_p2mt(d,
> So, ECAM will be mapped and then we'll need to unmap it
Well yes in the current code. But I specifically wrote "defer" as in
adding a check "if hostbridge then return 0".
>
>> And then you can walk all the hostbridges to decide how to map them.
> We don't want to map ECAM, we want to trap it
I know that... But as you wrote above, you want to map other regions of
the hostbridge. So if you defer *all* the mappings for the hostbridge,
then you can walk the hostbridges and decide which regions should
be mapped.
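The two-pass "defer" idea can be modeled like this. All names and types below are hypothetical stubs for illustration; in Xen the first pass would be the check in map_range_to_domain() and the second pass would end in map_regions_p2mt():

```c
#include <assert.h>

/* Stand-in for one "reg"/"ranges" entry of a hostbridge DT node. */
struct region {
    unsigned long addr;
    unsigned long size;
    int is_ecam;            /* true for the configuration-space window */
    int mapped;             /* set when the region is put in the p2m */
};

/*
 * Pass 1: while walking the device tree, skip every hostbridge region
 * ("if hostbridge then return 0") instead of mapping it immediately.
 */
static int need_mapping_now(int is_hostbridge)
{
    return !is_hostbridge;
}

/*
 * Pass 2: once all hostbridges are known, walk their deferred regions
 * and map everything except the ECAM window, which stays unmapped so
 * that accesses to it trap to Xen.  Returns how many regions were mapped.
 */
static unsigned int map_deferred_regions(struct region *regs, unsigned int nr)
{
    unsigned int i, mapped = 0;

    for ( i = 0; i < nr; i++ )
    {
        if ( regs[i].is_ecam )
            continue;       /* leave unmapped: vPCI traps it */
        regs[i].mapped = 1; /* would be map_regions_p2mt() in real code */
        mapped++;
    }

    return mapped;
}
```

This way nothing is mapped and later unmapped: the decision about the ECAM window is simply made once, after all bridges have been discovered.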
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 14:18 ` Julien Grall
@ 2021-09-10 14:38 ` Oleksandr Andrushchenko
2021-09-10 14:52 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 14:38 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
[snip]
>>> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
>>> default.
>>
>> Now it's my time to ask why do you think Xen owns the bridge?
>
> Because nothing in this series indicates either way. So I assumed this should be owned by Xen because it will drive it.
>
> From what you wrote below, it sounds like this is not the case. I think you want to have a design document sent along the series so we can easily know what's the expected design and validate that the code match the agreement.
>
> There was already a design document sent a few months ago. So you may want to refresh it (if needed) and send it again with this series.
>
Please see [1] which is the design document we use to implement PCI passthrough on Arm.
For your convenience:
"
# Problem statement:
[snip]
Only Dom0 and Xen will have access to the real PCI bus, guest will have a
direct access to the assigned device itself. IOMEM memory will be mapped to
the guest and interrupt will be redirected to the guest. SMMU has to be
configured correctly to have DMA transaction."
"
# Discovering PCI Host Bridge in XEN:
In order to support the PCI passthrough XEN should be aware of all the PCI host
bridges available on the system and should be able to access the PCI
configuration space. ECAM configuration access is supported as of now. XEN
during boot will read the PCI device tree node “reg” property and will map the
ECAM space to the XEN memory using the “ioremap_nocache ()” function.
[snip]
When Dom0 tries to access the PCI config space of the device, XEN will find the
corresponding host bridge based on segment number and access the corresponding
config space assigned to that bridge.
Limitation:
* Only PCI ECAM configuration space access is supported.
* Device tree binding is supported as of now, ACPI is not supported.
* Need to port the PCI host bridge access code to XEN to access the
configuration space (generic one works but lots of platforms will required
some specific code or quirks).
"
Unfortunately the document has not been updated since then, but the question
being discussed has not changed in the design: Domain-0 has full access to the bridge,
and Xen traps ECAM. (I am setting dom0less aside, as that was not the target for the first phase.)
Thank you,
Oleksandr
[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 14:38 ` Oleksandr Andrushchenko
@ 2021-09-10 14:52 ` Julien Grall
2021-09-10 15:01 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 14:52 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10/09/2021 15:38, Oleksandr Andrushchenko wrote:
> [snip]
>
>
>>>> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
>>>> default.
>>>
>>> Now it's my time to ask why do you think Xen owns the bridge?
>>
>> Because nothing in this series indicates either way. So I assumed this should be owned by Xen because it will drive it.
>>
>> From what you wrote below, it sounds like this is not the case. I think you want to have a design document sent along the series so we can easily know what's the expected design and validate that the code match the agreement.
>>
>> There was already a design document sent a few months ago. So you may want to refresh it (if needed) and send it again with this series.
>>
> Please see [1] which is the design document we use to implement PCI passthrough on Arm.
Thanks. Can you make sure to at least include a link in your cover
letter next time?
>
> For your convenience:
>
> "
>
> # Problem statement:
> [snip]
>
> Only Dom0 and Xen will have access to the real PCI bus, guest will have a
> direct access to the assigned device itself. IOMEM memory will be mapped to
> the guest and interrupt will be redirected to the guest. SMMU has to be
> configured correctly to have DMA transaction."
>
> "
>
> # Discovering PCI Host Bridge in XEN:
>
> In order to support the PCI passthrough XEN should be aware of all the PCI host
> bridges available on the system and should be able to access the PCI
> configuration space. ECAM configuration access is supported as of now. XEN
> during boot will read the PCI device tree node “reg” property and will map the
> ECAM space to the XEN memory using the “ioremap_nocache ()” function.
>
> [snip]
>
> When Dom0 tries to access the PCI config space of the device, XEN will find the
> corresponding host bridge based on segment number and access the corresponding
> config space assigned to that bridge.
>
> Limitation:
> * Only PCI ECAM configuration space access is supported.
> * Device tree binding is supported as of now, ACPI is not supported.
> * Need to port the PCI host bridge access code to XEN to access the
> configuration space (generic one works but lots of platforms will required
> some specific code or quirks).
>
> "
>
> Unfortunately the document had not been updated since then, but the question
>
> being discussed has not changed in the design: Domain-0 has full access to the bridge,
>
> Xen traps ECAM. (I am taking dom0less aside as that was not the target for the first phase)
Having an updated design document is quite important. This will allow
reviewers to comment easily on the overall approach and speed up the review,
as we can match the code to the agreed approach.
So can this please be updated and sent along with the work?
In addition to that, it feels to me that the commit message should
contain a summary of what you just pasted, as this helps in understanding
the goal and approach of this patch.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 14:52 ` Julien Grall
@ 2021-09-10 15:01 ` Oleksandr Andrushchenko
2021-09-10 15:05 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-10 15:01 UTC (permalink / raw)
To: Julien Grall, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On 10.09.21 17:52, Julien Grall wrote:
>
>
> On 10/09/2021 15:38, Oleksandr Andrushchenko wrote:
>> [snip]
>>
>>
>>>>> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
>>>>> default.
>>>>
>>>> Now it's my time to ask why do you think Xen owns the bridge?
>>>
>>> Because nothing in this series indicates either way. So I assumed this should be owned by Xen because it will drive it.
>>>
>>> From what you wrote below, it sounds like this is not the case. I think you want to have a design document sent along the series so we can easily know what's the expected design and validate that the code match the agreement.
>>>
>>> There was already a design document sent a few months ago. So you may want to refresh it (if needed) and send it again with this series.
>>>
>> Please see [1] which is the design document we use to implement PCI passthrough on Arm.
>
> Thank. Can you make sure to include at least in a link in your cover letter next time?
I will update the commit message to include more description of the design aspects.
>
>>
>> For your convenience:
>>
>> "
>>
>> # Problem statement:
>> [snip]
>>
>> Only Dom0 and Xen will have access to the real PCI bus, guest will have a
>> direct access to the assigned device itself. IOMEM memory will be mapped to
>> the guest and interrupt will be redirected to the guest. SMMU has to be
>> configured correctly to have DMA transaction."
>>
>> "
>>
>> # Discovering PCI Host Bridge in XEN:
>>
>> In order to support the PCI passthrough XEN should be aware of all the PCI host
>> bridges available on the system and should be able to access the PCI
>> configuration space. ECAM configuration access is supported as of now. XEN
>> during boot will read the PCI device tree node “reg” property and will map the
>> ECAM space to the XEN memory using the “ioremap_nocache ()” function.
>>
>> [snip]
>>
>> When Dom0 tries to access the PCI config space of the device, XEN will find the
>> corresponding host bridge based on segment number and access the corresponding
>> config space assigned to that bridge.
>>
>> Limitation:
>> * Only PCI ECAM configuration space access is supported.
>> * Device tree binding is supported as of now, ACPI is not supported.
>> * Need to port the PCI host bridge access code to XEN to access the
>> configuration space (generic one works but lots of platforms will required
>> some specific code or quirks).
>>
>> "
>>
>> Unfortunately the document had not been updated since then, but the question
>>
>> being discussed has not changed in the design: Domain-0 has full access to the bridge,
>>
>> Xen traps ECAM. (I am taking dom0less aside as that was not the target for the first phase)
>
> Having an update design document is quite important. This will allow reviewer to comment easily on overall approach and speed up the review as we can match to the agreed approach.
>
> So can this please be updated and sent along the work?
>
> In addition to that, it feels to me that the commit message should contain a summary of what you just pasted as this helps understanding the goal and approach of this patch.
>
If we are on the same page now, will you be able to review the patch
with respect to the design from the RFC?
> Cheers,
>
Thank you,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 14:01 ` Oleksandr Andrushchenko
2021-09-10 14:18 ` Julien Grall
@ 2021-09-10 15:04 ` Julien Grall
2021-09-10 20:30 ` Stefano Stabellini
1 sibling, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 15:04 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi Oleksandr,
On 10/09/2021 15:01, Oleksandr Andrushchenko wrote:
>
> On 10.09.21 16:18, Julien Grall wrote:
>>
>>
>> On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
>>> Hi, Julien!
>>
>> Hi Oleksandr,
>>
>>> On 09.09.21 20:58, Julien Grall wrote:
>>>> On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>>
>>>>> Host bridge controller's ECAM space is mapped into Domain-0's p2m,
>>>>> thus it is not possible to trap the same for vPCI via MMIO handlers.
>>>>> For this to work we need to not map those while constructing the domain.
>>>>>
>>>>> Note, that during Domain-0 creation there is no pci_dev yet allocated for
>>>>> host bridges, thus we cannot match PCI host and its associated
>>>>> bridge by SBDF. Use dt_device_node field for checks instead.
>>>>>
>>>>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>>>> ---
>>>>> xen/arch/arm/domain_build.c | 3 +++
>>>>> xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
>>>>> xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
>>>>> xen/include/asm-arm/pci.h | 12 ++++++++++++
>>>>> 4 files changed, 54 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>> index da427f399711..76f5b513280c 100644
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
>>>>> }
>>>>> }
>>>>>>>>> + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
>>>>>>>>> + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
>>>>
>>>> AFAICT, with device_get_class(dev), you know whether the hostbridge is used by Xen. Therefore, I would expect that we don't want to map all the regions of the hostbridges in dom0 (including the BARs).
>>>>
>>>> Can you clarify it?
>>> We only want to trap ECAM, not MMIOs and any other memory regions as the bridge is
>>>
>>> initialized and used by Domain-0 completely.
>>
>> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
>> default.
>
> Now it's my time to ask why do you think Xen owns the bridge?
>
> All the bridges are passed to Domain-0, are fully visible to Domain-0, initialized in Domain-0.
>
> Enumeration etc. is done in Domain-0. So how comes that Xen owns the bridge? In what way it does?
>
> Xen just accesses the ECAM when it needs it, but that's it. Xen traps ECAM access - that is true.
>
> But it in no way uses MMIOs etc. of the bridge - those under direct control of Domain-0
So I looked at the snippet of the design document you posted. I think
you are instead referring to a different part:
" PCI-PCIe enumeration is a process of detecting devices connected to
its host.
It is the responsibility of the hardware domain or boot firmware to do
the PCI enumeration and configure the BAR, PCI capabilities, and
MSI/MSI-X configuration."
But I still don't see why it means we have to map the MMIOs right now.
Dom0 should not need to access the MMIOs (aside from the hostbridge
registers) until the BARs are configured.
You can do the mapping when the BAR is accessed. In fact, AFAICT, there
is code to do the identity mapping in vPCI. So to me, the mappings
for the MMIOs are rather pointless, if not confusing.
>>
>> In particular, we may want to hide a device from dom0 for security reasons. This is not going to be possible if you map by default everything to dom0.
>
> Then the bridge will most probably become unusable, as we do not have the relevant drivers in Xen.
>
> At best we may rely on the firmware doing the initialization for us, then yes, we can control an ECAM bridge in Xen.
>
> But this is not the case now: we rely on Domain-0 to initialize and set up the bridge
>
>>
>> Instead, the BARs should be mapped on demand when we trap dom0's access to the configuration space.
>>
>> For other regions, could you provide an example of what you are referring to?
> Registers of the bridge for example, all what is listed in "reg" property
I was more after a real life example.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 15:01 ` Oleksandr Andrushchenko
@ 2021-09-10 15:05 ` Julien Grall
0 siblings, 0 replies; 69+ messages in thread
From: Julien Grall @ 2021-09-10 15:05 UTC (permalink / raw)
To: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel
Cc: sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
Hi Oleksandr,
On 10/09/2021 16:01, Oleksandr Andrushchenko wrote:
>
> On 10.09.21 17:52, Julien Grall wrote:
>>
>>
>> On 10/09/2021 15:38, Oleksandr Andrushchenko wrote:
>>> [snip]
>>>
>>>
>>>>>> What do you mean by "used by Domain-0 completely"? The hostbridge is owned by Xen so I don't think we can let dom0 access any MMIO regions by
>>>>>> default.
>>>>>
>>>>> Now it's my time to ask why do you think Xen owns the bridge?
>>>>
>>>> Because nothing in this series indicates either way. So I assumed this should be owned by Xen because it will drive it.
>>>>
>>>> From what you wrote below, it sounds like this is not the case. I think you want to have a design document sent along the series so we can easily know what's the expected design and validate that the code match the agreement.
>>>>
>>>> There was already a design document sent a few months ago. So you may want to refresh it (if needed) and send it again with this series.
>>>>
>>> Please see [1] which is the design document we use to implement PCI passthrough on Arm.
>>
>> Thanks. Can you make sure to include at least a link in your cover letter next time?
> I will update the commit message to have more description on the design aspects
>>
>>>
>>> For your convenience:
>>>
>>> "
>>>
>>> # Problem statement:
>>> [snip]
>>>
>>> Only Dom0 and Xen will have access to the real PCI bus, guest will have a
>>> direct access to the assigned device itself. IOMEM memory will be mapped to
>>> the guest and interrupt will be redirected to the guest. SMMU has to be
>>> configured correctly to have DMA transaction."
>>>
>>> "
>>>
>>> # Discovering PCI Host Bridge in XEN:
>>>
>>> In order to support the PCI passthrough XEN should be aware of all the PCI host
>>> bridges available on the system and should be able to access the PCI
>>> configuration space. ECAM configuration access is supported as of now. XEN
>>> during boot will read the PCI device tree node “reg” property and will map the
>>> ECAM space to the XEN memory using the “ioremap_nocache ()” function.
>>>
>>> [snip]
>>>
>>> When Dom0 tries to access the PCI config space of the device, XEN will find the
>>> corresponding host bridge based on segment number and access the corresponding
>>> config space assigned to that bridge.
>>>
>>> Limitation:
>>> * Only PCI ECAM configuration space access is supported.
>>> * Device tree binding is supported as of now, ACPI is not supported.
>>> * Need to port the PCI host bridge access code to XEN to access the
>>> configuration space (generic one works but lots of platforms will required
>>> some specific code or quirks).
>>>
>>> "
>>>
>>> Unfortunately the document has not been updated since then, but the question
>>> being discussed has not changed in the design: Domain-0 has full access to the bridge,
>>> and Xen traps ECAM. (I am leaving dom0less aside as that was not the target for the first phase.)
>>
>> Having an updated design document is quite important. This will allow reviewers to comment easily on the overall approach and speed up the review, as we can match the code to the agreed approach.
>>
>> So can this please be updated and sent along the work?
>>
>> In addition to that, it feels to me that the commit message should contain a summary of what you just pasted as this helps understanding the goal and approach of this patch.
>>
> If we are on the same page now, will you be able to review the patch
> with respect to the design from the RFC?
I believe this was already done, as I covered both sides in my review.
Cheers,
--
Julien Grall
* Re: [PATCH 06/11] xen/domain: Call pci_release_devices() when releasing domain resources
2021-09-03 8:33 ` [PATCH 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
@ 2021-09-10 18:45 ` Stefano Stabellini
0 siblings, 0 replies; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-10 18:45 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
rahul.singh
On Fri, 3 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> This is the very same what we do for DT devices. What is more
> that x86 already calls pci_release_devices().
Rewording suggestion:
This is the very same that we already do for DT devices. Moreover, x86
already calls pci_release_devices().
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> ---
> xen/arch/arm/domain.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index d99c653626e4..4e40c4098280 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -985,7 +985,8 @@ static int relinquish_memory(struct domain *d, struct page_list_head *list)
> * function which may return -ERESTART.
> */
> enum {
> - PROG_tee = 1,
> + PROG_pci = 1,
> + PROG_tee,
> PROG_xen,
> PROG_page,
> PROG_mapping,
> @@ -1022,6 +1023,12 @@ int domain_relinquish_resources(struct domain *d)
> #ifdef CONFIG_IOREQ_SERVER
> ioreq_server_destroy_all(d);
> #endif
> +#ifdef CONFIG_HAS_PCI
> + PROGRESS(pci):
> + ret = pci_release_devices(d);
> + if ( ret )
> + return ret;
> +#endif
>
> PROGRESS(tee):
> ret = tee_relinquish_resources(d);
> --
> 2.25.1
>
* Re: [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported
2021-09-03 8:33 ` [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
2021-09-03 10:26 ` Juergen Gross
@ 2021-09-10 19:06 ` Stefano Stabellini
2021-09-13 8:22 ` Oleksandr Andrushchenko
1 sibling, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-10 19:06 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
rahul.singh, Oleksandr Andrushchenko, Ian Jackson, Juergen Gross
On Fri, 3 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> Arm's PCI passthrough implementation doesn't support legacy interrupts,
> only MSI/MSI-X. This can be the case for other platforms too.
> For that reason introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ and add
> it to the CFLAGS and compile the relevant code in the toolstack only if
> applicable.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Juergen Gross <jgross@suse.com>
> ---
> tools/libs/light/Makefile | 4 ++++
> tools/libs/light/libxl_pci.c | 13 +++++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
> index 7d8c51d49242..bd3f6be2a183 100644
> --- a/tools/libs/light/Makefile
> +++ b/tools/libs/light/Makefile
> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
> -Wno-declaration-after-statement -Wformat-nonliteral
> CFLAGS += -I.
>
> +ifeq ($(CONFIG_X86),y)
> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
> +endif
> +
> SRCS-$(CONFIG_X86) += libxl_cpuid.c
> SRCS-$(CONFIG_X86) += libxl_x86.c
> SRCS-$(CONFIG_X86) += libxl_psr.c
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 59f3686fc85e..cd4fea46c3f7 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1434,6 +1434,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> }
> }
> fclose(f);
> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
As Juergen pointed out, the logic is inverted.
I also think we need to come up with a better way to handle this #ifdef
logic in this file.
For instance, could we let this function try to open sysfs_path? I
imagine it would fail, right? If so, we could just have an #ifdef inside
the failure check.
Alternatively, could we have a small #ifdef right here doing:
#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
goto out_no_irq;
#endif
?
Even better, would be to introduce a static inline as follows:
static inline bool pci_supp_legacy_irq(void)
{
#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
return false;
#else
return true;
#endif
}
Then in libxl_pci.c you can avoid all ifdefs:
if (!pci_supp_legacy_irq())
goto out_no_irq;
> sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> pci->bus, pci->dev, pci->func);
> f = fopen(sysfs_path, "r");
> @@ -1460,6 +1461,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> }
> }
> fclose(f);
> +#endif
>
> /* Don't restrict writes to the PCI config space from this VM */
> if (pci->permissive) {
> @@ -1471,7 +1473,9 @@ static void pci_add_dm_done(libxl__egc *egc,
> }
> }
>
> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> out_no_irq:
> +#endif
> if (!isstubdom) {
> if (pci->rdm_policy == LIBXL_RDM_RESERVE_POLICY_STRICT) {
> flag &= ~XEN_DOMCTL_DEV_RDM_RELAXED;
> @@ -1951,7 +1955,9 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
> pci->bus, pci->dev, pci->func);
> FILE *f = fopen(sysfs_path, "r");
> unsigned int start = 0, end = 0, flags = 0, size = 0;
> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> int irq = 0;
> +#endif
I'd let this compile if possible.
> int i;
> if (f == NULL) {
> @@ -1983,6 +1989,7 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
> }
> fclose(f);
> skip1:
> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
Here we could do instead:
if (!pci_supp_legacy_irq())
goto skip_irq;
> sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> pci->bus, pci->dev, pci->func);
> f = fopen(sysfs_path, "r");
> @@ -2001,8 +2008,14 @@ skip1:
> }
> }
> fclose(f);
> +#else
> + /* Silence error: label at end of compound statement */
> + ;
> +#endif
> }
> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> skip_irq:
> +#endif
> rc = 0;
> out_fail:
> pci_remove_detached(egc, prs, rc); /* must be last */
> --
> 2.25.1
>
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-03 8:33 ` [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
2021-09-09 17:43 ` Julien Grall
@ 2021-09-10 20:12 ` Stefano Stabellini
2021-09-14 14:24 ` Oleksandr Andrushchenko
1 sibling, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-10 20:12 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: xen-devel, julien, sstabellini, oleksandr_tyshchenko,
volodymyr_babchuk, Artem_Mygaiev, roger.pau, bertrand.marquis,
rahul.singh, Oleksandr Andrushchenko
On Fri, 3 Sep 2021, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>
> In order for vPCI to work, it needs all accesses to PCI configuration space
> (ECAM) to be synchronized among all entities, e.g. hardware domain and
> guests. For that implement PCI host bridge specific callbacks to
> properly setup those ranges depending on particular host bridge
> implementation.
>
> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> ---
> xen/arch/arm/pci/ecam.c | 11 +++++++++++
> xen/arch/arm/pci/pci-host-common.c | 16 ++++++++++++++++
> xen/arch/arm/vpci.c | 13 +++++++++++++
> xen/include/asm-arm/pci.h | 8 ++++++++
> 4 files changed, 48 insertions(+)
>
> diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
> index 91c691b41fdf..92ecb2e0762b 100644
> --- a/xen/arch/arm/pci/ecam.c
> +++ b/xen/arch/arm/pci/ecam.c
> @@ -42,6 +42,16 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
> return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
> }
>
> +static int pci_ecam_register_mmio_handler(struct domain *d,
> + struct pci_host_bridge *bridge,
> + const struct mmio_handler_ops *ops)
> +{
> + struct pci_config_window *cfg = bridge->sysdata;
> +
> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
> + return 0;
> +}
Given that struct pci_config_window is generic (it is not specific to
one bridge), I wonder if we even need the .register_mmio_handler
callback here.
In fact, pci_host_bridge->sysdata doesn't even need to be a void*, it
could be a struct pci_config_window*, right?
We could simply call:
register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
for each bridge directly from domain_vpci_init ?
> /* ECAM ops */
> const struct pci_ecam_ops pci_generic_ecam_ops = {
> .bus_shift = 20,
> @@ -49,6 +59,7 @@ const struct pci_ecam_ops pci_generic_ecam_ops = {
> .map_bus = pci_ecam_map_bus,
> .read = pci_generic_config_read,
> .write = pci_generic_config_write,
> + .register_mmio_handler = pci_ecam_register_mmio_handler,
> }
> };
>
> diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
> index d2fef5476b8e..a89112bfbb7c 100644
> --- a/xen/arch/arm/pci/pci-host-common.c
> +++ b/xen/arch/arm/pci/pci-host-common.c
> @@ -318,6 +318,22 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev)
> }
> return bridge->dt_node;
> }
> +
> +int pci_host_iterate_bridges(struct domain *d,
> + int (*clb)(struct domain *d,
> + struct pci_host_bridge *bridge))
> +{
> + struct pci_host_bridge *bridge;
> + int err;
> +
> + list_for_each_entry( bridge, &pci_host_bridges, node )
> + {
> + err = clb(d, bridge);
> + if ( err )
> + return err;
> + }
> + return 0;
> +}
> /*
> * Local variables:
> * mode: C
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> index da8b1ca13c07..258134292458 100644
> --- a/xen/arch/arm/vpci.c
> +++ b/xen/arch/arm/vpci.c
> @@ -74,11 +74,24 @@ static const struct mmio_handler_ops vpci_mmio_handler = {
> .write = vpci_mmio_write,
> };
>
> +static int vpci_setup_mmio_handler(struct domain *d,
> + struct pci_host_bridge *bridge)
> +{
> + if ( bridge->ops->register_mmio_handler )
> + return bridge->ops->register_mmio_handler(d, bridge,
> + &vpci_mmio_handler);
> + return 0;
> +}
> +
> int domain_vpci_init(struct domain *d)
> {
> if ( !has_vpci(d) )
> return 0;
>
> + if ( is_hardware_domain(d) )
> + return pci_host_iterate_bridges(d, vpci_setup_mmio_handler);
> +
> + /* Guest domains use what is programmed in their device tree. */
> register_mmio_handler(d, &vpci_mmio_handler,
> GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
>
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 7dc4c8dc9026..2c7c7649e00f 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -17,6 +17,8 @@
> #ifndef __ARM_PCI_H__
> #define __ARM_PCI_H__
>
> +#include <asm/mmio.h>
> +
> #ifdef CONFIG_HAS_PCI
>
> #define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
> @@ -77,6 +79,9 @@ struct pci_ops {
> uint32_t reg, uint32_t len, uint32_t *value);
> int (*write)(struct pci_host_bridge *bridge, uint32_t sbdf,
> uint32_t reg, uint32_t len, uint32_t value);
> + int (*register_mmio_handler)(struct domain *d,
> + struct pci_host_bridge *bridge,
> + const struct mmio_handler_ops *ops);
> };
>
> /*
> @@ -107,6 +112,9 @@ int pci_get_host_bridge_segment(const struct dt_device_node *node,
> uint16_t *segment);
> struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
>
> +int pci_host_iterate_bridges(struct domain *d,
> + int (*clb)(struct domain *d,
> + struct pci_host_bridge *bridge));
> #else /*!CONFIG_HAS_PCI*/
>
> struct arch_pci_dev { };
> --
> 2.25.1
>
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 15:04 ` Julien Grall
@ 2021-09-10 20:30 ` Stefano Stabellini
2021-09-10 21:41 ` Julien Grall
0 siblings, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-10 20:30 UTC (permalink / raw)
To: Julien Grall
Cc: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel,
sstabellini, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh
On Fri, 10 Sep 2021, Julien Grall wrote:
> On 10/09/2021 15:01, Oleksandr Andrushchenko wrote:
> > On 10.09.21 16:18, Julien Grall wrote:
> > > On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
> > > > Hi, Julien!
> > >
> > > Hi Oleksandr,
> > >
> > > > On 09.09.21 20:58, Julien Grall wrote:
> > > > > On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> > > > > > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
> > > > > >
> > > > > > Host bridge controller's ECAM space is mapped into Domain-0's p2m,
> > > > > > thus it is not possible to trap the same for vPCI via MMIO handlers.
> > > > > > For this to work we need to not map those while constructing the
> > > > > > domain.
> > > > > >
> > > > > > Note, that during Domain-0 creation there is no pci_dev yet
> > > > > > allocated for
> > > > > > host bridges, thus we cannot match PCI host and its associated
> > > > > > bridge by SBDF. Use dt_device_node field for checks instead.
> > > > > >
> > > > > > Signed-off-by: Oleksandr Andrushchenko
> > > > > > <oleksandr_andrushchenko@epam.com>
> > > > > > ---
> > > > > > xen/arch/arm/domain_build.c | 3 +++
> > > > > > xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
> > > > > > xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
> > > > > > xen/include/asm-arm/pci.h | 12 ++++++++++++
> > > > > > 4 files changed, 54 insertions(+)
> > > > > >
> > > > > > diff --git a/xen/arch/arm/domain_build.c
> > > > > > b/xen/arch/arm/domain_build.c
> > > > > > index da427f399711..76f5b513280c 100644
> > > > > > --- a/xen/arch/arm/domain_build.c
> > > > > > +++ b/xen/arch/arm/domain_build.c
> > > > > > @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const
> > > > > > struct dt_device_node *dev,
> > > > > > }
> > > > > > }
> > > > > > + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
> > > > > > + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
> > > > >
> > > > > AFAICT, with device_get_class(dev), you know whether the hostbridge is
> > > > > used by Xen. Therefore, I would expect that we don't want to map all
> > > > > the regions of the hostbridges in dom0 (including the BARs).
> > > > >
> > > > > Can you clarify it?
> > > > We only want to trap ECAM, not MMIOs and any other memory regions as the
> > > > bridge is
> > > >
> > > > initialized and used by Domain-0 completely.
> > >
> > > What do you mean by "used by Domain-0 completely"? The hostbridge is owned
> > > by Xen so I don't think we can let dom0 access any MMIO regions by
> > > default.
> >
> > Now it's my time to ask why do you think Xen owns the bridge?
> >
> > All the bridges are passed to Domain-0, are fully visible to Domain-0,
> > initialized in Domain-0.
> >
> > Enumeration etc. is done in Domain-0. So how comes that Xen owns the bridge?
> > In what way it does?
> >
> > Xen just accesses the ECAM when it needs it, but that's it. Xen traps ECAM
> > access - that is true.
> >
> > But it in no way uses MMIOs etc. of the bridge - those under direct control
> > of Domain-0
>
> So I looked on the snipped of the design document you posted. I think you are
> instead referring to a different part:
>
> " PCI-PCIe enumeration is a process of detecting devices connected to its
> host.
> It is the responsibility of the hardware domain or boot firmware to do the PCI
> enumeration and configure the BAR, PCI capabilities, and MSI/MSI-X
> configuration."
>
> But I still don't see why it means we have to map the MMIOs right now. Dom0
> should not need to access the MMIOs (aside the hostbridge registers) until the
> BARs are configured.
This is true especially when we are going to assign a specific PCIe
device to a DomU. In that case, the related MMIO regions are going to be
mapped to the DomU and it would be a bad idea to also keep them mapped
in Dom0. Once we do PCIe assignment, the MMIO region of the PCIe device
being assigned should only be mapped in the DomU, right?
If so, it would be better if the MMIO region in question was never
mapped into Dom0 at all rather than having to unmap it.
With the current approach, given that the entire PCIe aperture is mapped
to Dom0 since boot, we would have to identify the relevant subset region
and unmap it from Dom0 when doing assignment.
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 20:30 ` Stefano Stabellini
@ 2021-09-10 21:41 ` Julien Grall
2021-09-13 6:27 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Julien Grall @ 2021-09-10 21:41 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Oleksandr Andrushchenko, Oleksandr Andrushchenko, xen-devel,
Oleksandr Tyshchenko, Volodymyr Babchuk, Artem Mygaiev,
Roger Pau Monné,
Bertrand Marquis, Rahul Singh
On Fri, 10 Sep 2021, 21:30 Stefano Stabellini, <sstabellini@kernel.org>
wrote:
> On Fri, 10 Sep 2021, Julien Grall wrote:
> > On 10/09/2021 15:01, Oleksandr Andrushchenko wrote:
> > > On 10.09.21 16:18, Julien Grall wrote:
> > > > On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
> > > > > Hi, Julien!
> > > >
> > > > Hi Oleksandr,
> > > >
> > > > > On 09.09.21 20:58, Julien Grall wrote:
> > > > > > On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> > > > > > > From: Oleksandr Andrushchenko <
> oleksandr_andrushchenko@epam.com>
> > > > > > >
> > > > > > > Host bridge controller's ECAM space is mapped into Domain-0's
> p2m,
> > > > > > > thus it is not possible to trap the same for vPCI via MMIO
> handlers.
> > > > > > > For this to work we need to not map those while constructing
> the
> > > > > > > domain.
> > > > > > >
> > > > > > > Note, that during Domain-0 creation there is no pci_dev yet
> > > > > > > allocated for
> > > > > > > host bridges, thus we cannot match PCI host and its associated
> > > > > > > bridge by SBDF. Use dt_device_node field for checks instead.
> > > > > > >
> > > > > > > Signed-off-by: Oleksandr Andrushchenko
> > > > > > > <oleksandr_andrushchenko@epam.com>
> > > > > > > ---
> > > > > > > xen/arch/arm/domain_build.c | 3 +++
> > > > > > > xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
> > > > > > > xen/arch/arm/pci/pci-host-common.c | 22
> ++++++++++++++++++++++
> > > > > > > xen/include/asm-arm/pci.h | 12 ++++++++++++
> > > > > > > 4 files changed, 54 insertions(+)
> > > > > > >
> > > > > > > diff --git a/xen/arch/arm/domain_build.c
> > > > > > > b/xen/arch/arm/domain_build.c
> > > > > > > index da427f399711..76f5b513280c 100644
> > > > > > > --- a/xen/arch/arm/domain_build.c
> > > > > > > +++ b/xen/arch/arm/domain_build.c
> > > > > > > @@ -1257,6 +1257,9 @@ static int __init
> map_range_to_domain(const
> > > > > > > struct dt_device_node *dev,
> > > > > > > }
> > > > > > > }
> > > > > > > + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
> > > > > > > + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
> > > > > >
> > > > > > AFAICT, with device_get_class(dev), you know whether the
> hostbridge is
> > > > > > used by Xen. Therefore, I would expect that we don't want to map
> all
> > > > > > the regions of the hostbridges in dom0 (including the BARs).
> > > > > >
> > > > > > Can you clarify it?
> > > > > We only want to trap ECAM, not MMIOs and any other memory regions
> as the
> > > > > bridge is
> > > > >
> > > > > initialized and used by Domain-0 completely.
> > > >
> > > > What do you mean by "used by Domain-0 completely"? The hostbridge is
> owned
> > > > by Xen so I don't think we can let dom0 access any MMIO regions by
> > > > default.
> > >
> > > Now it's my time to ask why do you think Xen owns the bridge?
> > >
> > > All the bridges are passed to Domain-0, are fully visible to Domain-0,
> > > initialized in Domain-0.
> > >
> > > Enumeration etc. is done in Domain-0. So how comes that Xen owns the
> bridge?
> > > In what way it does?
> > >
> > > Xen just accesses the ECAM when it needs it, but that's it. Xen traps
> ECAM
> > > access - that is true.
> > >
> > > But it in no way uses MMIOs etc. of the bridge - those under direct
> control
> > > of Domain-0
> >
> > So I looked on the snipped of the design document you posted. I think
> you are
> > instead referring to a different part:
> >
> > " PCI-PCIe enumeration is a process of detecting devices connected to its
> > host.
> > It is the responsibility of the hardware domain or boot firmware to do
> the PCI
> > enumeration and configure the BAR, PCI capabilities, and MSI/MSI-X
> > configuration."
> >
> > But I still don't see why it means we have to map the MMIOs right now.
> Dom0
> > should not need to access the MMIOs (aside the hostbridge registers)
> until the
> > BARs are configured.
>
> This is true especially when we are going to assign a specific PCIe
> device to a DomU. In that case, the related MMIO regions are going to be
> mapped to the DomU and it would be a bad idea to also keep them mapped
> in Dom0. Once we do PCIe assignment, the MMIO region of the PCIe device
> being assigned should only be mapped in the DomU, right?
>
That's actually a good point. This is a recipe for disaster because dom0
and the domU may map the BARs with conflicting caching attributes.
So we ought to unmap the BAR from dom0 when the device is assigned to the
domU.
> If so, it would be better if the MMIO region in question was never
> mapped into Dom0 at all rather than having to unmap it.
>
> With the current approach, given that the entire PCIe aperture is mapped
> to Dom0 since boot, we would have to identify the relevant subset region
> and unmap it from Dom0 when doing assignment.
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-10 21:41 ` Julien Grall
@ 2021-09-13 6:27 ` Oleksandr Andrushchenko
2021-09-14 10:03 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-13 6:27 UTC (permalink / raw)
To: Julien Grall, Stefano Stabellini
Cc: Oleksandr Andrushchenko, xen-devel, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis, Rahul Singh
On 11.09.21 00:41, Julien Grall wrote:
>
>
> On Fri, 10 Sep 2021, 21:30 Stefano Stabellini, <sstabellini@kernel.org <mailto:sstabellini@kernel.org>> wrote:
>
> On Fri, 10 Sep 2021, Julien Grall wrote:
> > On 10/09/2021 15:01, Oleksandr Andrushchenko wrote:
> > > On 10.09.21 16:18, Julien Grall wrote:
> > > > On 10/09/2021 13:37, Oleksandr Andrushchenko wrote:
> > > > > Hi, Julien!
> > > >
> > > > Hi Oleksandr,
> > > >
> > > > > On 09.09.21 20:58, Julien Grall wrote:
> > > > > > On 03/09/2021 09:33, Oleksandr Andrushchenko wrote:
> > > > > > > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com <mailto:oleksandr_andrushchenko@epam.com>>
> > > > > > >
> > > > > > > Host bridge controller's ECAM space is mapped into Domain-0's p2m,
> > > > > > > thus it is not possible to trap the same for vPCI via MMIO handlers.
> > > > > > > For this to work we need to not map those while constructing the
> > > > > > > domain.
> > > > > > >
> > > > > > > Note, that during Domain-0 creation there is no pci_dev yet
> > > > > > > allocated for
> > > > > > > host bridges, thus we cannot match PCI host and its associated
> > > > > > > bridge by SBDF. Use dt_device_node field for checks instead.
> > > > > > >
> > > > > > > Signed-off-by: Oleksandr Andrushchenko
> > > > > > > <oleksandr_andrushchenko@epam.com <mailto:oleksandr_andrushchenko@epam.com>>
> > > > > > > ---
> > > > > > > xen/arch/arm/domain_build.c | 3 +++
> > > > > > > xen/arch/arm/pci/ecam.c | 17 +++++++++++++++++
> > > > > > > xen/arch/arm/pci/pci-host-common.c | 22 ++++++++++++++++++++++
> > > > > > > xen/include/asm-arm/pci.h | 12 ++++++++++++
> > > > > > > 4 files changed, 54 insertions(+)
> > > > > > >
> > > > > > > diff --git a/xen/arch/arm/domain_build.c
> > > > > > > b/xen/arch/arm/domain_build.c
> > > > > > > index da427f399711..76f5b513280c 100644
> > > > > > > --- a/xen/arch/arm/domain_build.c
> > > > > > > +++ b/xen/arch/arm/domain_build.c
> > > > > > > @@ -1257,6 +1257,9 @@ static int __init map_range_to_domain(const
> > > > > > > struct dt_device_node *dev,
> > > > > > > }
> > > > > > > }
> > > > > > > + if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
> > > > > > > + need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
> > > > > >
> > > > > > AFAICT, with device_get_class(dev), you know whether the hostbridge is
> > > > > > used by Xen. Therefore, I would expect that we don't want to map all
> > > > > > the regions of the hostbridges in dom0 (including the BARs).
> > > > > >
> > > > > > Can you clarify it?
> > > > > We only want to trap ECAM, not MMIOs and any other memory regions as the
> > > > > bridge is
> > > > >
> > > > > initialized and used by Domain-0 completely.
> > > >
> > > > What do you mean by "used by Domain-0 completely"? The hostbridge is owned
> > > > by Xen so I don't think we can let dom0 access any MMIO regions by
> > > > default.
> > >
> > > Now it's my time to ask why do you think Xen owns the bridge?
> > >
> > > All the bridges are passed to Domain-0, are fully visible to Domain-0,
> > > initialized in Domain-0.
> > >
> > > Enumeration etc. is done in Domain-0. So how comes that Xen owns the bridge?
> > > In what way it does?
> > >
> > > Xen just accesses the ECAM when it needs it, but that's it. Xen traps ECAM
> > > access - that is true.
> > >
> > > But it in no way uses MMIOs etc. of the bridge - those under direct control
> > > of Domain-0
> >
> > So I looked on the snipped of the design document you posted. I think you are
> > instead referring to a different part:
> >
> > " PCI-PCIe enumeration is a process of detecting devices connected to its
> > host.
> > It is the responsibility of the hardware domain or boot firmware to do the PCI
> > enumeration and configure the BAR, PCI capabilities, and MSI/MSI-X
> > configuration."
> >
> > But I still don't see why it means we have to map the MMIOs right now. Dom0
> > should not need to access the MMIOs (aside from the hostbridge registers) until
> > the BARs are configured.
>
> This is true especially when we are going to assign a specific PCIe
> device to a DomU. In that case, the related MMIO regions are going to be
> mapped to the DomU and it would be a bad idea to also keep them mapped
> in Dom0. Once we do PCIe assignment, the MMIO region of the PCIe device
> being assigned should only be mapped in the DomU, right?
>
>
> That's actually a good point. This is a recipe for disaster because dom0 and the domU may map the BARs with conflicting caching attributes.
>
> So we ought to unmap the BAR from dom0 when the device is assigned to the domU.
1. Yes, currently we map MMIOs to Dom0 from the beginning (the whole aperture actually)
2. When a PCIe device is assigned to a DomU we unmap the relevant MMIOs
from Dom0 and map them to the DomU
>
> If so, it would be better if the MMIO region in question was never
> mapped into Dom0 at all rather than having to unmap it.
>
Ok, I'll do that
>
>
> With the current approach, given that the entire PCIe aperture is mapped
> to Dom0 since boot, we would have to identify the relevant subset region
> and unmap it from Dom0 when doing assignment.
>
It is already in vPCI code (with non-identity mappings in the PCI devices passthrough on Arm, part 3)
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported
2021-09-10 19:06 ` Stefano Stabellini
@ 2021-09-13 8:22 ` Oleksandr Andrushchenko
0 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-13 8:22 UTC (permalink / raw)
To: Stefano Stabellini, Oleksandr Andrushchenko
Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
Ian Jackson, Juergen Gross
Hi, Stefano!
On 10.09.21 22:06, Stefano Stabellini wrote:
> On Fri, 3 Sep 2021, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>>
>> Arm's PCI passthrough implementation doesn't support legacy interrupts,
>> only MSI/MSI-X. This can be the case for other platforms too.
>> For that reason, introduce a new CONFIG_PCI_SUPP_LEGACY_IRQ option, add
>> it to the CFLAGS, and compile the relevant code in the toolstack only if
>> applicable.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
>> Cc: Ian Jackson <iwj@xenproject.org>
>> Cc: Juergen Gross <jgross@suse.com>
>> ---
>> tools/libs/light/Makefile | 4 ++++
>> tools/libs/light/libxl_pci.c | 13 +++++++++++++
>> 2 files changed, 17 insertions(+)
>>
>> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
>> index 7d8c51d49242..bd3f6be2a183 100644
>> --- a/tools/libs/light/Makefile
>> +++ b/tools/libs/light/Makefile
>> @@ -46,6 +46,10 @@ CFLAGS += -Wno-format-zero-length -Wmissing-declarations \
>> -Wno-declaration-after-statement -Wformat-nonliteral
>> CFLAGS += -I.
>>
>> +ifeq ($(CONFIG_X86),y)
>> +CFLAGS += -DCONFIG_PCI_SUPP_LEGACY_IRQ
>> +endif
>> +
>> SRCS-$(CONFIG_X86) += libxl_cpuid.c
>> SRCS-$(CONFIG_X86) += libxl_x86.c
>> SRCS-$(CONFIG_X86) += libxl_psr.c
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 59f3686fc85e..cd4fea46c3f7 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1434,6 +1434,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>> }
>> }
>> fclose(f);
>> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> As Juergen pointed out the logic is inverted.
>
> I also think we need to come up with a better way to handle this #ifdef
> logic in this file.
>
> For instance, could we let this function try to open sysfs_path? I
> imagine it would fail, right? If so, we could just have an #ifdef inside
> the failure check.
>
> Alternatively, could we have a small #ifdef right here doing:
>
> #ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> goto out_no_irq;
> #endif
>
> ?
>
>
> Even better, would be to introduce a static inline as follows:
>
> static inline bool pci_supp_legacy_irq(void)
> {
> #ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> return false;
> #else
> return true;
> #endif
> }
>
> Then in libxl_pci.c you can avoid all ifdefs:
>
> if (!pci_supp_legacy_irq())
> goto out_no_irq;
This one seems to be the best, with the least ifdef'ery.
I'll re-work the code as above.
>
>
>> sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> pci->bus, pci->dev, pci->func);
>> f = fopen(sysfs_path, "r");
>> @@ -1460,6 +1461,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>> }
>> }
>> fclose(f);
>> +#endif
>>
>> /* Don't restrict writes to the PCI config space from this VM */
>> if (pci->permissive) {
>> @@ -1471,7 +1473,9 @@ static void pci_add_dm_done(libxl__egc *egc,
>> }
>> }
>>
>> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
>> out_no_irq:
>> +#endif
>> if (!isstubdom) {
>> if (pci->rdm_policy == LIBXL_RDM_RESERVE_POLICY_STRICT) {
>> flag &= ~XEN_DOMCTL_DEV_RDM_RELAXED;
>> @@ -1951,7 +1955,9 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>> pci->bus, pci->dev, pci->func);
>> FILE *f = fopen(sysfs_path, "r");
>> unsigned int start = 0, end = 0, flags = 0, size = 0;
>> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
>> int irq = 0;
>> +#endif
> I'd let this compile if possible.
>
>
>> int i;
>> if (f == NULL) {
>> @@ -1983,6 +1989,7 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>> }
>> fclose(f);
>> skip1:
>> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
> Here we could do instead:
>
> if (!pci_supp_legacy_irq())
> goto skip_irq;
>
>
>> sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> pci->bus, pci->dev, pci->func);
>> f = fopen(sysfs_path, "r");
>> @@ -2001,8 +2008,14 @@ skip1:
>> }
>> }
>> fclose(f);
>> +#else
>> + /* Silence error: label at end of compound statement */
>> + ;
>> +#endif
>> }
>> +#ifndef CONFIG_PCI_SUPP_LEGACY_IRQ
>> skip_irq:
>> +#endif
>> rc = 0;
>> out_fail:
>> pci_remove_detached(egc, prs, rc); /* must be last */
>> --
>> 2.25.1
>>
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-13 6:27 ` Oleksandr Andrushchenko
@ 2021-09-14 10:03 ` Oleksandr Andrushchenko
2021-09-15 0:36 ` Stefano Stabellini
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-14 10:03 UTC (permalink / raw)
To: Julien Grall, Stefano Stabellini
Cc: Oleksandr Andrushchenko, xen-devel, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis, Rahul Singh
>
>>
>> If so, it would be better if the MMIO region in question was never
>> mapped into Dom0 at all rather than having to unmap it.
>>
> Ok, I'll do that
Sorry for pasting quite a big patch here, but I feel I need clarification on
whether this is the way we want it. Please note I had to modify setup.h.
From 6eee96bc046efb41ec25f87362b1f6e973a4e6c2 Mon Sep 17 00:00:00 2001
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Date: Tue, 14 Sep 2021 12:14:43 +0300
Subject: [PATCH] Fixes: a57dc84da5fd ("xen/arm: Do not map PCI ECAM space to
Domain-0's p2m")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
xen/arch/arm/domain_build.c | 37 +++++++++++++--------
xen/arch/arm/pci/ecam.c | 11 +++----
xen/arch/arm/pci/pci-host-common.c | 53 ++++++++++++++++++++++--------
xen/include/asm-arm/pci.h | 18 ++++------
xen/include/asm-arm/setup.h | 13 ++++++++
5 files changed, 86 insertions(+), 46 deletions(-)
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 76f5b513280c..b4bfda9d5b5a 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -10,7 +10,6 @@
#include <asm/regs.h>
#include <xen/errno.h>
#include <xen/err.h>
-#include <xen/device_tree.h>
#include <xen/libfdt/libfdt.h>
#include <xen/guest_access.h>
#include <xen/iocap.h>
@@ -47,12 +46,6 @@ static int __init parse_dom0_mem(const char *s)
}
custom_param("dom0_mem", parse_dom0_mem);
-struct map_range_data
-{
- struct domain *d;
- p2m_type_t p2mt;
-};
-
/* Override macros from asm/page.h to make them work with mfn_t */
#undef virt_to_mfn
#define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
@@ -1228,9 +1221,8 @@ static int __init map_dt_irq_to_domain(const struct dt_device_node *dev,
return 0;
}
-static int __init map_range_to_domain(const struct dt_device_node *dev,
- u64 addr, u64 len,
- void *data)
+int __init map_range_to_domain(const struct dt_device_node *dev,
+ u64 addr, u64 len, void *data)
{
struct map_range_data *mr_data = data;
struct domain *d = mr_data->d;
@@ -1257,8 +1249,10 @@ static int __init map_range_to_domain(const struct dt_device_node *dev,
}
}
- if ( need_mapping && (device_get_class(dev) == DEVICE_PCI) )
- need_mapping = pci_host_bridge_need_p2m_mapping(d, dev, addr, len);
+#ifdef CONFIG_HAS_PCI
+ if ( (device_get_class(dev) == DEVICE_PCI) && !mr_data->map_pci_bridge )
+ need_mapping = false;
+#endif
if ( need_mapping )
{
@@ -1293,7 +1287,11 @@ static int __init map_device_children(struct domain *d,
const struct dt_device_node *dev,
p2m_type_t p2mt)
{
- struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
+ struct map_range_data mr_data = {
+ .d = d,
+ .p2mt = p2mt,
+ .map_pci_bridge = false
+ };
int ret;
if ( dt_device_type_is_equal(dev, "pci") )
@@ -1425,7 +1423,11 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
/* Give permission and map MMIOs */
for ( i = 0; i < naddr; i++ )
{
- struct map_range_data mr_data = { .d = d, .p2mt = p2mt };
+ struct map_range_data mr_data = {
+ .d = d,
+ .p2mt = p2mt,
+ .map_pci_bridge = false
+ };
res = dt_device_get_address(dev, i, &addr, &size);
if ( res )
{
@@ -2594,7 +2596,14 @@ static int __init construct_dom0(struct domain *d)
return rc;
if ( acpi_disabled )
+ {
rc = prepare_dtb_hwdom(d, &kinfo);
+#ifdef CONFIG_HAS_PCI
+ if ( rc < 0 )
+ return rc;
+ rc = pci_host_bridge_mappings(d, p2m_mmio_direct_c);
+#endif
+ }
else
rc = prepare_acpi(d, &kinfo);
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index d32efb7fcbd0..e08b9c6909b6 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -52,17 +52,14 @@ static int pci_ecam_register_mmio_handler(struct domain *d,
return 0;
}
-static int pci_ecam_need_p2m_mapping(struct domain *d,
- struct pci_host_bridge *bridge,
- uint64_t addr, uint64_t len)
+static bool pci_ecam_need_p2m_mapping(struct domain *d,
+ struct pci_host_bridge *bridge,
+ uint64_t addr)
{
struct pci_config_window *cfg = bridge->sysdata;
- if ( !is_hardware_domain(d) )
- return true;
-
/*
- * We do not want ECAM address space to be mapped in domain's p2m,
+ * We do not want ECAM address space to be mapped in Domain-0's p2m,
* so we can trap access to it.
*/
return cfg->phys_addr != addr;
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index c04be636452d..74077dec8c72 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -25,6 +25,7 @@
#include <xen/init.h>
#include <xen/pci.h>
#include <asm/pci.h>
+#include <asm/setup.h>
#include <xen/rwlock.h>
#include <xen/sched.h>
#include <xen/vmap.h>
@@ -335,25 +336,51 @@ int pci_host_iterate_bridges(struct domain *d,
return 0;
}
-bool pci_host_bridge_need_p2m_mapping(struct domain *d,
- const struct dt_device_node *node,
- uint64_t addr, uint64_t len)
+int __init pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt)
{
struct pci_host_bridge *bridge;
+ struct map_range_data mr_data = {
+ .d = d,
+ .p2mt = p2mt,
+ .map_pci_bridge = true
+ };
+ /*
+ * For each PCI host bridge we need to only map those ranges
+ * which are used by Domain-0 to properly initialize the bridge,
+ * e.g. we do not want to map ECAM configuration space which lives in
+ * "reg" or "assigned-addresses" device tree property.
+ * Nor do we want to map any of the MMIO ranges found in the "ranges"
+ * device tree property.
+ */
list_for_each_entry( bridge, &pci_host_bridges, node )
{
- if ( bridge->dt_node != node )
- continue;
-
- if ( !bridge->ops->need_p2m_mapping )
- return true;
-
- return bridge->ops->need_p2m_mapping(d, bridge, addr, len);
+ const struct dt_device_node *dev = bridge->dt_node;
+ int i;
+
+ for ( i = 0; i < dt_number_of_address(dev); i++ )
+ {
+ uint64_t addr, size;
+ int err;
+
+ err = dt_device_get_address(dev, i, &addr, &size);
+ if ( err )
+ {
+ printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
+ i, dt_node_full_name(dev));
+ return err;
+ }
+
+ if ( bridge->ops->need_p2m_mapping(d, bridge, addr) )
+ {
+ err = map_range_to_domain(dev, addr, size, &mr_data);
+ if ( err )
+ return err;
+ }
+ }
}
- printk(XENLOG_ERR "Unable to find PCI bridge for %s segment %d, addr %lx\n",
- node->full_name, bridge->segment, addr);
- return true;
+
+ return 0;
}
/*
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 9c28a4bdc4b7..97fbaac01370 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -21,6 +21,8 @@
#ifdef CONFIG_HAS_PCI
+#include <asm/p2m.h>
+
#define pci_to_dev(pcidev) (&(pcidev)->arch.dev)
#define PRI_pci "%04x:%02x:%02x.%u"
@@ -82,8 +84,9 @@ struct pci_ops {
int (*register_mmio_handler)(struct domain *d,
struct pci_host_bridge *bridge,
const struct mmio_handler_ops *ops);
- int (*need_p2m_mapping)(struct domain *d, struct pci_host_bridge *bridge,
- uint64_t addr, uint64_t len);
+ bool (*need_p2m_mapping)(struct domain *d,
+ struct pci_host_bridge *bridge,
+ uint64_t addr);
};
/*
@@ -117,19 +120,10 @@ struct dt_device_node *pci_find_host_bridge_node(struct device *dev);
int pci_host_iterate_bridges(struct domain *d,
int (*clb)(struct domain *d,
struct pci_host_bridge *bridge));
-bool pci_host_bridge_need_p2m_mapping(struct domain *d,
- const struct dt_device_node *node,
- uint64_t addr, uint64_t len);
+int pci_host_bridge_mappings(struct domain *d, p2m_type_t p2mt);
#else /*!CONFIG_HAS_PCI*/
struct arch_pci_dev { };
-static inline bool
-pci_host_bridge_need_p2m_mapping(struct domain *d,
- const struct dt_device_node *node,
- uint64_t addr, uint64_t len)
-{
- return true;
-}
#endif /*!CONFIG_HAS_PCI*/
#endif /* __ARM_PCI_H__ */
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index c4b6af602995..a1c31c0bb024 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -2,6 +2,8 @@
#define __ARM_SETUP_H_
#include <public/version.h>
+#include <asm/p2m.h>
+#include <xen/device_tree.h>
#define MIN_FDT_ALIGN 8
#define MAX_FDT_SIZE SZ_2M
@@ -76,6 +78,14 @@ struct bootinfo {
#endif
};
+struct map_range_data
+{
+ struct domain *d;
+ p2m_type_t p2mt;
+ /* Set if mappings for PCI host bridges must not be skipped. */
+ bool map_pci_bridge;
+};
+
extern struct bootinfo bootinfo;
extern domid_t max_init_domid;
@@ -123,6 +133,9 @@ void device_tree_get_reg(const __be32 **cell, u32 address_cells,
u32 device_tree_get_u32(const void *fdt, int node,
const char *prop_name, u32 dflt);
+int map_range_to_domain(const struct dt_device_node *dev,
+ u64 addr, u64 len, void *data);
+
#endif
/*
* Local variables:
--
2.25.1
With the patch above I have the following log in Domain-0:
generic-armv8-xt-dom0 login: root
root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
(XEN) ==== PCI devices ====
(XEN) ==== segment 0000 ====
(XEN) 0000:03:00.0 - d0 - node -1
(XEN) 0000:02:02.0 - d0 - node -1
(XEN) 0000:02:01.0 - d0 - node -1
(XEN) 0000:02:00.0 - d0 - node -1
(XEN) 0000:01:00.0 - d0 - node -1
(XEN) 0000:00:00.0 - d0 - node -1
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
root@generic-armv8-xt-dom0:~# modprobe e1000e
[ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
[ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
(XEN) map [e0000, e001f] -> 0xe0000 for d0
(XEN) map [e0020, e003f] -> 0xe0020 for d0
[ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 46.189668] pci_msi_setup_msi_irqs
[ 46.191016] nwl_compose_msi_msg msg at fe440000
(XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
[ 46.200455] Unhandled fault at 0xffff800010fa5000
[snip]
[ 46.233079] Call trace:
[ 46.233559] __pci_write_msi_msg+0x70/0x180
[ 46.234149] pci_msi_domain_write_msg+0x28/0x30
[ 46.234869] msi_domain_activate+0x5c/0x88
From the above you can see that BARs are mapped for Domain-0 now
only when an assigned PCI device gets enabled in Domain-0.
Another thing to note is that we crash on MSI-X access, as BARs are mapped
to the domain while enabling memory decoding in the COMMAND register,
but MSI-X is handled differently, e.g. we have MSI-X holes in the mappings.
This is because before this change the whole PCI aperture was mapped into
Domain-0, and now it is not. Thus, MSI-X holes are left unmapped now; before,
there was no need to handle them specially, as they were always mapped into
Domain-0 and emulated for guests.
Please note that one cannot use xl pci-attach in this case to attach the PCI device
in question to Domain-0, as (please see the log) that device is already attached.
Nor can it be detached and re-attached. So, without mapping the MSI-X holes for
Domain-0 the device becomes unusable in Domain-0. At the same time, the device
can be successfully passed through to a DomU.
Julien, Stefano! Please let me know how we can proceed with this.
Thank you,
Oleksandr
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 13:40 ` Oleksandr Andrushchenko
@ 2021-09-14 13:47 ` Oleksandr Andrushchenko
2021-09-15 0:25 ` Stefano Stabellini
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-14 13:47 UTC (permalink / raw)
To: Julien Grall, sstabellini
Cc: Oleksandr Tyshchenko, Volodymyr Babchuk, Artem Mygaiev,
roger.pau, Bertrand Marquis, Rahul Singh, xen-devel,
Oleksandr Andrushchenko
>> What you want to know is how many times register_mmio_handler() will be called from domain_vpci_init().
>>
>> You introduced a function pci_host_iterate_bridges() that will walk over the bridges and then call the callback vpci_setup_mmio_handler(). So you could introduce a new callback that will return 1 if bridge->ops->register_mmio_handler is not NULL or 0.
>
> Ok, clear. Something like:
>
> if ( (rc = domain_vgic_register(d, &count)) != 0 )
> goto fail;
>
> *find out how many bridges and update count*
>
>
> if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
> goto fail;
>
I have the following code now:
int domain_vpci_get_num_mmio_handlers(struct domain *d)
{
    int count;

    if ( is_hardware_domain(d) )
        /* For each PCI host bridge's configuration space. */
        count += pci_host_get_num_bridges();
    else
        /*
         * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
         * being passed through. Maximum number of supported devices
         * is 32 as virtual bus topology emulates the devices as embedded
         * endpoints.
         * +1 for a single emulated host bridge's configuration space.
         */
        count = VPCI_MSIX_MEM_NUM * 32 + 1;

    return count;
}
Please note that we cannot tell how many PCIe devices are going to be passed through.
So, worst case for a DomU is going to be 65 on top of what we already have...
This sounds a bit scary, as most probably we won't pass through 32 devices most of the
time, but it will make d->arch.vmmio.handlers almost 4 times bigger than it is now.
This may have an influence on the MMIO handlers' performance...
Thanks,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-10 20:12 ` Stefano Stabellini
@ 2021-09-14 14:24 ` Oleksandr Andrushchenko
2021-09-15 5:30 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-14 14:24 UTC (permalink / raw)
To: Stefano Stabellini, Rahul Singh
Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis,
Oleksandr Andrushchenko
}
>>
>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>> + struct pci_host_bridge *bridge,
>> + const struct mmio_handler_ops *ops)
>> +{
>> + struct pci_config_window *cfg = bridge->sysdata;
>> +
>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>> + return 0;
>> +}
> Given that struct pci_config_window is generic (it is not specific to
> one bridge), I wonder if we even need the .register_mmio_handler
> callback here.
>
> In fact, pci_host_bridge->sysdata doesn't even need to be a void*, it
> could be a struct pci_config_window*, right?
Rahul, this actually may change your series.
Do you think we can do that?
>
> We could simply call:
>
> register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>
> for each bridge directly from domain_vpci_init ?
If Rahul changes the API then we can probably do that.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-14 13:47 ` Oleksandr Andrushchenko
@ 2021-09-15 0:25 ` Stefano Stabellini
2021-09-15 4:50 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-15 0:25 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: Julien Grall, sstabellini, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, roger.pau, Bertrand Marquis,
Rahul Singh, xen-devel, Oleksandr Andrushchenko
[-- Attachment #1: Type: text/plain, Size: 2232 bytes --]
On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
> >> What you want to know is how many times register_mmio_handler() will be called from domain_vpci_init().
> >>
> >> You introduced a function pci_host_iterate_bridges() that will walk over the bridges and then call the callback vpci_setup_mmio_handler(). So you could introduce a new callback that will return 1 if bridge->ops->register_mmio_handler is not NULL or 0.
> >
> > Ok, clear. Something like:
> >
> > if ( (rc = domain_vgic_register(d, &count)) != 0 )
> > goto fail;
> >
> > *find out how many bridges and update count*
> >
> >
> > if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
> > goto fail;
> >
> I have the following code now:
>
> int domain_vpci_get_num_mmio_handlers(struct domain *d)
> {
> int count;
count is incremented but not initialized
> if ( is_hardware_domain(d) )
> /* For each PCI host bridge's configuration space. */
> count += pci_host_get_num_bridges();
> else
> /*
> * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
> * being passed through. Maximum number of supported devices
> * is 32 as virtual bus topology emulates the devices as embedded
> * endpoints.
> * +1 for a single emulated host bridge's configuration space. */
> count = VPCI_MSIX_MEM_NUM * 32 + 1;
> return count;
> }
>
> Please note that we cannot tell how many PCIe devices are going to be passed through.
>
> So, worst case for a DomU is going to be 65 on top of what we already have...
>
> This sounds a bit scary, as most probably we won't pass through 32 devices most of
> the time, but it will make d->arch.vmmio.handlers almost 4 times bigger than it is now.
>
> This may have an influence on the MMIO handlers' performance...
I am OK with that given that it doesn't affect performance until you
actually start creating too many virtual devices for the DomU. In other
words, find_mmio_handler restricts the search to vmmio->num_entries, so
as long as most entries are allocated but unused, we should be fine.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-14 10:03 ` Oleksandr Andrushchenko
@ 2021-09-15 0:36 ` Stefano Stabellini
2021-09-15 5:35 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-15 0:36 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: Julien Grall, Stefano Stabellini, Oleksandr Andrushchenko,
xen-devel, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis, Rahul Singh
[-- Attachment #1: Type: text/plain, Size: 3132 bytes --]
On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
> With the patch above I have the following log in Domain-0:
>
> generic-armv8-xt-dom0 login: root
> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
> (XEN) ==== PCI devices ====
> (XEN) ==== segment 0000 ====
> (XEN) 0000:03:00.0 - d0 - node -1
> (XEN) 0000:02:02.0 - d0 - node -1
> (XEN) 0000:02:01.0 - d0 - node -1
> (XEN) 0000:02:00.0 - d0 - node -1
> (XEN) 0000:01:00.0 - d0 - node -1
> (XEN) 0000:00:00.0 - d0 - node -1
> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
>
> root@generic-armv8-xt-dom0:~# modprobe e1000e
> [ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
> [ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> [ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
> (XEN) map [e0000, e001f] -> 0xe0000 for d0
> (XEN) map [e0020, e003f] -> 0xe0020 for d0
> [ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> [ 46.189668] pci_msi_setup_msi_irqs
> [ 46.191016] nwl_compose_msi_msg msg at fe440000
> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
> [ 46.200455] Unhandled fault at 0xffff800010fa5000
>
> [snip]
>
> [ 46.233079] Call trace:
> [ 46.233559] __pci_write_msi_msg+0x70/0x180
> [ 46.234149] pci_msi_domain_write_msg+0x28/0x30
> [ 46.234869] msi_domain_activate+0x5c/0x88
>
> From the above you can see that BARs are mapped for Domain-0 now
> only when an assigned PCI device gets enabled in Domain-0.
>
> Another thing to note is that we crash on MSI-X access as BARs are mapped
> to the domain while enabling memory decoding in the COMMAND register,
> but MSI-X are handled differently, e.g. we have MSI-X holes in the mappings.
>
> This is because before this change the whole PCI aperture was mapped into
> Domain-0 and it is not. Thus, MSI-X holes are left unmapped now and there
> was no need to do so, e.g. they were always mapped into Domain-0 and
> emulated for guests.
>
> Please note that one cannot use xl pci-attach in this case to attach the PCI device
> in question to Domain-0 as (please see the log) that device is already attached.
>
> Neither it can be detached and re-attached. So, without mapping MSI-X holes for
> Domain-0 the device becomes unusable in Domain-0. At the same time the device
> can be successfully passed to DomU.
>
> Julien, Stefano! Please let me know how can we proceed with this.
What was the plan for MSI-X in Dom0?
Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than
a physical-ITS and physical-GIC, I imagine that it wasn't correct for
Dom0 to write to the real MSI-X table directly?
Shouldn't Dom0 get emulated MSI-X tables like any DomU?
Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then
I would suggest to map them at the same time as the BARs. But I am
thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X
tables.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-15 0:25 ` Stefano Stabellini
@ 2021-09-15 4:50 ` Oleksandr Andrushchenko
0 siblings, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-15 4:50 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Julien Grall, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Rahul Singh,
xen-devel, Oleksandr Andrushchenko
On 15.09.21 03:25, Stefano Stabellini wrote:
> On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
>>>> What you want to know is how many times register_mmio_handler() will be called from domain_vpci_init().
>>>>
>>>> You introduced a function pci_host_iterate_bridges() that will walk over the bridges and then call the callback vpci_setup_mmio_handler(). So you could introduce a new callback that will return 1 if bridge->ops->register_mmio_handler is not NULL or 0.
>>> Ok, clear. Something like:
>>>
>>> if ( (rc = domain_vgic_register(d, &count)) != 0 )
>>> goto fail;
>>>
>>> *find out how many bridges and update count*
>>>
>>>
>>> if ( (rc = domain_io_init(d, count + MAX_IO_HANDLER)) != 0 )
>>> goto fail;
>>>
>> I have the following code now:
>>
>> int domain_vpci_get_num_mmio_handlers(struct domain *d)
>> {
>> int count;
> count is incremented but not initialized
Excessive cleanup before sending ;)
>
>
>> if ( is_hardware_domain(d) )
>> /* For each PCI host bridge's configuration space. */
>> count += pci_host_get_num_bridges();
>> else
>> /*
>> * VPCI_MSIX_MEM_NUM handlers for MSI-X tables per each PCI device
>> * being passed through. Maximum number of supported devices
>> * is 32 as virtual bus topology emulates the devices as embedded
>> * endpoints.
>> * +1 for a single emulated host bridge's configuration space. */
>> count = VPCI_MSIX_MEM_NUM * 32 + 1;
>> return count;
>> }
>>
>> Please note that we cannot tell how many PCIe devices are going to be passed through.
>>
>> So, worst case for a DomU is going to be 65 on top of what we already have...
>>
>> This sounds a bit scary, as most probably we won't pass through 32 devices most of
>> the time, but it will make d->arch.vmmio.handlers almost 4 times bigger than it is now.
>>
>> This may have an influence on the MMIO handlers' performance...
> I am OK with that given that it doesn't affect performance until you
> actually start creating too many virtual devices for the DomU. In other
> words, find_mmio_handler restricts the search to vmmio->num_entries, so
> as long as most entries are allocated but unused, we should be fine.
Ok, fine, so I'll have this change as above in v2.
Thanks,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-14 14:24 ` Oleksandr Andrushchenko
@ 2021-09-15 5:30 ` Oleksandr Andrushchenko
2021-09-15 10:45 ` Rahul Singh
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-15 5:30 UTC (permalink / raw)
To: Rahul Singh
Cc: xen-devel, julien, Stefano Stabellini, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, roger.pau, Bertrand Marquis,
Oleksandr Andrushchenko
Hi, Rahul!
On 14.09.21 17:24, Oleksandr Andrushchenko wrote:
>
> }
>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>> + struct pci_host_bridge *bridge,
>>> + const struct mmio_handler_ops *ops)
>>> +{
>>> + struct pci_config_window *cfg = bridge->sysdata;
>>> +
>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>> + return 0;
>>> +}
>> Given that struct pci_config_window is generic (it is not specific to
>> one bridge), I wonder if we even need the .register_mmio_handler
>> callback here.
>>
>> In fact, pci_host_bridge->sysdata doesn't even need to be a void*, it
>> could be a struct pci_config_window*, right?
>
> Rahul, this actually may change your series.
>
> Do you think we can do that?
>
This is the only requested change that is left unanswered so far.
Would it be possible for you to change the API accordingly, so I can
implement what Stefano suggests?
Thanks,
Oleksandr
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-15 0:36 ` Stefano Stabellini
@ 2021-09-15 5:35 ` Oleksandr Andrushchenko
2021-09-15 16:42 ` Rahul Singh
2021-09-15 20:09 ` Stefano Stabellini
0 siblings, 2 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-15 5:35 UTC (permalink / raw)
To: Stefano Stabellini, Rahul Singh
Cc: Julien Grall, Oleksandr Andrushchenko, xen-devel,
Oleksandr Tyshchenko, Volodymyr Babchuk, Artem Mygaiev,
Roger Pau Monné,
Bertrand Marquis
Hi, Stefano, Rahul!
On 15.09.21 03:36, Stefano Stabellini wrote:
> On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
>> With the patch above I have the following log in Domain-0:
>>
>> generic-armv8-xt-dom0 login: root
>> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
>> (XEN) ==== PCI devices ====
>> (XEN) ==== segment 0000 ====
>> (XEN) 0000:03:00.0 - d0 - node -1
>> (XEN) 0000:02:02.0 - d0 - node -1
>> (XEN) 0000:02:01.0 - d0 - node -1
>> (XEN) 0000:02:00.0 - d0 - node -1
>> (XEN) 0000:01:00.0 - d0 - node -1
>> (XEN) 0000:00:00.0 - d0 - node -1
>> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
>>
>> root@generic-armv8-xt-dom0:~# modprobe e1000e
>> [ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
>> [ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
>> [ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
>> (XEN) map [e0000, e001f] -> 0xe0000 for d0
>> (XEN) map [e0020, e003f] -> 0xe0020 for d0
>> [ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
>> [ 46.189668] pci_msi_setup_msi_irqs
>> [ 46.191016] nwl_compose_msi_msg msg at fe440000
>> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
>> [ 46.200455] Unhandled fault at 0xffff800010fa5000
>>
>> [snip]
>>
>> [ 46.233079] Call trace:
>> [ 46.233559] __pci_write_msi_msg+0x70/0x180
>> [ 46.234149] pci_msi_domain_write_msg+0x28/0x30
>> [ 46.234869] msi_domain_activate+0x5c/0x88
>>
>> From the above you can see that BARs are mapped for Domain-0 now
>> only when an assigned PCI device gets enabled in Domain-0.
>>
>> Another thing to note is that we crash on MSI-X access, as BARs are mapped
>> to the domain while enabling memory decoding in the COMMAND register,
>> but MSI-X is handled differently, i.e. we have MSI-X holes in the mappings.
>> This is because before this change the whole PCI aperture was mapped into
>> Domain-0, and now it is not. Thus, MSI-X holes are left unmapped now; before,
>> there was no need for that, as they were always mapped into Domain-0 and
>> emulated for guests.
>>
>> Please note that one cannot use xl pci-attach in this case to attach the PCI device
>> in question to Domain-0 as (please see the log) that device is already attached.
>> Nor can it be detached and re-attached. So, without mapping MSI-X holes for
>> Domain-0, the device becomes unusable in Domain-0. At the same time the device
>> can be successfully passed to a DomU.
>>
>> Julien, Stefano! Please let me know how we can proceed with this.
> What was the plan for MSI-X in Dom0?
It just worked because we mapped everything
>
> Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than
> a physical-ITS and physical-GIC, I imagine that it wasn't correct for
> Dom0 to write to the real MSI-X table directly?
>
> Shouldn't Dom0 get emulated MSI-X tables like any DomU?
>
> Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then
> I would suggest to map them at the same time as the BARs. But I am
> thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X
> tables.
Yes, it seems more than reasonable to enable emulation for Domain-0
as well. Other than that, Stefano, do you think we are good to go with
the changes I made in order to unmap everything for Domain-0?
Rahul, it seems we will need a change to vPCI/MSI-X so we can also
trap Domain-0 for MSI-X tables.
Thank you,
Oleksandr
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-15 5:30 ` Oleksandr Andrushchenko
@ 2021-09-15 10:45 ` Rahul Singh
2021-09-15 11:55 ` Oleksandr Andrushchenko
2021-09-15 20:33 ` Stefano Stabellini
0 siblings, 2 replies; 69+ messages in thread
From: Rahul Singh @ 2021-09-15 10:45 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: xen-devel, julien, Stefano Stabellini, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, roger.pau, Bertrand Marquis,
Oleksandr Andrushchenko
Hi Oleksandr, Stefano,
> On 15 Sep 2021, at 6:30 am, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> wrote:
>
> Hi, Rahul!
>
> On 14.09.21 17:24, Oleksandr Andrushchenko wrote:
>>
>> }
>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>> + struct pci_host_bridge *bridge,
>>>> + const struct mmio_handler_ops *ops)
>>>> +{
>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>> +
>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>> + return 0;
>>>> +}
>>> Given that struct pci_config_window is generic (it is not specific to
>>> one bridge), I wonder if we even need the .register_mmio_handler
>>> callback here.
>>>
>>> In fact, pci_host_bridge->sysdata doesn't even need to be a void*, it
>>> could be a struct pci_config_window*, right?
>>
>> Rahul, this actually may change your series.
>>
>> Do you think we can do that?
>>
> This is the only requested change that has been left unanswered so far.
> Would it be possible for you to change the API accordingly, so that I can
> implement it as Stefano suggests?
We need pci_host_bridge->sysdata to be a void* in case we need to implement a new non-ECAM PCI controller in Xen.
Please have a look at the Linux code [1] to see how bridge->sysdata is used there: struct pci_config_window is used only
for ECAM-capable host controllers, and different PCI host controllers have different interfaces to access the controller.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-rcar-host.c#n309
I think we will need bridge->sysdata in the future to implement new PCI controllers.
Regards,
Rahul
>
> Thanks,
>
> Oleksandr
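The sysdata trade-off Rahul describes can be sketched with a minimal C mock-up. These are simplified stand-ins, not the real Xen or Linux structures; only the fields relevant to the void* question are shown, and the non-ECAM private struct is hypothetical:

```c
#include <assert.h>

/* Generic ECAM configuration window, shared by all ECAM bridges. */
struct pci_config_window {
    unsigned long phys_addr;
    unsigned long size;
};

/* Private state of a hypothetical non-ECAM controller (cf. the R-Car
 * host driver referenced in [1], which keeps its own struct in sysdata). */
struct nonecam_pcie_priv {
    void *regs;
    int root_bus_nr;
};

/* With a void *sysdata, either driver can stash its own data here... */
struct pci_host_bridge {
    void *sysdata;
};

/* ...but every ECAM helper must then cast, trusting that the bridge
 * really is an ECAM one -- the implicit contract Stefano questions. */
static struct pci_config_window *bridge_cfg(struct pci_host_bridge *bridge)
{
    return (struct pci_config_window *)bridge->sysdata;
}
```

Making sysdata a typed struct pci_config_window* would remove the cast for ECAM bridges, at the cost of needing another field for non-ECAM private data.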
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-15 10:45 ` Rahul Singh
@ 2021-09-15 11:55 ` Oleksandr Andrushchenko
2021-09-15 20:33 ` Stefano Stabellini
1 sibling, 0 replies; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-15 11:55 UTC (permalink / raw)
To: Rahul Singh
Cc: xen-devel, julien, Stefano Stabellini, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, roger.pau, Bertrand Marquis,
Oleksandr Andrushchenko
On 15.09.21 13:45, Rahul Singh wrote:
> Hi Oleksandr, Stefano,
>
>> On 15 Sep 2021, at 6:30 am, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> wrote:
>>
>> Hi, Rahul!
>>
>> On 14.09.21 17:24, Oleksandr Andrushchenko wrote:
>>> }
>>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>>> + struct pci_host_bridge *bridge,
>>>>> + const struct mmio_handler_ops *ops)
>>>>> +{
>>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>>> +
>>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>>> + return 0;
>>>>> +}
>>>> Given that struct pci_config_window is generic (it is not specific to
>>>> one bridge), I wonder if we even need the .register_mmio_handler
>>>> callback here.
>>>>
>>>> In fact, pci_host_bridge->sysdata doesn't even need to be a void*, it
>>>> could be a struct pci_config_window*, right?
>>> Rahul, this actually may change your series.
>>>
>>> Do you think we can do that?
>>>
>> This is the only requested change that has been left unanswered so far.
>> Would it be possible for you to change the API accordingly, so that I can
>> implement it as Stefano suggests?
> We need pci_host_bridge->sysdata to be a void* in case we need to implement a new non-ECAM PCI controller in Xen.
> Please have a look at the Linux code [1] to see how bridge->sysdata is used there: struct pci_config_window is used only
> for ECAM-capable host controllers, and different PCI host controllers have different interfaces to access the controller.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-rcar-host.c#n309
>
> I think we will need bridge->sysdata in the future to implement new PCI controllers.
>
> Regards,
> Rahul
Stefano, does it sound reasonable then to keep the above code as is?
>
>> Thanks,
>>
>> Oleksandr
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-15 5:35 ` Oleksandr Andrushchenko
@ 2021-09-15 16:42 ` Rahul Singh
2021-09-15 20:09 ` Stefano Stabellini
1 sibling, 0 replies; 69+ messages in thread
From: Rahul Singh @ 2021-09-15 16:42 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: Stefano Stabellini, Julien Grall, Oleksandr Andrushchenko,
xen-devel, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis
Hi Oleksandr,
> On 15 Sep 2021, at 6:35 am, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> wrote:
>
> Hi, Stefano, Rahul!
>
> On 15.09.21 03:36, Stefano Stabellini wrote:
>> On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
>>> With the patch above I have the following log in Domain-0:
>>>
>>> generic-armv8-xt-dom0 login: root
>>> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
>>> (XEN) ==== PCI devices ====
>>> (XEN) ==== segment 0000 ====
>>> (XEN) 0000:03:00.0 - d0 - node -1
>>> (XEN) 0000:02:02.0 - d0 - node -1
>>> (XEN) 0000:02:01.0 - d0 - node -1
>>> (XEN) 0000:02:00.0 - d0 - node -1
>>> (XEN) 0000:01:00.0 - d0 - node -1
>>> (XEN) 0000:00:00.0 - d0 - node -1
>>> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
>>>
>>> root@generic-armv8-xt-dom0:~# modprobe e1000e
>>> [ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
>>> [ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
>>> [ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
>>> (XEN) map [e0000, e001f] -> 0xe0000 for d0
>>> (XEN) map [e0020, e003f] -> 0xe0020 for d0
>>> [ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
>>> [ 46.189668] pci_msi_setup_msi_irqs
>>> [ 46.191016] nwl_compose_msi_msg msg at fe440000
>>> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
>>> [ 46.200455] Unhandled fault at 0xffff800010fa5000
>>>
>>> [snip]
>>>
>>> [ 46.233079] Call trace:
>>> [ 46.233559] __pci_write_msi_msg+0x70/0x180
>>> [ 46.234149] pci_msi_domain_write_msg+0x28/0x30
>>> [ 46.234869] msi_domain_activate+0x5c/0x88
>>>
>>> From the above you can see that BARs are mapped for Domain-0 now
>>> only when an assigned PCI device gets enabled in Domain-0.
>>>
>>> Another thing to note is that we crash on MSI-X access, as BARs are mapped
>>> to the domain while enabling memory decoding in the COMMAND register,
>>> but MSI-X is handled differently, i.e. we have MSI-X holes in the mappings.
>>> This is because before this change the whole PCI aperture was mapped into
>>> Domain-0, and now it is not. Thus, MSI-X holes are left unmapped now; before,
>>> there was no need for that, as they were always mapped into Domain-0 and
>>> emulated for guests.
>>>
>>> Please note that one cannot use xl pci-attach in this case to attach the PCI device
>>> in question to Domain-0 as (please see the log) that device is already attached.
>>> Nor can it be detached and re-attached. So, without mapping MSI-X holes for
>>> Domain-0, the device becomes unusable in Domain-0. At the same time the device
>>> can be successfully passed to a DomU.
>>>
>>> Julien, Stefano! Please let me know how we can proceed with this.
>> What was the plan for MSI-X in Dom0?
> It just worked because we mapped everything
>>
>> Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than
>> a physical-ITS and physical-GIC, I imagine that it wasn't correct for
>> Dom0 to write to the real MSI-X table directly?
>>
>> Shouldn't Dom0 get emulated MSI-X tables like any DomU?
>>
>> Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then
>> I would suggest to map them at the same time as the BARs. But I am
>> thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X
>> tables.
>
> Yes, it seems more than reasonable to enable emulation for Domain-0
> as well. Other than that, Stefano, do you think we are good to go with
> the changes I made in order to unmap everything for Domain-0?
>
> Rahul, it seems we will need a change to vPCI/MSI-X so we can also
> trap Domain-0 for MSI-X tables.
I agree that we need emulated MSI-X tables for Dom0 also. Let me check on this and come back to you.
Regards,
Rahul
>
> Thank you,
>
> Oleksandr
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-15 5:35 ` Oleksandr Andrushchenko
2021-09-15 16:42 ` Rahul Singh
@ 2021-09-15 20:09 ` Stefano Stabellini
2021-09-15 20:19 ` Stefano Stabellini
1 sibling, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-15 20:09 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: Stefano Stabellini, Rahul Singh, Julien Grall,
Oleksandr Andrushchenko, xen-devel, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis
On Wed, 15 Sep 2021, Oleksandr Andrushchenko wrote:
> On 15.09.21 03:36, Stefano Stabellini wrote:
> > On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
> >> With the patch above I have the following log in Domain-0:
> >>
> >> generic-armv8-xt-dom0 login: root
> >> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
> >> (XEN) ==== PCI devices ====
> >> (XEN) ==== segment 0000 ====
> >> (XEN) 0000:03:00.0 - d0 - node -1
> >> (XEN) 0000:02:02.0 - d0 - node -1
> >> (XEN) 0000:02:01.0 - d0 - node -1
> >> (XEN) 0000:02:00.0 - d0 - node -1
> >> (XEN) 0000:01:00.0 - d0 - node -1
> >> (XEN) 0000:00:00.0 - d0 - node -1
> >> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
> >>
> >> root@generic-armv8-xt-dom0:~# modprobe e1000e
> >> [ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
> >> [ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> >> [ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
> >> (XEN) map [e0000, e001f] -> 0xe0000 for d0
> >> (XEN) map [e0020, e003f] -> 0xe0020 for d0
> >> [ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> >> [ 46.189668] pci_msi_setup_msi_irqs
> >> [ 46.191016] nwl_compose_msi_msg msg at fe440000
> >> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
> >> [ 46.200455] Unhandled fault at 0xffff800010fa5000
> >>
> >> [snip]
> >>
> >> [ 46.233079] Call trace:
> >> [ 46.233559] __pci_write_msi_msg+0x70/0x180
> >> [ 46.234149] pci_msi_domain_write_msg+0x28/0x30
> >> [ 46.234869] msi_domain_activate+0x5c/0x88
> >>
> >> From the above you can see that BARs are mapped for Domain-0 now
> >> only when an assigned PCI device gets enabled in Domain-0.
> >>
> >> Another thing to note is that we crash on MSI-X access, as BARs are mapped
> >> to the domain while enabling memory decoding in the COMMAND register,
> >> but MSI-X is handled differently, i.e. we have MSI-X holes in the mappings.
> >> This is because before this change the whole PCI aperture was mapped into
> >> Domain-0, and now it is not. Thus, MSI-X holes are left unmapped now; before,
> >> there was no need for that, as they were always mapped into Domain-0 and
> >> emulated for guests.
> >>
> >> Please note that one cannot use xl pci-attach in this case to attach the PCI device
> >> in question to Domain-0 as (please see the log) that device is already attached.
> >> Nor can it be detached and re-attached. So, without mapping MSI-X holes for
> >> Domain-0, the device becomes unusable in Domain-0. At the same time the device
> >> can be successfully passed to a DomU.
> >>
> >> Julien, Stefano! Please let me know how we can proceed with this.
> > What was the plan for MSI-X in Dom0?
> It just worked because we mapped everything
> >
> > Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than
> > a physical-ITS and physical-GIC, I imagine that it wasn't correct for
> > Dom0 to write to the real MSI-X table directly?
> >
> > Shouldn't Dom0 get emulated MSI-X tables like any DomU?
> >
> > Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then
> > I would suggest to map them at the same time as the BARs. But I am
> > thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X
> > tables.
>
> > Yes, it seems more than reasonable to enable emulation for Domain-0
> > as well. Other than that, Stefano, do you think we are good to go with
> > the changes I made in order to unmap everything for Domain-0?
It might be better to resend the series with the patch in it, because it
is difficult to review the patch like this. Nonetheless I tried, but I
might have missed something.
Previously the whole PCIe bridge aperture was mapped to Dom0, and
it was done by map_range_to_domain, is that correct?
Now this patch, to avoid mapping the entire aperture to Dom0, is
skipping any operations for PCIe devices in map_range_to_domain by
setting need_mapping = false.
The idea is that instead, we'll only map things when needed and not the
whole aperture. However, looking at the changes to
pci_host_bridge_mappings (formerly known as
pci_host_bridge_need_p2m_mapping), it is still going through the full
list of address ranges of the PCIe bridge and map each range one by one
using map_range_to_domain. Also, pci_host_bridge_mappings is still
called unconditionally at boot for Dom0.
So I am missing the part where the aperture is actually *not* mapped to
Dom0. What's the difference between the loop in
pci_host_bridge_mappings:
for ( i = 0; i < dt_number_of_address(dev); i++ )
map_range_to_domain...
and the previous code in map_range_to_domain? I think I am missing
something but as mentioned it is difficult to review the patch like this
out of order.
Also, and this is minor, even if currently unused, it might be good to
keep a length parameter to pci_host_bridge_need_p2m_mapping.
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-15 20:09 ` Stefano Stabellini
@ 2021-09-15 20:19 ` Stefano Stabellini
2021-09-16 7:16 ` Oleksandr Andrushchenko
0 siblings, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-15 20:19 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Oleksandr Andrushchenko, Rahul Singh, Julien Grall,
Oleksandr Andrushchenko, xen-devel, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis
On Wed, 15 Sep 2021, Stefano Stabellini wrote:
> On Wed, 15 Sep 2021, Oleksandr Andrushchenko wrote:
> > On 15.09.21 03:36, Stefano Stabellini wrote:
> > > On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
> > >> With the patch above I have the following log in Domain-0:
> > >>
> > >> generic-armv8-xt-dom0 login: root
> > >> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
> > >> (XEN) ==== PCI devices ====
> > >> (XEN) ==== segment 0000 ====
> > >> (XEN) 0000:03:00.0 - d0 - node -1
> > >> (XEN) 0000:02:02.0 - d0 - node -1
> > >> (XEN) 0000:02:01.0 - d0 - node -1
> > >> (XEN) 0000:02:00.0 - d0 - node -1
> > >> (XEN) 0000:01:00.0 - d0 - node -1
> > >> (XEN) 0000:00:00.0 - d0 - node -1
> > >> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
> > >>
> > >> root@generic-armv8-xt-dom0:~# modprobe e1000e
> > >> [ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
> > >> [ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> > >> [ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
> > >> (XEN) map [e0000, e001f] -> 0xe0000 for d0
> > >> (XEN) map [e0020, e003f] -> 0xe0020 for d0
> > >> [ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> > >> [ 46.189668] pci_msi_setup_msi_irqs
> > >> [ 46.191016] nwl_compose_msi_msg msg at fe440000
> > >> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
> > >> [ 46.200455] Unhandled fault at 0xffff800010fa5000
> > >>
> > >> [snip]
> > >>
> > >> [ 46.233079] Call trace:
> > >> [ 46.233559] __pci_write_msi_msg+0x70/0x180
> > >> [ 46.234149] pci_msi_domain_write_msg+0x28/0x30
> > >> [ 46.234869] msi_domain_activate+0x5c/0x88
> > >>
> > >> From the above you can see that BARs are mapped for Domain-0 now
> > >> only when an assigned PCI device gets enabled in Domain-0.
> > >>
> > >> Another thing to note is that we crash on MSI-X access, as BARs are mapped
> > >> to the domain while enabling memory decoding in the COMMAND register,
> > >> but MSI-X is handled differently, i.e. we have MSI-X holes in the mappings.
> > >> This is because before this change the whole PCI aperture was mapped into
> > >> Domain-0, and now it is not. Thus, MSI-X holes are left unmapped now; before,
> > >> there was no need for that, as they were always mapped into Domain-0 and
> > >> emulated for guests.
> > >>
> > >> Please note that one cannot use xl pci-attach in this case to attach the PCI device
> > >> in question to Domain-0 as (please see the log) that device is already attached.
> > >> Nor can it be detached and re-attached. So, without mapping MSI-X holes for
> > >> Domain-0, the device becomes unusable in Domain-0. At the same time the device
> > >> can be successfully passed to a DomU.
> > >>
> > >> Julien, Stefano! Please let me know how we can proceed with this.
> > > What was the plan for MSI-X in Dom0?
> > It just worked because we mapped everything
> > >
> > > Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than
> > > a physical-ITS and physical-GIC, I imagine that it wasn't correct for
> > > Dom0 to write to the real MSI-X table directly?
> > >
> > > Shouldn't Dom0 get emulated MSI-X tables like any DomU?
> > >
> > > Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then
> > > I would suggest to map them at the same time as the BARs. But I am
> > > thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X
> > > tables.
> >
> > Yes, it seems more than reasonable to enable emulation for Domain-0
> > as well. Other than that, Stefano, do you think we are good to go with
> > the changes I made in order to unmap everything for Domain-0?
>
>
> It might be better to resend the series with the patch in it, because it
> is difficult to review the patch like this. Nonetheless I tried, but I
> might have missed something.
>
> Previously the whole PCIe bridge aperture was mapped to Dom0, and
> it was done by map_range_to_domain, is that correct?
>
> Now this patch, to avoid mapping the entire aperture to Dom0, is
> skipping any operations for PCIe devices in map_range_to_domain by
> setting need_mapping = false.
>
> The idea is that instead, we'll only map things when needed and not the
> whole aperture. However, looking at the changes to
> pci_host_bridge_mappings (formerly known as
> pci_host_bridge_need_p2m_mapping), it is still going through the full
> list of address ranges of the PCIe bridge and map each range one by one
> using map_range_to_domain. Also, pci_host_bridge_mappings is still
> called unconditionally at boot for Dom0.
>
> So I am missing the part where the aperture is actually *not* mapped to
> Dom0. What's the difference between the loop in
> pci_host_bridge_mappings:
>
> for ( i = 0; i < dt_number_of_address(dev); i++ )
> map_range_to_domain...
>
> and the previous code in map_range_to_domain? I think I am missing
> something but as mentioned it is difficult to review the patch like this
> out of order.
>
> Also, and this is minor, even if currently unused, it might be good to
> keep a length parameter to pci_host_bridge_need_p2m_mapping.
It looks like the filtering is done based on:
return cfg->phys_addr != addr
in pci_ecam_need_p2m_mapping, which is expected to filter out the address
ranges we don't want to map; the address it compares against comes from
xen/arch/arm/pci/pci-host-common.c:gen_pci_init:
/* Parse our PCI ecam register address*/
err = dt_device_get_address(dev, ecam_reg_idx, &addr, &size);
if ( err )
goto err_exit;
In pci_host_bridge_mappings, instead of parsing device tree again, can't
we just fetch the address and length we need to map straight from
bridge->sysdata->phys_addr/size?
At the point when pci_host_bridge_mappings is called in your new patch,
we have already initialized all the PCIe-related data structures. It
seems a bit useless to have to go via device tree again.
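Stefano's suggestion could look roughly like the following sketch. Names and simplified types are assumed from the thread (with a typed cfg pointer as he proposes elsewhere in the discussion); this is not the actual Xen code or patch:

```c
#include <assert.h>

/* Simplified stand-ins for the Xen structures under discussion. */
struct pci_config_window {
    unsigned long phys_addr;
    unsigned long size;
};

struct pci_host_bridge {
    struct pci_config_window *cfg;   /* parsed once at bridge probe time */
};

/* Stand-in for the real p2m mapping primitive; counts mapped ranges so
 * the behavior can be checked. */
static int map_range_to_domain(unsigned long addr, unsigned long size,
                               int *mapped_ranges)
{
    if ( size == 0 )
        return -1;
    (*mapped_ranges)++;
    return 0;
}

/* No dt_device_get_address() call here: the ECAM window for Dom0 is
 * fetched straight from the bridge instead of re-walking the device
 * tree ranges. */
static int pci_host_bridge_mappings(struct pci_host_bridge *bridge,
                                    int *mapped_ranges)
{
    return map_range_to_domain(bridge->cfg->phys_addr, bridge->cfg->size,
                               mapped_ranges);
}
```

The design point is simply that by the time this runs, the PCIe data structures are already initialized, so parsing device tree again duplicates work already done at probe time.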
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-15 10:45 ` Rahul Singh
2021-09-15 11:55 ` Oleksandr Andrushchenko
@ 2021-09-15 20:33 ` Stefano Stabellini
2021-09-17 6:13 ` Oleksandr Andrushchenko
1 sibling, 1 reply; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-15 20:33 UTC (permalink / raw)
To: Rahul Singh
Cc: Oleksandr Andrushchenko, xen-devel, julien, Stefano Stabellini,
Oleksandr Tyshchenko, Volodymyr Babchuk, Artem Mygaiev,
roger.pau, Bertrand Marquis, Oleksandr Andrushchenko
On Wed, 15 Sep 2021, Rahul Singh wrote:
> Hi Oleksandr, Stefano,
>
> > On 15 Sep 2021, at 6:30 am, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> wrote:
> >
> > Hi, Rahul!
> >
> > On 14.09.21 17:24, Oleksandr Andrushchenko wrote:
> >>
> >> }
> >>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
> >>>> + struct pci_host_bridge *bridge,
> >>>> + const struct mmio_handler_ops *ops)
> >>>> +{
> >>>> + struct pci_config_window *cfg = bridge->sysdata;
> >>>> +
> >>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
> >>>> + return 0;
> >>>> +}
> >>> Given that struct pci_config_window is generic (it is not specific to
> >>> one bridge), I wonder if we even need the .register_mmio_handler
> >>> callback here.
> >>>
> >>> In fact, pci_host_bridge->sysdata doesn't even need to be a void*, it
> >>> could be a struct pci_config_window*, right?
> >>
> >> Rahul, this actually may change your series.
> >>
> >> Do you think we can do that?
> >>
> > This is the only requested change that has been left unanswered so far.
> > Would it be possible for you to change the API accordingly, so that I can
> > implement it as Stefano suggests?
>
> We need pci_host_bridge->sysdata to be a void* in case we need to implement a new non-ECAM PCI controller in Xen.
> Please have a look at the Linux code [1] to see how bridge->sysdata is used there: struct pci_config_window is used only
> for ECAM-capable host controllers, and different PCI host controllers have different interfaces to access the controller.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-rcar-host.c#n309
>
> I think we will need bridge->sysdata in the future to implement new PCI controllers.
In my opinion the pci_config_window is too important a piece of information
to be left behind an opaque pointer, especially when the info under
pci_config_window is both critical and vendor-neutral.
My preference would be something like this:
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 9c28a4bdc4..c80d846da3 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -55,7 +55,6 @@ struct pci_config_window {
uint8_t busn_start;
uint8_t busn_end;
void __iomem *win;
- const struct pci_ecam_ops *ops;
};
/*
@@ -68,7 +67,8 @@ struct pci_host_bridge {
uint16_t segment; /* Segment number */
u8 bus_start; /* Bus start of this bridge. */
u8 bus_end; /* Bus end of this bridge. */
- void *sysdata; /* Pointer to the config space window*/
+ struct pci_config_window* cfg; /* Pointer to the bridge config window */
+ void *sysdata; /* Pointer to bridge private data */
const struct pci_ops *ops;
};
As a reference the attached patch builds. However, I had to remove const
where struct pci_ecam_ops *ops is used.
[-- Attachment #2: Type: text/x-diff; name=cfg.patch, Size: 4990 bytes --]
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index ecfa6822e4..f9d57ca0fa 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -7,6 +7,7 @@ config ARM_64
depends on !ARM_32
select 64BIT
select HAS_FAST_MULTIPLY
+ select HAS_PCI
config ARM
def_bool y
diff --git a/xen/arch/arm/pci/ecam.c b/xen/arch/arm/pci/ecam.c
index d32efb7fcb..f6d0d00c1b 100644
--- a/xen/arch/arm/pci/ecam.c
+++ b/xen/arch/arm/pci/ecam.c
@@ -26,8 +26,9 @@
void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
uint32_t sbdf, uint32_t where)
{
- const struct pci_config_window *cfg = bridge->sysdata;
- unsigned int devfn_shift = cfg->ops->bus_shift - 8;
+ const struct pci_ecam_ops *ops = bridge->sysdata;
+ const struct pci_config_window *cfg = bridge->cfg;
+ unsigned int devfn_shift = ops->bus_shift - 8;
void __iomem *base;
pci_sbdf_t sbdf_t = (pci_sbdf_t) sbdf ;
@@ -37,7 +38,7 @@ void __iomem *pci_ecam_map_bus(struct pci_host_bridge *bridge,
return NULL;
busn -= cfg->busn_start;
- base = cfg->win + (busn << cfg->ops->bus_shift);
+ base = cfg->win + (busn << ops->bus_shift);
return base + (PCI_DEVFN(sbdf_t.dev, sbdf_t.fn) << devfn_shift) + where;
}
diff --git a/xen/arch/arm/pci/pci-host-common.c b/xen/arch/arm/pci/pci-host-common.c
index c04be63645..41a5457e80 100644
--- a/xen/arch/arm/pci/pci-host-common.c
+++ b/xen/arch/arm/pci/pci-host-common.c
@@ -97,7 +97,6 @@ static struct pci_config_window *gen_pci_init(struct dt_device_node *dev,
cfg->phys_addr = addr;
cfg->size = size;
- cfg->ops = ops;
/*
* On 64-bit systems, we do a single ioremap for the whole config space
@@ -225,7 +224,7 @@ static int pci_bus_find_domain_nr(struct dt_device_node *dev)
}
int pci_host_common_probe(struct dt_device_node *dev,
- const struct pci_ecam_ops *ops,
+ struct pci_ecam_ops *ops,
int ecam_reg_idx)
{
struct pci_host_bridge *bridge;
@@ -245,7 +244,8 @@ int pci_host_common_probe(struct dt_device_node *dev,
}
bridge->dt_node = dev;
- bridge->sysdata = cfg;
+ bridge->cfg = cfg;
+ bridge->sysdata = ops;
bridge->ops = &ops->pci_ops;
bridge->bus_start = cfg->busn_start;
bridge->bus_end = cfg->busn_end;
diff --git a/xen/arch/arm/pci/pci-host-generic.c b/xen/arch/arm/pci/pci-host-generic.c
index 2d652e8910..66176f9658 100644
--- a/xen/arch/arm/pci/pci-host-generic.c
+++ b/xen/arch/arm/pci/pci-host-generic.c
@@ -32,7 +32,7 @@ static const struct dt_device_match gen_pci_dt_match[] = {
static int gen_pci_dt_init(struct dt_device_node *dev, const void *data)
{
const struct dt_device_match *of_id;
- const struct pci_ecam_ops *ops;
+ struct pci_ecam_ops *ops;
of_id = dt_match_node(gen_pci_dt_match, dev->dev.of_node);
ops = (struct pci_ecam_ops *) of_id->data;
diff --git a/xen/arch/arm/pci/pci-host-zynqmp.c b/xen/arch/arm/pci/pci-host-zynqmp.c
index fe103e3855..b4170c3bdd 100644
--- a/xen/arch/arm/pci/pci-host-zynqmp.c
+++ b/xen/arch/arm/pci/pci-host-zynqmp.c
@@ -32,7 +32,7 @@ static const struct dt_device_match gen_pci_dt_match[] = {
static int gen_pci_dt_init(struct dt_device_node *dev, const void *data)
{
const struct dt_device_match *of_id;
- const struct pci_ecam_ops *ops;
+ struct pci_ecam_ops *ops;
of_id = dt_match_node(gen_pci_dt_match, dev->dev.of_node);
ops = (struct pci_ecam_ops *) of_id->data;
diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
index 9c28a4bdc4..c80d846da3 100644
--- a/xen/include/asm-arm/pci.h
+++ b/xen/include/asm-arm/pci.h
@@ -55,7 +55,6 @@ struct pci_config_window {
uint8_t busn_start;
uint8_t busn_end;
void __iomem *win;
- const struct pci_ecam_ops *ops;
};
/*
@@ -68,7 +67,8 @@ struct pci_host_bridge {
uint16_t segment; /* Segment number */
u8 bus_start; /* Bus start of this bridge. */
u8 bus_end; /* Bus end of this bridge. */
- void *sysdata; /* Pointer to the config space window*/
+ struct pci_config_window* cfg; /* Pointer to the bridge config window */
+ void *sysdata; /* Pointer to bridge private data */
const struct pci_ops *ops;
};
@@ -100,7 +100,7 @@ struct pci_ecam_ops {
extern const struct pci_ecam_ops pci_generic_ecam_ops;
int pci_host_common_probe(struct dt_device_node *dev,
- const struct pci_ecam_ops *ops,
+ struct pci_ecam_ops *ops,
int ecam_reg_idx);
int pci_generic_config_read(struct pci_host_bridge *bridge, uint32_t sbdf,
uint32_t reg, uint32_t len, uint32_t *value);
^ permalink raw reply related [flat|nested] 69+ messages in thread
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-15 20:19 ` Stefano Stabellini
@ 2021-09-16 7:16 ` Oleksandr Andrushchenko
2021-09-16 20:22 ` Stefano Stabellini
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-16 7:16 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Rahul Singh, Julien Grall, Oleksandr Andrushchenko, xen-devel,
Oleksandr Tyshchenko, Volodymyr Babchuk, Artem Mygaiev,
Roger Pau Monné,
Bertrand Marquis
Hi, Stefano!
On 15.09.21 23:19, Stefano Stabellini wrote:
> On Wed, 15 Sep 2021, Stefano Stabellini wrote:
>> On Wed, 15 Sep 2021, Oleksandr Andrushchenko wrote:
>>> On 15.09.21 03:36, Stefano Stabellini wrote:
>>>> On Tue, 14 Sep 2021, Oleksandr Andrushchenko wrote:
>>>>> With the patch above I have the following log in Domain-0:
>>>>>
>>>>> generic-armv8-xt-dom0 login: root
>>>>> root@generic-armv8-xt-dom0:~# (XEN) *** Serial input to Xen (type 'CTRL-a' three times to switch input)
>>>>> (XEN) ==== PCI devices ====
>>>>> (XEN) ==== segment 0000 ====
>>>>> (XEN) 0000:03:00.0 - d0 - node -1
>>>>> (XEN) 0000:02:02.0 - d0 - node -1
>>>>> (XEN) 0000:02:01.0 - d0 - node -1
>>>>> (XEN) 0000:02:00.0 - d0 - node -1
>>>>> (XEN) 0000:01:00.0 - d0 - node -1
>>>>> (XEN) 0000:00:00.0 - d0 - node -1
>>>>> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
>>>>>
>>>>> root@generic-armv8-xt-dom0:~# modprobe e1000e
>>>>> [ 46.104729] e1000e: Intel(R) PRO/1000 Network Driver
>>>>> [ 46.105479] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
>>>>> [ 46.107297] e1000e 0000:03:00.0: enabling device (0000 -> 0002)
>>>>> (XEN) map [e0000, e001f] -> 0xe0000 for d0
>>>>> (XEN) map [e0020, e003f] -> 0xe0020 for d0
>>>>> [ 46.178513] e1000e 0000:03:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
>>>>> [ 46.189668] pci_msi_setup_msi_irqs
>>>>> [ 46.191016] nwl_compose_msi_msg msg at fe440000
>>>>> (XEN) traps.c:2014:d0v0 HSR=0x00000093810047 pc=0xffff8000104b4b00 gva=0xffff800010fa5000 gpa=0x000000e0040000
>>>>> [ 46.200455] Unhandled fault at 0xffff800010fa5000
>>>>>
>>>>> [snip]
>>>>>
>>>>> [ 46.233079] Call trace:
>>>>> [ 46.233559] __pci_write_msi_msg+0x70/0x180
>>>>> [ 46.234149] pci_msi_domain_write_msg+0x28/0x30
>>>>> [ 46.234869] msi_domain_activate+0x5c/0x88
>>>>>
>>>>> From the above you can see that BARs are mapped for Domain-0 now
>>>>>
>>>>> only when an assigned PCI device gets enabled in Domain-0.
>>>>>
>>>>> Another thing to note is that we crash on MSI-X access as BARs are mapped
>>>>>
>>>>> to the domain while enabling memory decoding in the COMMAND register,
>>>>>
>>>>> but MSI-X are handled differently, e.g. we have MSI-X holes in the mappings.
>>>>>
>>>>> This is because before this change the whole PCI aperture was mapped into
>>>>> Domain-0, and now it is not. Thus, MSI-X holes are left unmapped now; there
>>>>> was no need to handle them before, i.e. they were always mapped into Domain-0 and
>>>>> emulated for guests.
>>>>>
>>>>> Please note that one cannot use xl pci-attach in this case to attach the PCI device
>>>>>
>>>>> in question to Domain-0 as (please see the log) that device is already attached.
>>>>>
>>>>> Nor can it be detached and re-attached. So, without mapping the MSI-X holes for
>>>>> Domain-0, the device becomes unusable in Domain-0. At the same time the device
>>>>> can be successfully passed to a DomU.
>>>>>
>>>>>
>>>>> Julien, Stefano! Please let me know how can we proceed with this.
>>>> What was the plan for MSI-X in Dom0?
>>> It just worked because we mapped everything
>>>> Given that Dom0 interacts with a virtual-ITS and virtual-GIC rather than
>>>> a physical-ITS and physical-GIC, I imagine that it wasn't correct for
>>>> Dom0 to write to the real MSI-X table directly?
>>>>
>>>> Shouldn't Dom0 get emulated MSI-X tables like any DomU?
>>>>
>>>> Otherwise, if Dom0 is expected to have the real MSI-X tables mapped, then
>>>> I would suggest to map them at the same time as the BARs. But I am
>>>> thinking that Dom0 should get emulated MSI-X tables, not physical MSI-X
>>>> tables.
>>> Yes, it seems more than reasonable to enable emulation for Domain-0
>>>
>>> as well. Other than that, Stefano, do you think we are good to go with
>>>
>>> the changes I did in order to unmap everything for Domain-0?
>>
>> It might be better to resend the series with the patch in it, because it
>> is difficult to review the patch like this.
This is true. Taking the Xen release plan into account, I am just trying to
minimize the turnaround here. Sorry about this.
>> Nonetheless I tried, but I
>> might have missed something.
Thank you for your time!!
>>
>> Previously the whole PCIe bridge aperture was mapped to Dom0, and
>> it was done by map_range_to_domain, is that correct?
Yes, but not only the aperture: please see below.
>>
>> Now this patch, to avoid mapping the entire aperture to Dom0, is
>> skipping any operations for PCIe devices in map_range_to_domain by
>> setting need_mapping = false.
>>
>> The idea is that instead, we'll only map things when needed and not the
>> whole aperture. However, looking at the changes to
>> pci_host_bridge_mappings (formerly known as
>> pci_host_bridge_need_p2m_mapping), it is still going through the full
>> list of address ranges of the PCIe bridge and map each range one by one
>> using map_range_to_domain. Also, pci_host_bridge_mappings is still
>> called unconditionally at boot for Dom0.
>>
>> So I am missing the part where the aperture is actually *not* mapped to
>> Dom0.
With map_range_to_domain we also mapped all the entries
of the "reg" and "ranges" properties. Let's have a look at [1]:
- ranges : As described in IEEE Std 1275-1994, but must provide
at least a definition of non-prefetchable memory. One
or both of prefetchable Memory and IO Space may also
be provided.
- reg : The Configuration Space base address and size, as accessed
from the parent bus. The base address corresponds to
the first bus in the "bus-range" property. If no
"bus-range" is specified, this will be bus 0 (the default).
The most interesting case comes when "reg" also contains addresses other than
the configuration space, for example [2]:
- reg: Should contain rc_dbi, config registers location and length.
- reg-names: Must include the following entries:
"rc_dbi": controller configuration registers;
"config": PCIe configuration space registers.
So, we don't need to map "ranges" or the *config* entry of "reg", but all
the rest of "reg" still needs to be mapped to Domain-0 so that the PCIe
bridge can remain functional in Domain-0.
>> What's the difference between the loop in
>> pci_host_bridge_mappings:
>>
>> for ( i = 0; i < dt_number_of_address(dev); i++ )
>> map_range_to_domain...
>>
>> and the previous code in map_range_to_domain? I think I am missing
>> something but as mentioned it is difficult to review the patch like this
>> out of order.
>>
>> Also, and this is minor, even if currently unused, it might be good to
>> keep a length parameter to pci_host_bridge_need_p2m_mapping.
> It looks like the filtering is done based on:
>
> return cfg->phys_addr != addr
As I explained above it is *now* the only range that we *do not want* to
be mapped to Domain-0. Other "reg" entries still need to be mapped.
>
> in pci_ecam_need_p2m_mapping that is expected to filter out the address
> ranges we don't want to map because it comes from
> xen/arch/arm/pci/pci-host-common.c:gen_pci_init:
>
> /* Parse our PCI ecam register address*/
> err = dt_device_get_address(dev, ecam_reg_idx, &addr, &size);
> if ( err )
> goto err_exit;
>
> In pci_host_bridge_mappings, instead of parsing device tree again, can't
> we just fetch the address and length we need to map straight from
> bridge->sysdata->phys_addr/size ?
We can't, as that address describes the configuration space, which we
*do not* want mapped; what we want are the entries in the "reg" property
other than the configuration space.
>
> At the point when pci_host_bridge_mappings is called in your new patch,
> we have already initialized all the PCIe-related data structures. It
> seems a bit useless to have to go via device tree again.
Bottom line: we do need to go over the "reg" property and map the regions
on which the PCIe bridge depends, skipping only the "config" part of it.
Thank you,
Oleksandr
[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
[2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
* Re: [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m
2021-09-16 7:16 ` Oleksandr Andrushchenko
@ 2021-09-16 20:22 ` Stefano Stabellini
0 siblings, 0 replies; 69+ messages in thread
From: Stefano Stabellini @ 2021-09-16 20:22 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: Stefano Stabellini, Rahul Singh, Julien Grall,
Oleksandr Andrushchenko, xen-devel, Oleksandr Tyshchenko,
Volodymyr Babchuk, Artem Mygaiev, Roger Pau Monné,
Bertrand Marquis
[-- Attachment #1: Type: text/plain, Size: 8742 bytes --]
On Thu, 16 Sep 2021, Oleksandr Andrushchenko wrote:
> [snip]
>
> Bottom line: we do need to go over the "reg" property and map the regions
> on which the PCIe bridge depends, skipping only the "config" part of it.
OK, thanks for the explanation. Please add it to the commit message
and/or as an in-code comment when you resend the patch.
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-15 20:33 ` Stefano Stabellini
@ 2021-09-17 6:13 ` Oleksandr Andrushchenko
2021-09-17 7:29 ` Rahul Singh
0 siblings, 1 reply; 69+ messages in thread
From: Oleksandr Andrushchenko @ 2021-09-17 6:13 UTC (permalink / raw)
To: Rahul Singh
Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Stefano Stabellini,
Oleksandr Andrushchenko
Hi, Rahul!
On 15.09.21 23:33, Stefano Stabellini wrote:
> On Wed, 15 Sep 2021, Rahul Singh wrote:
>> Hi Oleksandr, Stefano,
>>
>>> On 15 Sep 2021, at 6:30 am, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> wrote:
>>>
>>> Hi, Rahul!
>>>
>>> On 14.09.21 17:24, Oleksandr Andrushchenko wrote:
>>>> }
>>>>>> +static int pci_ecam_register_mmio_handler(struct domain *d,
>>>>>> + struct pci_host_bridge *bridge,
>>>>>> + const struct mmio_handler_ops *ops)
>>>>>> +{
>>>>>> + struct pci_config_window *cfg = bridge->sysdata;
>>>>>> +
>>>>>> + register_mmio_handler(d, ops, cfg->phys_addr, cfg->size, NULL);
>>>>>> + return 0;
>>>>>> +}
>>>>> Given that struct pci_config_window is generic (it is not specific to
>>>>> one bridge), I wonder if we even need the .register_mmio_handler
>>>>> callback here.
>>>>>
>>>>> In fact,pci_host_bridge->sysdata doesn't even need to be a void*, it
>>>>> could be a struct pci_config_window*, right?
>>>> Rahul, this actually may change your series.
>>>>
>>>> Do you think we can do that?
>>>>
>>> This is the only requested change that is left unanswered by now.
>>>
>>> Will it be possible for you to change the API accordingly, so I can
>>> implement it as Stefano suggests?
>> We need pci_host_bridge->sysdata as void* in case we need to implement the new non-ecam PCI controller in XEN.
>> Please have a look once in Linux code[1] how bridge->sysdata will be used. struct pci_config_window is used only for
>> ecam supported host controller. Different PCI host controller will have different PCI interface to access the PCI controller.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pcie-rcar-host.c#n309
>>
>> I think we need bridge->sysdata in future to implement the new PCI controller.
> In my opinion the pci_config_window is too important of an information
> to be left inside an opaque pointer, especially when the info under
> pci_config_window is both critical and vendor-neutral.
>
> My preference would be something like this:
>
>
> diff --git a/xen/include/asm-arm/pci.h b/xen/include/asm-arm/pci.h
> index 9c28a4bdc4..c80d846da3 100644
> --- a/xen/include/asm-arm/pci.h
> +++ b/xen/include/asm-arm/pci.h
> @@ -55,7 +55,6 @@ struct pci_config_window {
> uint8_t busn_start;
> uint8_t busn_end;
> void __iomem *win;
> - const struct pci_ecam_ops *ops;
> };
>
> /*
> @@ -68,7 +67,8 @@ struct pci_host_bridge {
> uint16_t segment; /* Segment number */
> u8 bus_start; /* Bus start of this bridge. */
> u8 bus_end; /* Bus end of this bridge. */
> - void *sysdata; /* Pointer to the config space window*/
> + struct pci_config_window* cfg; /* Pointer to the bridge config window */
> + void *sysdata; /* Pointer to bridge private data */
> const struct pci_ops *ops;
> };
>
>
> As a reference the attached patch builds. However, I had to remove const
> where struct pci_ecam_ops *ops is used.
I'd like to know which route we take here, as this is now the last
thing stopping me from sending v2 of this series.
Thank you,
Oleksandr
* Re: [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain
2021-09-17 6:13 ` Oleksandr Andrushchenko
@ 2021-09-17 7:29 ` Rahul Singh
0 siblings, 0 replies; 69+ messages in thread
From: Rahul Singh @ 2021-09-17 7:29 UTC (permalink / raw)
To: Oleksandr Andrushchenko
Cc: xen-devel, julien, Oleksandr Tyshchenko, Volodymyr Babchuk,
Artem Mygaiev, roger.pau, Bertrand Marquis, Stefano Stabellini,
Oleksandr Andrushchenko
Hi Oleksandr,
> On 17 Sep 2021, at 7:13 am, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com> wrote:
>
> Hi, Rahul!
>
> [snip]
>
> I'd like to know which route we take here, as this is now the last
> thing stopping me from sending v2 of this series.
I will modify the code as per Stefano's request and will send the next version.
Regards,
Rahul
>
> Thank you,
>
> Oleksandr
end of thread, other threads:[~2021-09-17 7:30 UTC | newest]
Thread overview: 69+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-03 8:33 [PATCH 00/11] PCI devices passthrough on Arm, part 2 Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 01/11] xen/arm: Add new device type for PCI Oleksandr Andrushchenko
2021-09-09 17:19 ` Julien Grall
2021-09-10 7:40 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 02/11] xen/arm: Add dev_to_pci helper Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 03/11] xen/arm: Introduce pci_find_host_bridge_node helper Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 04/11] xen/device-tree: Make dt_find_node_by_phandle global Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 05/11] xen/arm: Mark device as PCI while creating one Oleksandr Andrushchenko
2021-09-03 12:41 ` Jan Beulich
2021-09-03 13:26 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 06/11] xen/domain: Call pci_release_devices() when releasing domain resources Oleksandr Andrushchenko
2021-09-10 18:45 ` Stefano Stabellini
2021-09-03 8:33 ` [PATCH 07/11] libxl: Allow removing PCI devices for all types of domains Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 08/11] libxl: Only map legacy PCI IRQs if they are supported Oleksandr Andrushchenko
2021-09-03 10:26 ` Juergen Gross
2021-09-03 10:30 ` Oleksandr Andrushchenko
2021-09-10 19:06 ` Stefano Stabellini
2021-09-13 8:22 ` Oleksandr Andrushchenko
2021-09-03 8:33 ` [PATCH 09/11] xen/arm: Setup MMIO range trap handlers for hardware domain Oleksandr Andrushchenko
2021-09-09 17:43 ` Julien Grall
2021-09-10 11:43 ` Oleksandr Andrushchenko
2021-09-10 13:04 ` Julien Grall
2021-09-10 13:15 ` Oleksandr Andrushchenko
2021-09-10 13:20 ` Julien Grall
2021-09-10 13:27 ` Oleksandr Andrushchenko
2021-09-10 13:33 ` Julien Grall
2021-09-10 13:40 ` Oleksandr Andrushchenko
2021-09-14 13:47 ` Oleksandr Andrushchenko
2021-09-15 0:25 ` Stefano Stabellini
2021-09-15 4:50 ` Oleksandr Andrushchenko
2021-09-10 20:12 ` Stefano Stabellini
2021-09-14 14:24 ` Oleksandr Andrushchenko
2021-09-15 5:30 ` Oleksandr Andrushchenko
2021-09-15 10:45 ` Rahul Singh
2021-09-15 11:55 ` Oleksandr Andrushchenko
2021-09-15 20:33 ` Stefano Stabellini
2021-09-17 6:13 ` Oleksandr Andrushchenko
2021-09-17 7:29 ` Rahul Singh
2021-09-03 8:33 ` [PATCH 10/11] xen/arm: Do not map PCI ECAM space to Domain-0's p2m Oleksandr Andrushchenko
2021-09-09 17:58 ` Julien Grall
2021-09-10 12:37 ` Oleksandr Andrushchenko
2021-09-10 13:18 ` Julien Grall
2021-09-10 14:01 ` Oleksandr Andrushchenko
2021-09-10 14:18 ` Julien Grall
2021-09-10 14:38 ` Oleksandr Andrushchenko
2021-09-10 14:52 ` Julien Grall
2021-09-10 15:01 ` Oleksandr Andrushchenko
2021-09-10 15:05 ` Julien Grall
2021-09-10 15:04 ` Julien Grall
2021-09-10 20:30 ` Stefano Stabellini
2021-09-10 21:41 ` Julien Grall
2021-09-13 6:27 ` Oleksandr Andrushchenko
2021-09-14 10:03 ` Oleksandr Andrushchenko
2021-09-15 0:36 ` Stefano Stabellini
2021-09-15 5:35 ` Oleksandr Andrushchenko
2021-09-15 16:42 ` Rahul Singh
2021-09-15 20:09 ` Stefano Stabellini
2021-09-15 20:19 ` Stefano Stabellini
2021-09-16 7:16 ` Oleksandr Andrushchenko
2021-09-16 20:22 ` Stefano Stabellini
2021-09-03 8:33 ` [PATCH 11/11] xen/arm: Process pending vPCI map/unmap operations Oleksandr Andrushchenko
2021-09-03 9:04 ` Julien Grall
2021-09-06 7:02 ` Oleksandr Andrushchenko
2021-09-06 8:48 ` Julien Grall
2021-09-06 9:14 ` Oleksandr Andrushchenko
2021-09-06 9:53 ` Julien Grall
2021-09-06 10:06 ` Oleksandr Andrushchenko
2021-09-06 10:38 ` Julien Grall
2021-09-07 6:34 ` Oleksandr Andrushchenko