linux-kernel.vger.kernel.org archive mirror
* Resizable PCI BAR support V9
@ 2017-10-18 13:58 Christian König
  2017-10-18 13:58 ` [PATCH v9 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Christian König @ 2017-10-18 13:58 UTC (permalink / raw)
  To: helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

Hi everyone,

This is the ninth and hopefully last incarnation of this set of patches. It
enables device drivers to resize, and most likely also relocate, the PCI BAR
of the devices they manage, so that the CPU can access all of the device's
local memory at once.

This is very useful for GFX device drivers where the default PCI BAR is only
about 256MB in size for compatibility reasons, but the device can easily have
multiple gigabytes of local memory.

Some changes since the last version:
1. Rebased on drm-next, so should be ready to be merged for 4.15.
2. The fixup of the 64bit root window on AMD Family 15h CPUs/APUs is only
   enabled when we compile a kernel supporting that hw.
3. Some minor error handling improvements for the amdgpu side. We now
   gracefully abort driver loading in case of a critical error instead of
   calling BUG().

Bjorn, any more comments, or can we finally get this into 4.15? I will remove
the version tags from the patches when I send you a pull request if you want this.

I only work on this as a background task, so sorry for the ~3 month delay
between each version of the patchset.

Regards,
Christian.


* [PATCH v9 1/5] PCI: add a define for the PCI resource type mask v2
  2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
@ 2017-10-18 13:58 ` Christian König
  2017-10-18 13:58 ` [PATCH v9 2/5] PCI: add resizeable BAR infrastructure v5 Christian König
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2017-10-18 13:58 UTC (permalink / raw)
  To: helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

From: Christian König <christian.koenig@amd.com>

We use this mask multiple times in the bus setup.

v2: fix some style nit picks

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 drivers/pci/pci.h       |  3 +++
 drivers/pci/setup-bus.c | 12 +++---------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 22e061738c6f..5c475edc78c2 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -4,6 +4,9 @@
 #define PCI_FIND_CAP_TTL	48
 
 #define PCI_VSEC_ID_INTEL_TBT	0x1234	/* Thunderbolt */
+#define PCI_RES_TYPE_MASK \
+	(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
+	 IORESOURCE_MEM_64)
 
 extern const unsigned char pcie_link_speed[];
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 958da7db9033..37450d9e1132 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1523,8 +1523,6 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 {
 	struct pci_dev *dev = bus->self;
 	struct resource *r;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 	unsigned old_flags = 0;
 	struct resource *b_res;
 	int idx = 1;
@@ -1567,7 +1565,7 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 	 */
 	release_child_resources(r);
 	if (!release_resource(r)) {
-		type = old_flags = r->flags & type_mask;
+		type = old_flags = r->flags & PCI_RES_TYPE_MASK;
 		dev_printk(KERN_DEBUG, &dev->dev, "resource %d %pR released\n",
 					PCI_BRIDGE_RESOURCES + idx, r);
 		/* keep the old size */
@@ -1758,8 +1756,6 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	enum release_type rel_type = leaf_only;
 	LIST_HEAD(fail_head);
 	struct pci_dev_resource *fail_res;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 	int pci_try_num = 1;
 	enum enable_type enable_local;
 
@@ -1818,7 +1814,7 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	 */
 	list_for_each_entry(fail_res, &fail_head, list)
 		pci_bus_release_bridge_resources(fail_res->dev->bus,
-						 fail_res->flags & type_mask,
+						 fail_res->flags & PCI_RES_TYPE_MASK,
 						 rel_type);
 
 	/* restore size and flags */
@@ -1862,8 +1858,6 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 	LIST_HEAD(fail_head);
 	struct pci_dev_resource *fail_res;
 	int retval;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 
 again:
 	__pci_bus_size_bridges(parent, &add_list);
@@ -1889,7 +1883,7 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 	 */
 	list_for_each_entry(fail_res, &fail_head, list)
 		pci_bus_release_bridge_resources(fail_res->dev->bus,
-						 fail_res->flags & type_mask,
+						 fail_res->flags & PCI_RES_TYPE_MASK,
 						 whole_subtree);
 
 	/* restore size and flags */
-- 
2.11.0


* [PATCH v9 2/5] PCI: add resizeable BAR infrastructure v5
  2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
  2017-10-18 13:58 ` [PATCH v9 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
@ 2017-10-18 13:58 ` Christian König
  2017-10-18 13:58 ` [PATCH v9 3/5] PCI: add functionality for resizing resources v7 Christian König
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2017-10-18 13:58 UTC (permalink / raw)
  To: helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

From: Christian König <christian.koenig@amd.com>

Just the defines and helper functions to read the possible sizes of a BAR and
update its size.

See https://pcisig.com/sites/default/files/specification_documents/ECN_Resizable-BAR_24Apr2008.pdf
and PCIe r3.1, sec 7.22.

This is useful for hardware with large local storage (mostly GFX) which
initially only exposes a 256MB BAR to stay compatible with 32bit systems.
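
As a rough illustration only (not part of the patch; "pdev" and "bar" are
placeholders here, and these helpers are internal to drivers/pci), decoding
the capability bitmask with the new helpers could look like this:

    u32 sizes = pci_rbar_get_possible_sizes(pdev, bar);
    unsigned int n;

    /* Bit n set means the BAR supports 2^(n + 20) bytes,
     * e.g. bit 0 = 1MB, bit 8 = 256MB, bit 19 = 512GB.
     */
    for (n = 0; n < 20; ++n)
            if (sizes & BIT(n))
                    dev_info(&pdev->dev, "BAR %d supports %llu bytes\n",
                             bar, pci_rbar_size_to_bytes(n));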

v2: provide read helper as well
v3: improve function names, use unsigned values, add better comments.
v4: move definition, improve commit message, s/bar/BAR/
v5: split out helper to find ctrl reg pos, style fixes, comment fixes,
    add pci_rbar_size_to_bytes as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 drivers/pci/pci.c             | 104 ++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h             |   8 ++++
 include/uapi/linux/pci_regs.h |  11 ++++-
 3 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b4b7eab29400..3aca7393c43c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2957,6 +2957,110 @@ bool pci_acs_path_enabled(struct pci_dev *start,
 }
 
 /**
+ * pci_rbar_find_pos - find position of resize ctrl reg for BAR
+ * @pdev: PCI device
+ * @bar: BAR to find
+ *
+ * Helper to find the position of the ctrl register for a BAR.
+ * Returns -ENOTSUPP if resizable BARs are not supported at all.
+ * Returns -ENOENT if no ctrl register for the BAR could be found.
+ */
+static int pci_rbar_find_pos(struct pci_dev *pdev, int bar)
+{
+	unsigned int pos, nbars;
+	unsigned int i;
+	u32 ctrl;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_REBAR);
+	if (!pos)
+		return -ENOTSUPP;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+	nbars = (ctrl & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
+
+	for (i = 0; i < nbars; ++i, pos += 8) {
+		int bar_idx;
+
+		pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+		bar_idx = (ctrl & PCI_REBAR_CTRL_BAR_IDX_MASK) >>
+				PCI_REBAR_CTRL_BAR_IDX_SHIFT;
+		if (bar_idx == bar)
+			return pos;
+	}
+
+	return -ENOENT;
+}
+
+/**
+ * pci_rbar_get_possible_sizes - get possible sizes for BAR
+ * @pdev: PCI device
+ * @bar: BAR to query
+ *
+ * Get the possible sizes of a resizable BAR as a bitmask defined in the spec
+ * (bit 0=1MB, bit 19=512GB). Returns 0 if the BAR isn't resizable.
+ */
+u32 pci_rbar_get_possible_sizes(struct pci_dev *pdev, int bar)
+{
+	u32 cap;
+	int pos;
+
+	pos = pci_rbar_find_pos(pdev, bar);
+	if (pos < 0)
+		return 0;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CAP, &cap);
+	return (cap & PCI_REBAR_CTRL_SIZES_MASK) >>
+		PCI_REBAR_CTRL_SIZES_SHIFT;
+}
+
+/**
+ * pci_rbar_get_current_size - get the current size of a BAR
+ * @pdev: PCI device
+ * @bar: BAR to query
+ *
+ * Read the size of a BAR from the resizeable BAR config.
+ * Returns size if found or negative error code.
+ */
+int pci_rbar_get_current_size(struct pci_dev *pdev, int bar)
+{
+	u32 ctrl;
+	int pos;
+
+	pos = pci_rbar_find_pos(pdev, bar);
+	if (pos < 0)
+		return pos;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+	return (ctrl & PCI_REBAR_CTRL_BAR_SIZE_MASK) >>
+		PCI_REBAR_CTRL_BAR_SIZE_SHIFT;
+}
+
+/**
+ * pci_rbar_set_size - set a new size for a BAR
+ * @pdev: PCI device
+ * @bar: BAR to set size to
+ * @size: new size as defined in the spec (0=1MB, 19=512GB)
+ *
+ * Set the new size of a BAR as defined in the spec.
+ * Returns zero if resizing was successful, error code otherwise.
+ */
+int pci_rbar_set_size(struct pci_dev *pdev, int bar, int size)
+{
+	u32 ctrl;
+	int pos;
+
+	pos = pci_rbar_find_pos(pdev, bar);
+	if (pos < 0)
+		return pos;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+	ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE_MASK;
+	ctrl |= size << PCI_REBAR_CTRL_BAR_SIZE_SHIFT;
+	pci_write_config_dword(pdev, pos + PCI_REBAR_CTRL, ctrl);
+	return 0;
+}
+
+/**
  * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
  * @dev: the PCI device
  * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5c475edc78c2..1681895366dc 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -368,4 +368,12 @@ int acpi_get_rc_resources(struct device *dev, const char *hid, u16 segment,
 			  struct resource *res);
 #endif
 
+u32 pci_rbar_get_possible_sizes(struct pci_dev *pdev, int bar);
+int pci_rbar_get_current_size(struct pci_dev *pdev, int bar);
+int pci_rbar_set_size(struct pci_dev *pdev, int bar, int size);
+static inline u64 pci_rbar_size_to_bytes(int size)
+{
+	return 1ULL << (size + 20);
+}
+
 #endif /* DRIVERS_PCI_H */
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index c22d3ebaca20..d6a075de6d43 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -943,9 +943,16 @@
 #define PCI_SATA_SIZEOF_LONG	16
 
 /* Resizable BARs */
+#define PCI_REBAR_CAP		4	/* capability register */
+#define  PCI_REBAR_CTRL_SIZES_MASK	(0xFFFFF << 4)	/* mask for sizes */
+#define  PCI_REBAR_CTRL_SIZES_SHIFT	4	/* shift for sizes */
 #define PCI_REBAR_CTRL		8	/* control register */
-#define  PCI_REBAR_CTRL_NBAR_MASK	(7 << 5)	/* mask for # bars */
-#define  PCI_REBAR_CTRL_NBAR_SHIFT	5	/* shift for # bars */
+#define  PCI_REBAR_CTRL_BAR_IDX_MASK	(7 << 0)	/* mask for BAR index */
+#define  PCI_REBAR_CTRL_BAR_IDX_SHIFT	0	/* shift for BAR index */
+#define  PCI_REBAR_CTRL_NBAR_MASK	(7 << 5)	/* mask for # BARs */
+#define  PCI_REBAR_CTRL_NBAR_SHIFT	5	/* shift for # BARs */
+#define  PCI_REBAR_CTRL_BAR_SIZE_MASK	(0x1F << 8)	/* mask for BAR size */
+#define  PCI_REBAR_CTRL_BAR_SIZE_SHIFT	8	/* shift for BAR size */
 
 /* Dynamic Power Allocation */
 #define PCI_DPA_CAP		4	/* capability register */
-- 
2.11.0


* [PATCH v9 3/5] PCI: add functionality for resizing resources v7
  2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
  2017-10-18 13:58 ` [PATCH v9 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
  2017-10-18 13:58 ` [PATCH v9 2/5] PCI: add resizeable BAR infrastructure v5 Christian König
@ 2017-10-18 13:58 ` Christian König
  2017-10-18 13:58 ` [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5 Christian König
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2017-10-18 13:58 UTC (permalink / raw)
  To: helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

From: Christian König <christian.koenig@amd.com>

This allows device drivers to request resizing their BARs.

The function only tries to reprogram the windows of the bridge directly above
the requesting device, and only the window of the same type (usually mem,
64bit, prefetchable). This is done to make sure we don't disturb other drivers
by changing the BARs of their devices.

Drivers should use the following sequence to resize their BARs (a rough sketch
of the sequence follows the list):
1. Disable memory decoding of your device using the PCI command register.
2. Use pci_release_resource() to release all BARs which can move during the
   resize, including the one you want to resize.
3. Call pci_resize_resource() for each BAR you want to resize.
4. Call pci_assign_unassigned_bus_resources() to reassign new locations
   for all BARs which are not resized, but could move.
5. If everything worked as expected, enable memory decoding in your device
   again using the PCI command register.
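
A rough sketch of that sequence (illustrative only; the BAR numbers, the size
encoding value and the error handling are placeholders, not taken from a real
driver):

    u16 cmd;
    int ret;

    /* 1. Disable memory decoding. */
    pci_read_config_word(pdev, PCI_COMMAND, &cmd);
    pci_write_config_word(pdev, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY);

    /* 2. Release every BAR which may move, including the one to resize. */
    pci_release_resource(pdev, 0);
    pci_release_resource(pdev, 2);

    /* 3. Resize BAR 0; 13 is the spec encoding for 8GB (2^(13+20) bytes). */
    ret = pci_resize_resource(pdev, 0, 13);

    /* 4. Reassign new locations for the released BARs which were not resized. */
    pci_assign_unassigned_bus_resources(pdev->bus);

    /* 5. Re-enable memory decoding only if everything got assigned again. */
    if (!ret && !(pci_resource_flags(pdev, 0) & IORESOURCE_UNSET))
            pci_write_config_word(pdev, PCI_COMMAND, cmd);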

v2: rebase on changes in rbar support
v3: style cleanups, fail if memory decoding is enabled or resources
    still allocated, resize all unused bridge BARs,
    drop calling pci_reenable_device
v4: print resources before releasing them, style cleanups,
    use pci_rbar_size_to_bytes, use PCI_RES_TYPE_MASK
v5: use next pointer to simplify loop
v6: move reassigning resources on error to driver side
v7: Document in the commit message how to use the new function.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/pci/setup-bus.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/setup-res.c | 58 +++++++++++++++++++++++++++++
 include/linux/pci.h     |  3 ++
 3 files changed, 159 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 37450d9e1132..03af25b3eec4 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1908,6 +1908,104 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
+int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
+{
+	struct pci_dev_resource *dev_res;
+	struct pci_dev *next;
+	LIST_HEAD(saved);
+	LIST_HEAD(added);
+	LIST_HEAD(failed);
+	unsigned int i;
+	int ret;
+
+	/* Walk to the root hub, releasing bridge BARs when possible */
+	next = bridge;
+	do {
+		bridge = next;
+		for (i = PCI_BRIDGE_RESOURCES; i < PCI_BRIDGE_RESOURCE_END;
+		     i++) {
+			struct resource *res = &bridge->resource[i];
+
+			if ((res->flags ^ type) & PCI_RES_TYPE_MASK)
+				continue;
+
+			/* Ignore BARs which are still in use */
+			if (res->child)
+				continue;
+
+			ret = add_to_list(&saved, bridge, res, 0, 0);
+			if (ret)
+				goto cleanup;
+
+			dev_info(&bridge->dev, "BAR %d: releasing %pR\n",
+				 i, res);
+
+			if (res->parent)
+				release_resource(res);
+			res->start = 0;
+			res->end = 0;
+			break;
+		}
+		if (i == PCI_BRIDGE_RESOURCE_END)
+			break;
+
+		next = bridge->bus ? bridge->bus->self : NULL;
+	} while (next);
+
+	if (list_empty(&saved))
+		return -ENOENT;
+
+	__pci_bus_size_bridges(bridge->subordinate, &added);
+	__pci_bridge_assign_resources(bridge, &added, &failed);
+	BUG_ON(!list_empty(&added));
+
+	if (!list_empty(&failed)) {
+		ret = -ENOSPC;
+		goto cleanup;
+	}
+
+	list_for_each_entry(dev_res, &saved, list) {
+		/* Skip the bridge we just assigned resources for. */
+		if (bridge == dev_res->dev)
+			continue;
+
+		bridge = dev_res->dev;
+		pci_setup_bridge(bridge->subordinate);
+	}
+
+	free_list(&saved);
+	return 0;
+
+cleanup:
+	/* restore size and flags */
+	list_for_each_entry(dev_res, &failed, list) {
+		struct resource *res = dev_res->res;
+
+		res->start = dev_res->start;
+		res->end = dev_res->end;
+		res->flags = dev_res->flags;
+	}
+	free_list(&failed);
+
+	/* Revert to the old configuration */
+	list_for_each_entry(dev_res, &saved, list) {
+		struct resource *res = dev_res->res;
+
+		bridge = dev_res->dev;
+		i = res - bridge->resource;
+
+		res->start = dev_res->start;
+		res->end = dev_res->end;
+		res->flags = dev_res->flags;
+
+		pci_claim_resource(bridge, i);
+		pci_setup_bridge(bridge->subordinate);
+	}
+	free_list(&saved);
+
+	return ret;
+}
+
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 85774b7a316a..1d558e18270d 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -383,6 +383,64 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz
 	return 0;
 }
 
+void pci_release_resource(struct pci_dev *dev, int resno)
+{
+	struct resource *res = dev->resource + resno;
+
+	dev_info(&dev->dev, "BAR %d: releasing %pR\n", resno, res);
+	release_resource(res);
+	res->end = resource_size(res) - 1;
+	res->start = 0;
+	res->flags |= IORESOURCE_UNSET;
+}
+EXPORT_SYMBOL(pci_release_resource);
+
+int pci_resize_resource(struct pci_dev *dev, int resno, int size)
+{
+	struct resource *res = dev->resource + resno;
+	int old, ret;
+	u32 sizes;
+	u16 cmd;
+
+	/* Make sure the resource isn't assigned before resizing it. */
+	if (!(res->flags & IORESOURCE_UNSET))
+		return -EBUSY;
+
+	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+	if (cmd & PCI_COMMAND_MEMORY)
+		return -EBUSY;
+
+	sizes = pci_rbar_get_possible_sizes(dev, resno);
+	if (!sizes)
+		return -ENOTSUPP;
+
+	if (!(sizes & BIT(size)))
+		return -EINVAL;
+
+	old = pci_rbar_get_current_size(dev, resno);
+	if (old < 0)
+		return old;
+
+	ret = pci_rbar_set_size(dev, resno, size);
+	if (ret)
+		return ret;
+
+	res->end = res->start + pci_rbar_size_to_bytes(size) - 1;
+
+	/* Check if the new config works by trying to assign everything. */
+	ret = pci_reassign_bridge_resources(dev->bus->self, res->flags);
+	if (ret)
+		goto error_resize;
+
+	return 0;
+
+error_resize:
+	pci_rbar_set_size(dev, resno, old);
+	res->end = res->start + pci_rbar_size_to_bytes(old) - 1;
+	return ret;
+}
+EXPORT_SYMBOL(pci_resize_resource);
+
 int pci_enable_resources(struct pci_dev *dev, int mask)
 {
 	u16 cmd, old_cmd;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a75c13673852..c21949a6ea65 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1081,6 +1081,8 @@ void pci_reset_bridge_secondary_bus(struct pci_dev *dev);
 void pci_update_resource(struct pci_dev *dev, int resno);
 int __must_check pci_assign_resource(struct pci_dev *dev, int i);
 int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
+void pci_release_resource(struct pci_dev *dev, int resno);
+int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
 bool pci_device_is_present(struct pci_dev *pdev);
 void pci_ignore_hotplug(struct pci_dev *dev);
@@ -1159,6 +1161,7 @@ void pci_assign_unassigned_resources(void);
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
 void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
+int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type);
 void pdev_enable_device(struct pci_dev *);
 int pci_enable_resources(struct pci_dev *, int mask);
 void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),
-- 
2.11.0


* [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
                   ` (2 preceding siblings ...)
  2017-10-18 13:58 ` [PATCH v9 3/5] PCI: add functionality for resizing resources v7 Christian König
@ 2017-10-18 13:58 ` Christian König
  2017-11-02 16:43   ` Alex Deucher
  2017-11-20 15:51   ` Boris Ostrovsky
  2017-10-18 13:58 ` [PATCH v9 5/5] drm/amdgpu: resize VRAM BAR for CPU access v5 Christian König
  2017-10-24 19:44 ` Resizable PCI BAR support V9 Bjorn Helgaas
  5 siblings, 2 replies; 27+ messages in thread
From: Christian König @ 2017-10-18 13:58 UTC (permalink / raw)
  To: helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

From: Christian König <christian.koenig@amd.com>

Most BIOSes don't enable this for compatibility reasons.

Manually enable a 64bit BAR of 64GB size so that we have
enough room for PCI devices.

v2: style cleanups, increase size, add resource name, set correct flags,
    print message that window was added
v3: add defines for all the magic numbers, style cleanups
v4: add some comment that the BIOS should actually allow this using
    _PRS and _SRS.
v5: only enable this if CONFIG_PHYS_ADDR_T_64BIT is set

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 arch/x86/pci/fixup.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 11e407489db0..7b6bd76713c5 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -618,3 +618,83 @@ static void quirk_apple_mbp_poweroff(struct pci_dev *pdev)
 		dev_info(dev, "can't work around MacBook Pro poweroff issue\n");
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+
+#define AMD_141b_MMIO_BASE(x)	(0x80 + (x) * 0x8)
+#define AMD_141b_MMIO_BASE_RE_MASK		BIT(0)
+#define AMD_141b_MMIO_BASE_WE_MASK		BIT(1)
+#define AMD_141b_MMIO_BASE_MMIOBASE_MASK	GENMASK(31,8)
+
+#define AMD_141b_MMIO_LIMIT(x)	(0x84 + (x) * 0x8)
+#define AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK	GENMASK(31,8)
+
+#define AMD_141b_MMIO_HIGH(x)	(0x180 + (x) * 0x4)
+#define AMD_141b_MMIO_HIGH_MMIOBASE_MASK	GENMASK(7,0)
+#define AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT	16
+#define AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK	GENMASK(23,16)
+
+/*
+ * The PCI Firmware Spec, rev 3.2 notes that ACPI should optionally allow
+ * configuring host bridge windows using the _PRS and _SRS methods.
+ *
+ * But this is rarely implemented, so we manually enable a large 64bit BAR for
+ * PCIe devices on AMD Family 15h (Models 30h-3fh) Processors here.
+ */
+static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
+{
+	struct resource *res, *conflict;
+	u32 base, limit, high;
+	unsigned i;
+
+	for (i = 0; i < 8; ++i) {
+		pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
+		pci_read_config_dword(dev, AMD_141b_MMIO_HIGH(i), &high);
+
+		/* Is this slot free? */
+		if (!(base & (AMD_141b_MMIO_BASE_RE_MASK |
+			      AMD_141b_MMIO_BASE_WE_MASK)))
+			break;
+
+		base >>= 8;
+		base |= high << 24;
+
+		/* Abort if a slot already configures a 64bit BAR. */
+		if (base > 0x10000)
+			return;
+	}
+	if (i == 8)
+		return;
+
+	res = kzalloc(sizeof(*res), GFP_KERNEL);
+	if (!res)
+		return;
+
+	res->name = "PCI Bus 0000:00";
+	res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
+		IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
+	res->start = 0x100000000ull;
+	res->end = 0xfd00000000ull - 1;
+
+	/* Just grab the free area behind system memory for this */
+	while ((conflict = request_resource_conflict(&iomem_resource, res)))
+		res->start = conflict->end + 1;
+
+	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
+
+	base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
+		AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
+	limit = ((res->end + 1) >> 8) & AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK;
+	high = ((res->start >> 40) & AMD_141b_MMIO_HIGH_MMIOBASE_MASK) |
+		((((res->end + 1) >> 40) << AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT)
+		 & AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK);
+
+	pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
+	pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
+	pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
+
+	pci_bus_add_resource(dev->bus, res, 0);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
+
+#endif
-- 
2.11.0


* [PATCH v9 5/5] drm/amdgpu: resize VRAM BAR for CPU access v5
  2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
                   ` (3 preceding siblings ...)
  2017-10-18 13:58 ` [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5 Christian König
@ 2017-10-18 13:58 ` Christian König
  2017-10-24 19:44 ` Resizable PCI BAR support V9 Bjorn Helgaas
  5 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2017-10-18 13:58 UTC (permalink / raw)
  To: helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

From: Christian König <christian.koenig@amd.com>

Try to resize BAR0 to let the CPU access all of VRAM.

v2: rebased, style cleanups, disable mem decode before resize,
    handle gmc_v9 as well, round size up to power of two.
v3: handle gmc_v6 as well, release and reassign all BARs in the driver.
v4: rename new function to amdgpu_device_resize_fb_bar,
    reenable mem decoding only if all resources are assigned.
v5: reorder resource release, return -ENODEV instead of BUG_ON().

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 48 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c      | 12 ++++++--
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c      | 13 ++++++--
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c      | 13 ++++++--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 14 ++++++---
 6 files changed, 88 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7ecfc5303f4f..ac4e6f6fb6d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1850,6 +1850,7 @@ void amdgpu_ttm_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
 bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
 void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64 base);
 void amdgpu_gart_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
+int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev);
 void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
 int amdgpu_ttm_init(struct amdgpu_device *adev);
 void amdgpu_ttm_fini(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 57addfe9e89b..8f2be5a36625 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -414,6 +414,9 @@ static int amdgpu_doorbell_init(struct amdgpu_device *adev)
 		return 0;
 	}
 
+	if (pci_resource_flags(adev->pdev, 2) & IORESOURCE_UNSET)
+		return -EINVAL;
+
 	/* doorbell bar mapping */
 	adev->doorbell.base = pci_resource_start(adev->pdev, 2);
 	adev->doorbell.size = pci_resource_len(adev->pdev, 2);
@@ -732,6 +735,51 @@ int amdgpu_fw_reserve_vram_init(struct amdgpu_device *adev)
 	return r;
 }
 
+/**
+ * amdgpu_device_resize_fb_bar - try to resize FB BAR
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Try to resize FB BAR to make all VRAM CPU accessible. We try very hard not
+ * to fail, but if any of the BARs is not accessible after the resize we abort
+ * driver loading by returning -ENODEV.
+ */
+int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
+{
+	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
+	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) - 1;
+	u16 cmd;
+	int r;
+
+	/* Disable memory decoding while we change the BAR addresses and size */
+	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
+	pci_write_config_word(adev->pdev, PCI_COMMAND,
+			      cmd & ~PCI_COMMAND_MEMORY);
+
+	/* Free the VRAM and doorbell BAR, we most likely need to move both. */
+	amdgpu_doorbell_fini(adev);
+	if (adev->asic_type >= CHIP_BONAIRE)
+		pci_release_resource(adev->pdev, 2);
+
+	pci_release_resource(adev->pdev, 0);
+
+	r = pci_resize_resource(adev->pdev, 0, rbar_size);
+	if (r == -ENOSPC)
+		DRM_INFO("Not enough PCI address space for a large BAR.");
+	else if (r && r != -ENOTSUPP)
+		DRM_ERROR("Problem resizing BAR0 (%d).", r);
+
+	pci_assign_unassigned_bus_resources(adev->pdev->bus);
+
+	/* When the doorbell or fb BAR isn't available we have no chance of
+	 * using the device.
+	 */
+	r = amdgpu_doorbell_init(adev);
+	if (r || (pci_resource_flags(adev->pdev, 0) & IORESOURCE_UNSET))
+		return -ENODEV;
+
+	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
+}
 
 /*
  * GPU helpers function.
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index f4603a7c8ef3..d2a43db22cff 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -283,6 +283,7 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
 
 	u32 tmp;
 	int chansize, numchan;
+	int r;
 
 	tmp = RREG32(mmMC_ARB_RAMCFG);
 	if (tmp & (1 << 11)) {
@@ -324,12 +325,17 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
 		break;
 	}
 	adev->mc.vram_width = numchan * chansize;
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
+
+	if (!(adev->flags & AMD_IS_APU)) {
+		r = amdgpu_device_resize_fb_bar(adev);
+		if (r)
+			return r;
+	}
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	adev->mc.visible_vram_size = adev->mc.aper_size;
 
 	/* set the gart size */
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index b0528ca9207b..583d87792820 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -322,6 +322,8 @@ static void gmc_v7_0_mc_program(struct amdgpu_device *adev)
  */
 static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
 {
+	int r;
+
 	adev->mc.vram_width = amdgpu_atombios_get_vram_width(adev);
 	if (!adev->mc.vram_width) {
 		u32 tmp;
@@ -367,13 +369,18 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
 		}
 		adev->mc.vram_width = numchan * chansize;
 	}
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 
+	if (!(adev->flags & AMD_IS_APU)) {
+		r = amdgpu_device_resize_fb_bar(adev);
+		if (r)
+			return r;
+	}
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
+
 #ifdef CONFIG_X86_64
 	if (adev->flags & AMD_IS_APU) {
 		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index f368cfe2f585..9ca5fea93ebc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -498,6 +498,8 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev)
  */
 static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
 {
+	int r;
+
 	adev->mc.vram_width = amdgpu_atombios_get_vram_width(adev);
 	if (!adev->mc.vram_width) {
 		u32 tmp;
@@ -543,13 +545,18 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
 		}
 		adev->mc.vram_width = numchan * chansize;
 	}
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 
+	if (!(adev->flags & AMD_IS_APU)) {
+		r = amdgpu_device_resize_fb_bar(adev);
+		if (r)
+			return r;
+	}
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
+
 #ifdef CONFIG_X86_64
 	if (adev->flags & AMD_IS_APU) {
 		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 621699331e09..0de4dc068516 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -440,6 +440,7 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
 {
 	u32 tmp;
 	int chansize, numchan;
+	int r;
 
 	adev->mc.vram_width = amdgpu_atomfirmware_get_vram_width(adev);
 	if (!adev->mc.vram_width) {
@@ -482,17 +483,22 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
 		adev->mc.vram_width = numchan * chansize;
 	}
 
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size =
 		((adev->flags & AMD_IS_APU) ? nbio_v7_0_get_memsize(adev) :
 		 nbio_v6_1_get_memsize(adev)) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = adev->mc.mc_vram_size;
-	adev->mc.visible_vram_size = adev->mc.aper_size;
+
+	if (!(adev->flags & AMD_IS_APU)) {
+		r = amdgpu_device_resize_fb_bar(adev);
+		if (r)
+			return r;
+	}
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 
 	/* In case the PCI BAR is larger than the actual amount of vram */
+	adev->mc.visible_vram_size = adev->mc.aper_size;
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-- 
2.11.0


* Re: Resizable PCI BAR support V9
  2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
                   ` (4 preceding siblings ...)
  2017-10-18 13:58 ` [PATCH v9 5/5] drm/amdgpu: resize VRAM BAR for CPU access v5 Christian König
@ 2017-10-24 19:44 ` Bjorn Helgaas
  2017-10-25 11:27   ` Christian König
  5 siblings, 1 reply; 27+ messages in thread
From: Bjorn Helgaas @ 2017-10-24 19:44 UTC (permalink / raw)
  To: Christian König; +Cc: linux-pci, dri-devel, linux-kernel, amd-gfx

On Wed, Oct 18, 2017 at 03:58:16PM +0200, Christian König wrote:
> Hi everyone,
> 
> This is the ninth and hopefully last incarnation of this set of patches. It
> enables device drivers to resize and most likely also relocate the PCI BAR of
> devices they manage to allow the CPU to access all of the device local memory
> at once.
> 
> This is very useful for GFX device drivers where the default PCI BAR is only
> about 256MB in size for compatibility reasons, but the device easily have
> multiple gigabyte of local memory.
> 
> Some changes since the last version:
> 1. Rebased on drm-next, so should be ready to be merged for 4.15.
> 2. The fixup of the 64bit root window on AMD Family 15h CPUs/APUs is only
>    enabled when we compile a kernel supporting that hw.
> 3. Some minor error handling improvements for the amdgpu side. We now
>    gracefully abort driver loading in case of a critical error instead of
>    calling BUG().
> 
> Bjorn any more comments or can we finally get this into 4.15? I will remove the
> version tags from the patches when I send you a pull request if you want this.
> 
> I only work on this as a background task, so sorry for the ~3 month delay
> between each version of the patchset.

I put these (except for the last) on pci/resource for v4.15.  I'd be
happy to merge the last as well if you like.  My tree does not include
drm-next, so there would be a minor conflict when merging upstream.

Bjorn


* Re: Resizable PCI BAR support V9
  2017-10-24 19:44 ` Resizable PCI BAR support V9 Bjorn Helgaas
@ 2017-10-25 11:27   ` Christian König
  0 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2017-10-25 11:27 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, dri-devel, linux-kernel, amd-gfx

Am 24.10.2017 um 21:44 schrieb Bjorn Helgaas:
> On Wed, Oct 18, 2017 at 03:58:16PM +0200, Christian König wrote:
>> Hi everyone,
>>
>> This is the ninth and hopefully last incarnation of this set of patches. It
>> enables device drivers to resize and most likely also relocate the PCI BAR of
>> devices they manage to allow the CPU to access all of the device local memory
>> at once.
>>
>> This is very useful for GFX device drivers where the default PCI BAR is only
>> about 256MB in size for compatibility reasons, but the device easily have
>> multiple gigabyte of local memory.
>>
>> Some changes since the last version:
>> 1. Rebased on drm-next, so should be ready to be merged for 4.15.
>> 2. The fixup of the 64bit root window on AMD Family 15h CPUs/APUs is only
>>     enabled when we compile a kernel supporting that hw.
>> 3. Some minor error handling improvements for the amdgpu side. We now
>>     gracefully abort driver loading in case of a critical error instead of
>>     calling BUG().
>>
>> Bjorn any more comments or can we finally get this into 4.15? I will remove the
>> version tags from the patches when I send you a pull request if you want this.
>>
>> I only work on this as a background task, so sorry for the ~3 month delay
>> between each version of the patchset.
> I put these (except for the last) on pci/resource for v4.15.  I'd be
> happy to merge the last as well if you like.  My tree does not include
> drm-next, so there would be a minor conflict when merging upstream.

Thanks Bjorn, that is great to hear.

I will sync up with Alex and Dave how to merge the last patch.

Christian.

>
> Bjorn


* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-10-18 13:58 ` [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5 Christian König
@ 2017-11-02 16:43   ` Alex Deucher
  2017-11-20 15:51   ` Boris Ostrovsky
  1 sibling, 0 replies; 27+ messages in thread
From: Alex Deucher @ 2017-11-02 16:43 UTC (permalink / raw)
  To: Christian König
  Cc: Bjorn Helgaas, Linux PCI, Maling list - DRI developers, LKML,
	amd-gfx list

On Wed, Oct 18, 2017 at 9:58 AM, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> From: Christian König <christian.koenig@amd.com>
>
> Most BIOS don't enable this because of compatibility reasons.
>
> Manually enable a 64bit BAR of 64GB size so that we have
> enough room for PCI devices.
>
> v2: style cleanups, increase size, add resource name, set correct flags,
>     print message that windows was added
> v3: add defines for all the magic numbers, style cleanups
> v4: add some comment that the BIOS should actually allow this using
>     _PRS and _SRS.
> v5: only enable this if CONFIG_PHYS_ADDR_T_64BIT is set
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> ---
>  arch/x86/pci/fixup.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
>
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index 11e407489db0..7b6bd76713c5 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -618,3 +618,83 @@ static void quirk_apple_mbp_poweroff(struct pci_dev *pdev)
>                 dev_info(dev, "can't work around MacBook Pro poweroff issue\n");
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +
> +#define AMD_141b_MMIO_BASE(x)  (0x80 + (x) * 0x8)
> +#define AMD_141b_MMIO_BASE_RE_MASK             BIT(0)
> +#define AMD_141b_MMIO_BASE_WE_MASK             BIT(1)
> +#define AMD_141b_MMIO_BASE_MMIOBASE_MASK       GENMASK(31,8)
> +
> +#define AMD_141b_MMIO_LIMIT(x) (0x84 + (x) * 0x8)
> +#define AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK     GENMASK(31,8)
> +
> +#define AMD_141b_MMIO_HIGH(x)  (0x180 + (x) * 0x4)
> +#define AMD_141b_MMIO_HIGH_MMIOBASE_MASK       GENMASK(7,0)
> +#define AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT     16
> +#define AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK      GENMASK(23,16)
> +
> +/*
> + * The PCI Firmware Spec, rev 3.2 notes that ACPI should optionally allow
> + * configuring host bridge windows using the _PRS and _SRS methods.
> + *
> + * But this is rarely implemented, so we manually enable a large 64bit BAR for
> + * PCIe device on AMD Family 15h (Models 30h-3fh) Processors here.
> + */
> +static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
> +{
> +       struct resource *res, *conflict;
> +       u32 base, limit, high;
> +       unsigned i;
> +
> +       for (i = 0; i < 8; ++i) {
> +               pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
> +               pci_read_config_dword(dev, AMD_141b_MMIO_HIGH(i), &high);
> +
> +               /* Is this slot free? */
> +               if (!(base & (AMD_141b_MMIO_BASE_RE_MASK |
> +                             AMD_141b_MMIO_BASE_WE_MASK)))
> +                       break;
> +
> +               base >>= 8;
> +               base |= high << 24;
> +
> +               /* Abort if a slot already configures a 64bit BAR. */
> +               if (base > 0x10000)
> +                       return;
> +       }
> +       if (i == 8)
> +               return;
> +
> +       res = kzalloc(sizeof(*res), GFP_KERNEL);
> +       if (!res)
> +               return;
> +
> +       res->name = "PCI Bus 0000:00";
> +       res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
> +               IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
> +       res->start = 0x100000000ull;
> +       res->end = 0xfd00000000ull - 1;
> +
> +       /* Just grab the free area behind system memory for this */
> +       while ((conflict = request_resource_conflict(&iomem_resource, res)))
> +               res->start = conflict->end + 1;
> +
> +       dev_info(&dev->dev, "adding root bus resource %pR\n", res);
> +
> +       base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
> +               AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
> +       limit = ((res->end + 1) >> 8) & AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK;
> +       high = ((res->start >> 40) & AMD_141b_MMIO_HIGH_MMIOBASE_MASK) |
> +               ((((res->end + 1) >> 40) << AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT)
> +                & AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK);
> +
> +       pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
> +       pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
> +       pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
> +
> +       pci_bus_add_resource(dev->bus, res, 0);
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
> +

We may want to expand this to cover more host bridges.  E.g., on my KV
system I have these:
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Root Complex [1022:1422]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1424]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1424]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1424]
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Function 0 [1022:141a]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Function 1 [1022:141b]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Function 2 [1022:141c]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Function 3 [1022:141d]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Function 4 [1022:141e]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
15h (Models 30h-3fh) Processor Function 5 [1022:141f]

And on my CZ system:
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1576]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:157b]
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:157b]
00:09.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:157d]
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1570]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1571]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1572]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1573]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1574]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device
[1022:1575]

Do you know if Zen based systems use the same registers?  They have
yet another set of PCI IDs for host bridges.

Alex


> +#endif
> --
> 2.11.0
>


* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-10-18 13:58 ` [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5 Christian König
  2017-11-02 16:43   ` Alex Deucher
@ 2017-11-20 15:51   ` Boris Ostrovsky
  2017-11-20 16:07     ` Christian König
  1 sibling, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-20 15:51 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx

On 10/18/2017 09:58 AM, Christian König wrote:
> From: Christian König <christian.koenig@amd.com>
>
> Most BIOS don't enable this because of compatibility reasons.
>
> Manually enable a 64bit BAR of 64GB size so that we have
> enough room for PCI devices.
>
> v2: style cleanups, increase size, add resource name, set correct flags,
>     print message that windows was added
> v3: add defines for all the magic numbers, style cleanups
> v4: add some comment that the BIOS should actually allow this using
>     _PRS and _SRS.
> v5: only enable this if CONFIG_PHYS_ADDR_T_64BIT is set
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> ---
>  arch/x86/pci/fixup.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
>
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index 11e407489db0..7b6bd76713c5 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -618,3 +618,83 @@ static void quirk_apple_mbp_poweroff(struct pci_dev *pdev)
>  		dev_info(dev, "can't work around MacBook Pro poweroff issue\n");
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +
> +#define AMD_141b_MMIO_BASE(x)	(0x80 + (x) * 0x8)
> +#define AMD_141b_MMIO_BASE_RE_MASK		BIT(0)
> +#define AMD_141b_MMIO_BASE_WE_MASK		BIT(1)
> +#define AMD_141b_MMIO_BASE_MMIOBASE_MASK	GENMASK(31,8)
> +
> +#define AMD_141b_MMIO_LIMIT(x)	(0x84 + (x) * 0x8)
> +#define AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK	GENMASK(31,8)
> +
> +#define AMD_141b_MMIO_HIGH(x)	(0x180 + (x) * 0x4)
> +#define AMD_141b_MMIO_HIGH_MMIOBASE_MASK	GENMASK(7,0)
> +#define AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT	16
> +#define AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK	GENMASK(23,16)
> +
> +/*
> + * The PCI Firmware Spec, rev 3.2 notes that ACPI should optionally allow
> + * configuring host bridge windows using the _PRS and _SRS methods.
> + *
> + * But this is rarely implemented, so we manually enable a large 64bit BAR for
> + * PCIe device on AMD Family 15h (Models 30h-3fh) Processors here.
> + */
> +static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
> +{
> +	struct resource *res, *conflict;
> +	u32 base, limit, high;
> +	unsigned i;
> +
> +	for (i = 0; i < 8; ++i) {
> +		pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
> +		pci_read_config_dword(dev, AMD_141b_MMIO_HIGH(i), &high);
> +
> +		/* Is this slot free? */
> +		if (!(base & (AMD_141b_MMIO_BASE_RE_MASK |
> +			      AMD_141b_MMIO_BASE_WE_MASK)))
> +			break;
> +
> +		base >>= 8;
> +		base |= high << 24;
> +
> +		/* Abort if a slot already configures a 64bit BAR. */
> +		if (base > 0x10000)
> +			return;
> +	}
> +	if (i == 8)
> +		return;
> +
> +	res = kzalloc(sizeof(*res), GFP_KERNEL);
> +	if (!res)
> +		return;
> +
> +	res->name = "PCI Bus 0000:00";
> +	res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
> +		IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
> +	res->start = 0x100000000ull;
> +	res->end = 0xfd00000000ull - 1;
> +
> +	/* Just grab the free area behind system memory for this */
> +	while ((conflict = request_resource_conflict(&iomem_resource, res)))
> +		res->start = conflict->end + 1;


I get stuck in the infinite loop here.

Presumably because on a multi-socket system we succeed for the first
processor (0000:00:18.1) and add 'res' to iomem_resource. For
0000:00:19.1 we find the slot in the 'for' loop above but then we fail
to find a place to add 'res'. And with final sibling being [0 - max
possible addr] we are stuck here.

A possible solution to get out of the loop could be

    if (conflict->end >= res->end) {
            kfree(res);
            return;
    }


but I don't know whether this is what we actually want.

This is a 2-socket system:

vendor_id    : AuthenticAMD
cpu family    : 21
model        : 1
model name    : AMD Opteron(TM) Processor 6272
stepping    : 2


(and then it breaks differently as a Xen guest --- we hung on the last
pci_read_config_dword(), I haven't looked at this at all yet)



-boris


> +
> +	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
> +
> +	base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
> +		AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
> +	limit = ((res->end + 1) >> 8) & AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK;
> +	high = ((res->start >> 40) & AMD_141b_MMIO_HIGH_MMIOBASE_MASK) |
> +		((((res->end + 1) >> 40) << AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT)
> +		 & AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK);
> +
> +	pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
> +	pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
> +	pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
> +
> +	pci_bus_add_resource(dev->bus, res, 0);
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
> +
> +#endif


* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-20 15:51   ` Boris Ostrovsky
@ 2017-11-20 16:07     ` Christian König
  2017-11-20 16:33       ` Boris Ostrovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2017-11-20 16:07 UTC (permalink / raw)
  To: Boris Ostrovsky, helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
> On 10/18/2017 09:58 AM, Christian König wrote:
>> From: Christian König <christian.koenig@amd.com>
>>
>> Most BIOS don't enable this because of compatibility reasons.
>>
>> Manually enable a 64bit BAR of 64GB size so that we have
>> enough room for PCI devices.
>>
>> v2: style cleanups, increase size, add resource name, set correct flags,
>>      print message that windows was added
>> v3: add defines for all the magic numbers, style cleanups
>> v4: add some comment that the BIOS should actually allow this using
>>      _PRS and _SRS.
>> v5: only enable this if CONFIG_PHYS_ADDR_T_64BIT is set
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
>> ---
>>   arch/x86/pci/fixup.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 80 insertions(+)
>>
>> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
>> index 11e407489db0..7b6bd76713c5 100644
>> --- a/arch/x86/pci/fixup.c
>> +++ b/arch/x86/pci/fixup.c
>> @@ -618,3 +618,83 @@ static void quirk_apple_mbp_poweroff(struct pci_dev *pdev)
>>   		dev_info(dev, "can't work around MacBook Pro poweroff issue\n");
>>   }
>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
>> +
>> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
>> +
>> +#define AMD_141b_MMIO_BASE(x)	(0x80 + (x) * 0x8)
>> +#define AMD_141b_MMIO_BASE_RE_MASK		BIT(0)
>> +#define AMD_141b_MMIO_BASE_WE_MASK		BIT(1)
>> +#define AMD_141b_MMIO_BASE_MMIOBASE_MASK	GENMASK(31,8)
>> +
>> +#define AMD_141b_MMIO_LIMIT(x)	(0x84 + (x) * 0x8)
>> +#define AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK	GENMASK(31,8)
>> +
>> +#define AMD_141b_MMIO_HIGH(x)	(0x180 + (x) * 0x4)
>> +#define AMD_141b_MMIO_HIGH_MMIOBASE_MASK	GENMASK(7,0)
>> +#define AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT	16
>> +#define AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK	GENMASK(23,16)
>> +
>> +/*
>> + * The PCI Firmware Spec, rev 3.2 notes that ACPI should optionally allow
>> + * configuring host bridge windows using the _PRS and _SRS methods.
>> + *
>> + * But this is rarely implemented, so we manually enable a large 64bit BAR for
>> + * PCIe device on AMD Family 15h (Models 30h-3fh) Processors here.
>> + */
>> +static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
>> +{
>> +	struct resource *res, *conflict;
>> +	u32 base, limit, high;
>> +	unsigned i;
>> +
>> +	for (i = 0; i < 8; ++i) {
>> +		pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
>> +		pci_read_config_dword(dev, AMD_141b_MMIO_HIGH(i), &high);
>> +
>> +		/* Is this slot free? */
>> +		if (!(base & (AMD_141b_MMIO_BASE_RE_MASK |
>> +			      AMD_141b_MMIO_BASE_WE_MASK)))
>> +			break;
>> +
>> +		base >>= 8;
>> +		base |= high << 24;
>> +
>> +		/* Abort if a slot already configures a 64bit BAR. */
>> +		if (base > 0x10000)
>> +			return;
>> +	}
>> +	if (i == 8)
>> +		return;
>> +
>> +	res = kzalloc(sizeof(*res), GFP_KERNEL);
>> +	if (!res)
>> +		return;
>> +
>> +	res->name = "PCI Bus 0000:00";
>> +	res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
>> +		IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
>> +	res->start = 0x100000000ull;
>> +	res->end = 0xfd00000000ull - 1;
>> +
>> +	/* Just grab the free area behind system memory for this */
>> +	while ((conflict = request_resource_conflict(&iomem_resource, res)))
>> +		res->start = conflict->end + 1;
>
> I get stuck in the infinite loop here.
>
> Presumably because on a multi-socket system we succeed for the first
> processor (0000:00:18.1) and add 'res' to iomem_resource. For
> 0000:00:19.1 we find the slot in the 'for' loop above but then we fail
> to find a place to add 'res'. And with final sibling being [0 - max
> possible addr] we are stuck here.
>
> A possible solution to get out of the loop could be
>      if (conflict->end >= res->end) {
>                              kfree(res);
>                              return;
>     }

Ah, sorry for that. Yes problem is obvious now.

> but I don't know whether this is what we actually want.

Actually we would probably want to add the range to all cores at the 
same time.
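
A rough sketch of that idea (illustrative only, not a tested fix; looking the
devices up by ID and reusing the same slot index on every node are assumptions
of this sketch):

    struct pci_dev *nb = NULL;

    /* Program the same window into function 1 of every northbridge so
     * that all nodes route the new range consistently.
     */
    while ((nb = pci_get_device(PCI_VENDOR_ID_AMD, 0x141b, nb))) {
            pci_write_config_dword(nb, AMD_141b_MMIO_HIGH(i), high);
            pci_write_config_dword(nb, AMD_141b_MMIO_LIMIT(i), limit);
            pci_write_config_dword(nb, AMD_141b_MMIO_BASE(i), base);
    }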

>
> This is a 2-socket
>
> vendor_id    : AuthenticAMD
> cpu family    : 21
> model        : 1
> model name    : AMD Opteron(TM) Processor 6272
> stepping    : 2
>
>
> (and then it breaks differently as a Xen guest --- we hung on the last
> pci_read_config_dword(), I haven't looked at this at all yet)

Huh? How does this fix apply to a Xen guest in the first place?

Please provide the output of "lspci -nn" and explain further what your
config with Xen is.

Regards,
Christian.


>
>
>
> -boris
>
>
>> +
>> +	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
>> +
>> +	base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
>> +		AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
>> +	limit = ((res->end + 1) >> 8) & AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK;
>> +	high = ((res->start >> 40) & AMD_141b_MMIO_HIGH_MMIOBASE_MASK) |
>> +		((((res->end + 1) >> 40) << AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT)
>> +		 & AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK);
>> +
>> +	pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
>> +	pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
>> +	pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
>> +
>> +	pci_bus_add_resource(dev->bus, res, 0);
>> +}
>> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
>> +
>> +#endif

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-20 16:07     ` Christian König
@ 2017-11-20 16:33       ` Boris Ostrovsky
  2017-11-21 13:34         ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-20 16:33 UTC (permalink / raw)
  To: christian.koenig, helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

On 11/20/2017 11:07 AM, Christian König wrote:
> Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
>>
>> (and then it breaks differently as a Xen guest --- we hung on the last
>> pci_read_config_dword(), I haven't looked at this at all yet)
>
> Hui? How does this fix applies to a Xen guest in the first place?
>
> Please provide the output of "lspci -nn" and explain further what is
> your config with Xen.
>
>


This is dom0.

-bash-4.1# lspci -nn
00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge only
dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
[1002:5a23]
00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI bridge
(external gfx1 port B) [1002:5a1e]
00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
Controller [AHCI mode] [1002:4391]
00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
OHCI0 Controller [1002:4397]
00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
Controller [1002:4398]
00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
Controller [1002:4396]
00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
OHCI0 Controller [1002:4397]
00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
Controller [1002:4398]
00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
Controller [1002:4396]
00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
[1002:4385] (rev 3d)
00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
controller [1002:439d]
00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI Bridge
[1002:4384]
00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
OHCI2 Controller [1002:4399]
00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1600]
00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1601]
00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1602]
00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1603]
00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1604]
00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1605]
00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1600]
00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1601]
00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1602]
00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1603]
00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1604]
00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1605]
01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
G200eW WPCM450 [102b:0532] (rev 0a)
02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
Network Connection [8086:10c9] (rev 01)
02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
Network Connection [8086:10c9] (rev 01)
-bash-4.1#


-boris

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-20 16:33       ` Boris Ostrovsky
@ 2017-11-21 13:34         ` Christian König
  2017-11-21 22:26           ` Boris Ostrovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2017-11-21 13:34 UTC (permalink / raw)
  To: Boris Ostrovsky, helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3647 bytes --]

Hi Boris,

attached are two patches.

The first one is a trivial fix for the infinite loop issue, it now 
correctly aborts the fixup when it can't find address space for the root 
window.

The second is a workaround for your board. It simply checks if there is 
exactly one Processor Function to apply this fix on.

Both are based on Linus' current master branch. Please test if they fix 
your issue.

Thanks for the help,
Christian.

Am 20.11.2017 um 17:33 schrieb Boris Ostrovsky:
> On 11/20/2017 11:07 AM, Christian König wrote:
>> Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
>>> (and then it breaks differently as a Xen guest --- we hung on the last
>>> pci_read_config_dword(), I haven't looked at this at all yet)
>> Hui? How does this fix applies to a Xen guest in the first place?
>>
>> Please provide the output of "lspci -nn" and explain further what is
>> your config with Xen.
>>
>>
>
> This is dom0.
>
> -bash-4.1# lspci -nn
> 00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge only
> dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
> 00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
> [1002:5a23]
> 00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI bridge
> (external gfx1 port B) [1002:5a1e]
> 00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
> Controller [AHCI mode] [1002:4391]
> 00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
> OHCI0 Controller [1002:4397]
> 00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
> Controller [1002:4398]
> 00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
> Controller [1002:4396]
> 00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
> OHCI0 Controller [1002:4397]
> 00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
> Controller [1002:4398]
> 00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
> Controller [1002:4396]
> 00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
> [1002:4385] (rev 3d)
> 00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
> controller [1002:439d]
> 00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI Bridge
> [1002:4384]
> 00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
> OHCI2 Controller [1002:4399]
> 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1600]
> 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1601]
> 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1602]
> 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1603]
> 00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1604]
> 00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1605]
> 00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1600]
> 00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1601]
> 00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1602]
> 00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1603]
> 00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1604]
> 00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device [1022:1605]
> 01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
> G200eW WPCM450 [102b:0532] (rev 0a)
> 02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
> Network Connection [8086:10c9] (rev 01)
> 02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
> Network Connection [8086:10c9] (rev 01)
> -bash-4.1#
>
>
> -boris



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-x86-PCI-fix-infinity-loop-in-search-for-64bit-BAR-pl.patch --]
[-- Type: text/x-patch; name="0001-x86-PCI-fix-infinity-loop-in-search-for-64bit-BAR-pl.patch", Size: 1225 bytes --]

From 9b59f5919b31f1a869ef634481331ef325a992a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Tue, 21 Nov 2017 11:20:00 +0100
Subject: [PATCH 1/2] x86/PCI: fix infinite loop in search for 64bit BAR
 placement
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Break the loop if we can't find some address space for a 64bit BAR.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 arch/x86/pci/fixup.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 1e996df687a3..5328e86f73eb 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -696,8 +696,13 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 	res->end = 0xfd00000000ull - 1;
 
 	/* Just grab the free area behind system memory for this */
-	while ((conflict = request_resource_conflict(&iomem_resource, res)))
+	while ((conflict = request_resource_conflict(&iomem_resource, res))) {
+		if (conflict->end >= res->end) {
+			kfree(res);
+			return;
+		}
 		res->start = conflict->end + 1;
+	}
 
 	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
 
-- 
2.11.0


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-x86-PCI-only-enable-a-64bit-BAR-on-single-socket-AMD.patch --]
[-- Type: text/x-patch; name="0002-x86-PCI-only-enable-a-64bit-BAR-on-single-socket-AMD.patch", Size: 2369 bytes --]

From 2dc4461ba8ec1eb54a49e1e166de9a554556e572 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Tue, 21 Nov 2017 11:08:33 +0100
Subject: [PATCH 2/2] x86/PCI: only enable a 64bit BAR on single socket AMD
 Family 15h systems
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When we have a multi-socket system each CPU needs the same setup. Since
this is tricky to do in the fixup code, do not enable a 64bit BAR on
multi-socket systems for now.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 arch/x86/pci/fixup.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 5328e86f73eb..8f86060f5cf6 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -665,6 +665,16 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 	unsigned i;
 	u32 base, limit, high;
 	struct resource *res, *conflict;
+	struct pci_dev *other;
+
+	/* Check that we are the only device of that type */
+	other = pci_get_device(dev->vendor, dev->device, NULL);
+	if (other != dev ||
+	    (other = pci_get_device(dev->vendor, dev->device, other))) {
+		/* This is a multi socket system, don't touch it for now */
+		pci_dev_put(other);
+		return;
+	}
 
 	for (i = 0; i < 8; i++) {
 		pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
@@ -719,10 +729,10 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 
 	pci_bus_add_resource(dev->bus, res, 0);
 }
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1401, pci_amd_enable_64bit_bar);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1571, pci_amd_enable_64bit_bar);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x15b1, pci_amd_enable_64bit_bar);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x1601, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1401, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1571, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15b1, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1601, pci_amd_enable_64bit_bar);
 
 #endif
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-21 13:34         ` Christian König
@ 2017-11-21 22:26           ` Boris Ostrovsky
  2017-11-22 10:09             ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-21 22:26 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx

On 11/21/2017 08:34 AM, Christian König wrote:
> Hi Boris,
>
> attached are two patches.
>
> The first one is a trivial fix for the infinite loop issue, it now
> correctly aborts the fixup when it can't find address space for the
> root window.
>
> The second is a workaround for your board. It simply checks if there
> is exactly one Processor Function to apply this fix on.
>
> Both are based on linus current master branch. Please test if they fix
> your issue.


Yes, they do fix it but that's because the feature is disabled.

Do you know what the actual problem was (on Xen)?

Thanks.
-boris

>
> Thanks for the help,
> Christian.
>
> Am 20.11.2017 um 17:33 schrieb Boris Ostrovsky:
>> On 11/20/2017 11:07 AM, Christian König wrote:
>>> Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
>>>> (and then it breaks differently as a Xen guest --- we hung on the last
>>>> pci_read_config_dword(), I haven't looked at this at all yet)
>>> Hui? How does this fix applies to a Xen guest in the first place?
>>>
>>> Please provide the output of "lspci -nn" and explain further what is
>>> your config with Xen.
>>>
>>>
>>
>> This is dom0.
>>
>> -bash-4.1# lspci -nn
>> 00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge only
>> dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
>> 00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
>> [1002:5a23]
>> 00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI bridge
>> (external gfx1 port B) [1002:5a1e]
>> 00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
>> Controller [AHCI mode] [1002:4391]
>> 00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>> OHCI0 Controller [1002:4397]
>> 00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>> Controller [1002:4398]
>> 00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
>> Controller [1002:4396]
>> 00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>> OHCI0 Controller [1002:4397]
>> 00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>> Controller [1002:4398]
>> 00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
>> Controller [1002:4396]
>> 00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
>> [1002:4385] (rev 3d)
>> 00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
>> controller [1002:439d]
>> 00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI Bridge
>> [1002:4384]
>> 00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>> OHCI2 Controller [1002:4399]
>> 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1600]
>> 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1601]
>> 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1602]
>> 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1603]
>> 00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1604]
>> 00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1605]
>> 00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1600]
>> 00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1601]
>> 00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1602]
>> 00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1603]
>> 00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1604]
>> 00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>> [1022:1605]
>> 01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
>> G200eW WPCM450 [102b:0532] (rev 0a)
>> 02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>> Network Connection [8086:10c9] (rev 01)
>> 02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>> Network Connection [8086:10c9] (rev 01)
>> -bash-4.1#
>>
>>
>> -boris
>
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-21 22:26           ` Boris Ostrovsky
@ 2017-11-22 10:09             ` Christian König
  2017-11-22 16:24               ` Boris Ostrovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2017-11-22 10:09 UTC (permalink / raw)
  To: Boris Ostrovsky, helgaas, linux-pci, dri-devel, linux-kernel, amd-gfx

Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
> On 11/21/2017 08:34 AM, Christian König wrote:
>> Hi Boris,
>>
>> attached are two patches.
>>
>> The first one is a trivial fix for the infinite loop issue, it now
>> correctly aborts the fixup when it can't find address space for the
>> root window.
>>
>> The second is a workaround for your board. It simply checks if there
>> is exactly one Processor Function to apply this fix on.
>>
>> Both are based on linus current master branch. Please test if they fix
>> your issue.
>
> Yes, they do fix it but that's because the feature is disabled.
>
> Do you know what the actual problem was (on Xen)?

I still haven't understood what you actually did with Xen.

If you used PCI pass-through with those devices then you have made a 
major configuration error.

If the problem happened on dom0 then the explanation is most likely 
that some PCI device ended up in the configured space, but the routing 
was only set up correctly on one CPU socket.

Regards,
Christian.

>
> Thanks.
> -boris
>
>> Thanks for the help,
>> Christian.
>>
>> Am 20.11.2017 um 17:33 schrieb Boris Ostrovsky:
>>> On 11/20/2017 11:07 AM, Christian König wrote:
>>>> Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
>>>>> (and then it breaks differently as a Xen guest --- we hung on the last
>>>>> pci_read_config_dword(), I haven't looked at this at all yet)
>>>> Hui? How does this fix applies to a Xen guest in the first place?
>>>>
>>>> Please provide the output of "lspci -nn" and explain further what is
>>>> your config with Xen.
>>>>
>>>>
>>> This is dom0.
>>>
>>> -bash-4.1# lspci -nn
>>> 00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge only
>>> dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
>>> 00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
>>> [1002:5a23]
>>> 00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI bridge
>>> (external gfx1 port B) [1002:5a1e]
>>> 00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
>>> Controller [AHCI mode] [1002:4391]
>>> 00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>> OHCI0 Controller [1002:4397]
>>> 00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>>> Controller [1002:4398]
>>> 00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
>>> Controller [1002:4396]
>>> 00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>> OHCI0 Controller [1002:4397]
>>> 00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>>> Controller [1002:4398]
>>> 00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI
>>> Controller [1002:4396]
>>> 00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
>>> [1002:4385] (rev 3d)
>>> 00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
>>> controller [1002:439d]
>>> 00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI Bridge
>>> [1002:4384]
>>> 00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>> OHCI2 Controller [1002:4399]
>>> 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1600]
>>> 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1601]
>>> 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1602]
>>> 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1603]
>>> 00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1604]
>>> 00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1605]
>>> 00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1600]
>>> 00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1601]
>>> 00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1602]
>>> 00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1603]
>>> 00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1604]
>>> 00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>> [1022:1605]
>>> 01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
>>> G200eW WPCM450 [102b:0532] (rev 0a)
>>> 02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>>> Network Connection [8086:10c9] (rev 01)
>>> 02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>>> Network Connection [8086:10c9] (rev 01)
>>> -bash-4.1#
>>>
>>>
>>> -boris
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-22 10:09             ` Christian König
@ 2017-11-22 16:24               ` Boris Ostrovsky
  2017-11-22 16:54                 ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-22 16:24 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx, xen-devel

On 11/22/2017 05:09 AM, Christian König wrote:
> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>> On 11/21/2017 08:34 AM, Christian König wrote:
>>> Hi Boris,
>>>
>>> attached are two patches.
>>>
>>> The first one is a trivial fix for the infinite loop issue, it now
>>> correctly aborts the fixup when it can't find address space for the
>>> root window.
>>>
>>> The second is a workaround for your board. It simply checks if there
>>> is exactly one Processor Function to apply this fix on.
>>>
>>> Both are based on linus current master branch. Please test if they fix
>>> your issue.
>>
>> Yes, they do fix it but that's because the feature is disabled.
>>
>> Do you know what the actual problem was (on Xen)?
>
> I still haven't understood what you actually did with Xen.
>
> When you used PCI pass through with those devices then you have made a
> major configuration error.
>
> When the problem happened on dom0 then the explanation is most likely
> that some PCI device ended up in the configured space, but the routing
> was only setup correctly on one CPU socket.

The problem is that dom0 can be (and was in my case) booted with less
than full physical memory and so the "rest" of the host memory is not
necessarily reflected in iomem. Your patch then tried to configure that
memory for MMIO and the system hung.

And so my guess is that this patch will break dom0 on a single-socket
system as well.

-boris

>
> Regards,
> Christian.
>
>>
>> Thanks.
>> -boris
>>
>>> Thanks for the help,
>>> Christian.
>>>
>>> Am 20.11.2017 um 17:33 schrieb Boris Ostrovsky:
>>>> On 11/20/2017 11:07 AM, Christian König wrote:
>>>>> Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
>>>>>> (and then it breaks differently as a Xen guest --- we hung on the
>>>>>> last
>>>>>> pci_read_config_dword(), I haven't looked at this at all yet)
>>>>> Hui? How does this fix applies to a Xen guest in the first place?
>>>>>
>>>>> Please provide the output of "lspci -nn" and explain further what is
>>>>> your config with Xen.
>>>>>
>>>>>
>>>> This is dom0.
>>>>
>>>> -bash-4.1# lspci -nn
>>>> 00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge
>>>> only
>>>> dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
>>>> 00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
>>>> [1002:5a23]
>>>> 00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI
>>>> bridge
>>>> (external gfx1 port B) [1002:5a1e]
>>>> 00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
>>>> Controller [AHCI mode] [1002:4391]
>>>> 00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>> OHCI0 Controller [1002:4397]
>>>> 00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>>>> Controller [1002:4398]
>>>> 00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>> EHCI
>>>> Controller [1002:4396]
>>>> 00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>> OHCI0 Controller [1002:4397]
>>>> 00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>>>> Controller [1002:4398]
>>>> 00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>> EHCI
>>>> Controller [1002:4396]
>>>> 00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
>>>> [1002:4385] (rev 3d)
>>>> 00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
>>>> controller [1002:439d]
>>>> 00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI
>>>> Bridge
>>>> [1002:4384]
>>>> 00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>> OHCI2 Controller [1002:4399]
>>>> 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1600]
>>>> 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1601]
>>>> 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1602]
>>>> 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1603]
>>>> 00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1604]
>>>> 00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1605]
>>>> 00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1600]
>>>> 00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1601]
>>>> 00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1602]
>>>> 00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1603]
>>>> 00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1604]
>>>> 00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>> [1022:1605]
>>>> 01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
>>>> G200eW WPCM450 [102b:0532] (rev 0a)
>>>> 02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>>>> Network Connection [8086:10c9] (rev 01)
>>>> 02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>>>> Network Connection [8086:10c9] (rev 01)
>>>> -bash-4.1#
>>>>
>>>>
>>>> -boris
>>>
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-22 16:24               ` Boris Ostrovsky
@ 2017-11-22 16:54                 ` Christian König
  2017-11-22 17:27                   ` Boris Ostrovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2017-11-22 16:54 UTC (permalink / raw)
  To: Boris Ostrovsky, helgaas, linux-pci, dri-devel, linux-kernel,
	amd-gfx, xen-devel

Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:
> On 11/22/2017 05:09 AM, Christian König wrote:
>> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>> Hi Boris,
>>>>
>>>> attached are two patches.
>>>>
>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>> correctly aborts the fixup when it can't find address space for the
>>>> root window.
>>>>
>>>> The second is a workaround for your board. It simply checks if there
>>>> is exactly one Processor Function to apply this fix on.
>>>>
>>>> Both are based on linus current master branch. Please test if they fix
>>>> your issue.
>>> Yes, they do fix it but that's because the feature is disabled.
>>>
>>> Do you know what the actual problem was (on Xen)?
>> I still haven't understood what you actually did with Xen.
>>
>> When you used PCI pass through with those devices then you have made a
>> major configuration error.
>>
>> When the problem happened on dom0 then the explanation is most likely
>> that some PCI device ended up in the configured space, but the routing
>> was only setup correctly on one CPU socket.
> The problem is that dom0 can be (and was in my case() booted with less
> than full physical memory and so the "rest" of the host memory is not
> necessarily reflected in iomem. Your patch then tried to configure that
> memory for MMIO and the system hang.
>
> And so my guess is that this patch will break dom0 on a single-socket
> system as well.

Oh, thanks!

I've thought about that possibility before, but wasn't able to find a 
system which actually does that.

May I ask why the rest of the memory isn't reported to the OS?

Sounds like I can't trust Linux resource management and probably need to 
read the DRAM config to figure things out after all.

Thanks a lot for this information,
Christian.

>
> -boris
>
>> Regards,
>> Christian.
>>
>>> Thanks.
>>> -boris
>>>
>>>> Thanks for the help,
>>>> Christian.
>>>>
>>>> Am 20.11.2017 um 17:33 schrieb Boris Ostrovsky:
>>>>> On 11/20/2017 11:07 AM, Christian König wrote:
>>>>>> Am 20.11.2017 um 16:51 schrieb Boris Ostrovsky:
>>>>>>> (and then it breaks differently as a Xen guest --- we hung on the
>>>>>>> last
>>>>>>> pci_read_config_dword(), I haven't looked at this at all yet)
>>>>>> Hui? How does this fix applies to a Xen guest in the first place?
>>>>>>
>>>>>> Please provide the output of "lspci -nn" and explain further what is
>>>>>> your config with Xen.
>>>>>>
>>>>>>
>>>>> This is dom0.
>>>>>
>>>>> -bash-4.1# lspci -nn
>>>>> 00:00.0 Host bridge [0600]: ATI Technologies Inc RD890 Northbridge
>>>>> only
>>>>> dual slot (2x16) PCI-e GFX Hydra part [1002:5a10] (rev 02)
>>>>> 00:00.2 Generic system peripheral [0806]: ATI Technologies Inc Device
>>>>> [1002:5a23]
>>>>> 00:0d.0 PCI bridge [0604]: ATI Technologies Inc RD890 PCI to PCI
>>>>> bridge
>>>>> (external gfx1 port B) [1002:5a1e]
>>>>> 00:11.0 SATA controller [0106]: ATI Technologies Inc SB700/SB800 SATA
>>>>> Controller [AHCI mode] [1002:4391]
>>>>> 00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>>> OHCI0 Controller [1002:4397]
>>>>> 00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>>>>> Controller [1002:4398]
>>>>> 00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>>> EHCI
>>>>> Controller [1002:4396]
>>>>> 00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>>> OHCI0 Controller [1002:4397]
>>>>> 00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1
>>>>> Controller [1002:4398]
>>>>> 00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>>> EHCI
>>>>> Controller [1002:4396]
>>>>> 00:14.0 SMBus [0c05]: ATI Technologies Inc SBx00 SMBus Controller
>>>>> [1002:4385] (rev 3d)
>>>>> 00:14.3 ISA bridge [0601]: ATI Technologies Inc SB700/SB800 LPC host
>>>>> controller [1002:439d]
>>>>> 00:14.4 PCI bridge [0604]: ATI Technologies Inc SBx00 PCI to PCI
>>>>> Bridge
>>>>> [1002:4384]
>>>>> 00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB
>>>>> OHCI2 Controller [1002:4399]
>>>>> 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1600]
>>>>> 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1601]
>>>>> 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1602]
>>>>> 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1603]
>>>>> 00:18.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1604]
>>>>> 00:18.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1605]
>>>>> 00:19.0 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1600]
>>>>> 00:19.1 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1601]
>>>>> 00:19.2 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1602]
>>>>> 00:19.3 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1603]
>>>>> 00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1604]
>>>>> 00:19.5 Host bridge [0600]: Advanced Micro Devices [AMD] Device
>>>>> [1022:1605]
>>>>> 01:04.0 VGA compatible controller [0300]: Matrox Graphics, Inc. MGA
>>>>> G200eW WPCM450 [102b:0532] (rev 0a)
>>>>> 02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>>>>> Network Connection [8086:10c9] (rev 01)
>>>>> 02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit
>>>>> Network Connection [8086:10c9] (rev 01)
>>>>> -bash-4.1#
>>>>>
>>>>>
>>>>> -boris

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-22 16:54                 ` Christian König
@ 2017-11-22 17:27                   ` Boris Ostrovsky
  2017-11-23  8:11                     ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-22 17:27 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx, xen-devel

On 11/22/2017 11:54 AM, Christian König wrote:
> Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:
>> On 11/22/2017 05:09 AM, Christian König wrote:
>>> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>>> Hi Boris,
>>>>>
>>>>> attached are two patches.
>>>>>
>>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>>> correctly aborts the fixup when it can't find address space for the
>>>>> root window.
>>>>>
>>>>> The second is a workaround for your board. It simply checks if there
>>>>> is exactly one Processor Function to apply this fix on.
>>>>>
>>>>> Both are based on linus current master branch. Please test if they
>>>>> fix
>>>>> your issue.
>>>> Yes, they do fix it but that's because the feature is disabled.
>>>>
>>>> Do you know what the actual problem was (on Xen)?
>>> I still haven't understood what you actually did with Xen.
>>>
>>> When you used PCI pass through with those devices then you have made a
>>> major configuration error.
>>>
>>> When the problem happened on dom0 then the explanation is most likely
>>> that some PCI device ended up in the configured space, but the routing
>>> was only setup correctly on one CPU socket.
>> The problem is that dom0 can be (and was in my case() booted with less
>> than full physical memory and so the "rest" of the host memory is not
>> necessarily reflected in iomem. Your patch then tried to configure that
>> memory for MMIO and the system hang.
>>
>> And so my guess is that this patch will break dom0 on a single-socket
>> system as well.
>
> Oh, thanks!
>
> I've thought about that possibility before, but wasn't able to find a
> system which actually does that.
>
> May I ask why the rest of the memory isn't reported to the OS?

That memory doesn't belong to the OS (dom0), it is owned by the hypervisor.

>
> Sounds like I can't trust Linux resource management and probably need
> to read the DRAM config to figure things out after all.


My question is whether what you are trying to do should ever be done for
a guest at all (any guest, not necessarily Xen).

-boris

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-22 17:27                   ` Boris Ostrovsky
@ 2017-11-23  8:11                     ` Christian König
  2017-11-23 14:12                       ` Boris Ostrovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2017-11-23  8:11 UTC (permalink / raw)
  To: Boris Ostrovsky, helgaas, linux-pci, dri-devel, linux-kernel,
	amd-gfx, xen-devel

Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky:
> On 11/22/2017 11:54 AM, Christian König wrote:
>> Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:
>>> On 11/22/2017 05:09 AM, Christian König wrote:
>>>> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>>>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>>>> Hi Boris,
>>>>>>
>>>>>> attached are two patches.
>>>>>>
>>>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>>>> correctly aborts the fixup when it can't find address space for the
>>>>>> root window.
>>>>>>
>>>>>> The second is a workaround for your board. It simply checks if there
>>>>>> is exactly one Processor Function to apply this fix on.
>>>>>>
>>>>>> Both are based on linus current master branch. Please test if they
>>>>>> fix
>>>>>> your issue.
>>>>> Yes, they do fix it but that's because the feature is disabled.
>>>>>
>>>>> Do you know what the actual problem was (on Xen)?
>>>> I still haven't understood what you actually did with Xen.
>>>>
>>>> When you used PCI pass through with those devices then you have made a
>>>> major configuration error.
>>>>
>>>> When the problem happened on dom0 then the explanation is most likely
>>>> that some PCI device ended up in the configured space, but the routing
>>>> was only setup correctly on one CPU socket.
>>> The problem is that dom0 can be (and was in my case() booted with less
>>> than full physical memory and so the "rest" of the host memory is not
>>> necessarily reflected in iomem. Your patch then tried to configure that
>>> memory for MMIO and the system hang.
>>>
>>> And so my guess is that this patch will break dom0 on a single-socket
>>> system as well.
>> Oh, thanks!
>>
>> I've thought about that possibility before, but wasn't able to find a
>> system which actually does that.
>>
>> May I ask why the rest of the memory isn't reported to the OS?
> That memory doesn't belong to the OS (dom0), it is owned by the hypervisor.
>
>> Sounds like I can't trust Linux resource management and probably need
>> to read the DRAM config to figure things out after all.
>
> My question is whether what you are trying to do should ever be done for
> a guest at all (any guest, not necessarily Xen).

The issue is probably that I don't know enough about Xen: What exactly 
is dom0? My understanding was that dom0 is the hypervisor, but that 
seems to be incorrect.

The issue is that under no circumstances should a virtualized guest 
*EVER* have access to the PCI devices marked as "Processor Function" on 
AMD platforms. Otherwise it is trivial to break out of the virtualization.

When dom0 is something like the system domain with all hardware access 
then the approach seems legitimate, but then the hypervisor should 
report the stolen memory to the OS using the e820 table.

When the hypervisor doesn't do that and the Linux kernel isn't aware 
that there is memory at a given location, mapping PCI space there will 
obviously crash the hypervisor.

Possible solutions, as far as I can see, are disabling this feature 
when we detect that we are a Xen dom0, scanning the DRAM settings to 
update Linux resource handling, or fixing Xen to report stolen memory 
to the dom0 OS as reserved.
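
Just to illustrate the first option, a minimal sketch could look like 
the following; this is an assumption on my side and not part of the 
attached patches:

	#include <xen/xen.h>	/* xen_initial_domain(); evaluates to 0
				 * when CONFIG_XEN_DOM0 is not set */

	/* Early in pci_amd_enable_64bit_bar(), before any register is
	 * touched: part of the host memory is hidden from dom0, so the
	 * iomem_resource based search for a free window is not reliable
	 * there. */
	if (xen_initial_domain())
		return;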

Opinions?

Thanks,
Christian.

>
> -boris
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-23  8:11                     ` Christian König
@ 2017-11-23 14:12                       ` Boris Ostrovsky
  2017-11-27 18:30                         ` Boris Ostrovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-23 14:12 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx, xen-devel



On 11/23/2017 03:11 AM, Christian König wrote:
> Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky:
>> On 11/22/2017 11:54 AM, Christian König wrote:
>>> Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:
>>>> On 11/22/2017 05:09 AM, Christian König wrote:
>>>>> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>>>>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>>>>> Hi Boris,
>>>>>>>
>>>>>>> attached are two patches.
>>>>>>>
>>>>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>>>>> correctly aborts the fixup when it can't find address space for the
>>>>>>> root window.
>>>>>>>
>>>>>>> The second is a workaround for your board. It simply checks if there
>>>>>>> is exactly one Processor Function to apply this fix on.
>>>>>>>
>>>>>>> Both are based on linus current master branch. Please test if they
>>>>>>> fix
>>>>>>> your issue.
>>>>>> Yes, they do fix it but that's because the feature is disabled.
>>>>>>
>>>>>> Do you know what the actual problem was (on Xen)?
>>>>> I still haven't understood what you actually did with Xen.
>>>>>
>>>>> When you used PCI pass through with those devices then you have made a
>>>>> major configuration error.
>>>>>
>>>>> When the problem happened on dom0 then the explanation is most likely
>>>>> that some PCI device ended up in the configured space, but the routing
>>>>> was only setup correctly on one CPU socket.
>>>> The problem is that dom0 can be (and was in my case() booted with less
>>>> than full physical memory and so the "rest" of the host memory is not
>>>> necessarily reflected in iomem. Your patch then tried to configure that
>>>> memory for MMIO and the system hang.
>>>>
>>>> And so my guess is that this patch will break dom0 on a single-socket
>>>> system as well.
>>> Oh, thanks!
>>>
>>> I've thought about that possibility before, but wasn't able to find a
>>> system which actually does that.
>>>
>>> May I ask why the rest of the memory isn't reported to the OS?
>> That memory doesn't belong to the OS (dom0), it is owned by the 
>> hypervisor.
>>
>>> Sounds like I can't trust Linux resource management and probably need
>>> to read the DRAM config to figure things out after all.
>>
>> My question is whether what you are trying to do should ever be done for
>> a guest at all (any guest, not necessarily Xen).
> 
> The issue is probably that I don't know enough about Xen: What exactly 
> is dom0? My understanding was that dom0 is the hypervisor, but that 
> seems to be incorrect.
> 
> The issue is that under no circumstances *EVER* a virtualized guest 
> should have access to the PCI devices marked as "Processor Function" on 
> AMD platforms. Otherwise it is trivial to break out of the virtualization.
> 
> When dom0 is something like the system domain with all hardware access 
> then the approach seems legitimate, but then the hypervisor should 
> report the stolen memory to the OS using the e820 table.
> 
> When the hypervisor doesn't do that and the Linux kernel isn't aware 
> that there is memory at a given location mapping PCI space there will 
> obviously crash the hypervisor.
> 
> Possible solutions as far as I can see are either disabling this feature 
> when we detect that we are a Xen dom0, scanning the DRAM settings to 
> update Linux resource handling or fixing Xen to report stolen memory to 
> the dom0 OS as reserved.
> 
> Opinions?

You are right, these functions are not exposed to a regular guest.

I think for dom0 (which is a special Xen guest, with additional 
privileges) we may be able to add a reserved e820 region for host memory 
that is not assigned to dom0. Let me try it on Monday (I am out until then).

-boris

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-23 14:12                       ` Boris Ostrovsky
@ 2017-11-27 18:30                         ` Boris Ostrovsky
  2017-11-28  9:12                           ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-27 18:30 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx, xen-devel

On 11/23/2017 09:12 AM, Boris Ostrovsky wrote:
>
>
> On 11/23/2017 03:11 AM, Christian König wrote:
>> Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky:
>>> On 11/22/2017 11:54 AM, Christian König wrote:
>>>> Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:
>>>>> On 11/22/2017 05:09 AM, Christian König wrote:
>>>>>> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>>>>>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>>>>>> Hi Boris,
>>>>>>>>
>>>>>>>> attached are two patches.
>>>>>>>>
>>>>>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>>>>>> correctly aborts the fixup when it can't find address space for
>>>>>>>> the
>>>>>>>> root window.
>>>>>>>>
>>>>>>>> The second is a workaround for your board. It simply checks if
>>>>>>>> there
>>>>>>>> is exactly one Processor Function to apply this fix on.
>>>>>>>>
>>>>>>>> Both are based on linus current master branch. Please test if they
>>>>>>>> fix
>>>>>>>> your issue.
>>>>>>> Yes, they do fix it but that's because the feature is disabled.
>>>>>>>
>>>>>>> Do you know what the actual problem was (on Xen)?
>>>>>> I still haven't understood what you actually did with Xen.
>>>>>>
>>>>>> When you used PCI pass through with those devices then you have
>>>>>> made a
>>>>>> major configuration error.
>>>>>>
>>>>>> When the problem happened on dom0 then the explanation is most
>>>>>> likely
>>>>>> that some PCI device ended up in the configured space, but the
>>>>>> routing
>>>>>> was only setup correctly on one CPU socket.
>>>>> The problem is that dom0 can be (and was in my case() booted with
>>>>> less
>>>>> than full physical memory and so the "rest" of the host memory is not
>>>>> necessarily reflected in iomem. Your patch then tried to configure
>>>>> that
>>>>> memory for MMIO and the system hang.
>>>>>
>>>>> And so my guess is that this patch will break dom0 on a single-socket
>>>>> system as well.
>>>> Oh, thanks!
>>>>
>>>> I've thought about that possibility before, but wasn't able to find a
>>>> system which actually does that.
>>>>
>>>> May I ask why the rest of the memory isn't reported to the OS?
>>> That memory doesn't belong to the OS (dom0), it is owned by the
>>> hypervisor.
>>>
>>>> Sounds like I can't trust Linux resource management and probably need
>>>> to read the DRAM config to figure things out after all.
>>>
>>> My question is whether what you are trying to do should ever be done
>>> for
>>> a guest at all (any guest, not necessarily Xen).
>>
>> The issue is probably that I don't know enough about Xen: What
>> exactly is dom0? My understanding was that dom0 is the hypervisor,
>> but that seems to be incorrect.
>>
>> The issue is that under no circumstances *EVER* a virtualized guest
>> should have access to the PCI devices marked as "Processor Function"
>> on AMD platforms. Otherwise it is trivial to break out of the
>> virtualization.
>>
>> When dom0 is something like the system domain with all hardware
>> access then the approach seems legitimate, but then the hypervisor
>> should report the stolen memory to the OS using the e820 table.
>>
>> When the hypervisor doesn't do that and the Linux kernel isn't aware
>> that there is memory at a given location mapping PCI space there will
>> obviously crash the hypervisor.
>>
>> Possible solutions as far as I can see are either disabling this
>> feature when we detect that we are a Xen dom0, scanning the DRAM
>> settings to update Linux resource handling or fixing Xen to report
>> stolen memory to the dom0 OS as reserved.
>>
>> Opinions?
>
> You are right, these functions are not exposed to a regular guest.
>
> I think for dom0 (which is a special Xen guest, with additional
> privileges) we may be able to add a reserved e820 region for host
> memory that is not assigned to dom0. Let me try it on Monday (I am out
> until then).


One thing I realized while looking at a solution for Xen dom0 is that this
patch may cause problems for memory hotplug. What happens if new memory
is added to the system and we have everything above current memory set
to MMIO?

-boris

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-27 18:30                         ` Boris Ostrovsky
@ 2017-11-28  9:12                           ` Christian König
  2017-11-28  9:46                             ` [Xen-devel] " Jan Beulich
  2017-11-28 18:55                             ` Boris Ostrovsky
  0 siblings, 2 replies; 27+ messages in thread
From: Christian König @ 2017-11-28  9:12 UTC (permalink / raw)
  To: Boris Ostrovsky, helgaas, linux-pci, dri-devel, linux-kernel,
	amd-gfx, xen-devel

[-- Attachment #1: Type: text/plain, Size: 4956 bytes --]

Am 27.11.2017 um 19:30 schrieb Boris Ostrovsky:
> On 11/23/2017 09:12 AM, Boris Ostrovsky wrote:
>>
>> On 11/23/2017 03:11 AM, Christian König wrote:
>>> Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky:
>>>> On 11/22/2017 11:54 AM, Christian König wrote:
>>>>> Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:
>>>>>> On 11/22/2017 05:09 AM, Christian König wrote:
>>>>>>> Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:
>>>>>>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>>>>>>> Hi Boris,
>>>>>>>>>
>>>>>>>>> attached are two patches.
>>>>>>>>>
>>>>>>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>>>>>>> correctly aborts the fixup when it can't find address space for
>>>>>>>>> the
>>>>>>>>> root window.
>>>>>>>>>
>>>>>>>>> The second is a workaround for your board. It simply checks if
>>>>>>>>> there
>>>>>>>>> is exactly one Processor Function to apply this fix on.
>>>>>>>>>
>>>>>>>>> Both are based on linus current master branch. Please test if they
>>>>>>>>> fix
>>>>>>>>> your issue.
>>>>>>>> Yes, they do fix it but that's because the feature is disabled.
>>>>>>>>
>>>>>>>> Do you know what the actual problem was (on Xen)?
>>>>>>> I still haven't understood what you actually did with Xen.
>>>>>>>
>>>>>>> When you used PCI pass through with those devices then you have
>>>>>>> made a
>>>>>>> major configuration error.
>>>>>>>
>>>>>>> When the problem happened on dom0 then the explanation is most
>>>>>>> likely
>>>>>>> that some PCI device ended up in the configured space, but the
>>>>>>> routing
>>>>>>> was only setup correctly on one CPU socket.
>>>>>> The problem is that dom0 can be (and was in my case() booted with
>>>>>> less
>>>>>> than full physical memory and so the "rest" of the host memory is not
>>>>>> necessarily reflected in iomem. Your patch then tried to configure
>>>>>> that
>>>>>> memory for MMIO and the system hang.
>>>>>>
>>>>>> And so my guess is that this patch will break dom0 on a single-socket
>>>>>> system as well.
>>>>> Oh, thanks!
>>>>>
>>>>> I've thought about that possibility before, but wasn't able to find a
>>>>> system which actually does that.
>>>>>
>>>>> May I ask why the rest of the memory isn't reported to the OS?
>>>> That memory doesn't belong to the OS (dom0), it is owned by the
>>>> hypervisor.
>>>>
>>>>> Sounds like I can't trust Linux resource management and probably need
>>>>> to read the DRAM config to figure things out after all.
>>>> My question is whether what you are trying to do should ever be done
>>>> for
>>>> a guest at all (any guest, not necessarily Xen).
>>> The issue is probably that I don't know enough about Xen: What
>>> exactly is dom0? My understanding was that dom0 is the hypervisor,
>>> but that seems to be incorrect.
>>>
>>> The issue is that under no circumstances *EVER* a virtualized guest
>>> should have access to the PCI devices marked as "Processor Function"
>>> on AMD platforms. Otherwise it is trivial to break out of the
>>> virtualization.
>>>
>>> When dom0 is something like the system domain with all hardware
>>> access then the approach seems legitimate, but then the hypervisor
>>> should report the stolen memory to the OS using the e820 table.
>>>
>>> When the hypervisor doesn't do that and the Linux kernel isn't aware
>>> that there is memory at a given location mapping PCI space there will
>>> obviously crash the hypervisor.
>>>
>>> Possible solutions as far as I can see are either disabling this
>>> feature when we detect that we are a Xen dom0, scanning the DRAM
>>> settings to update Linux resource handling or fixing Xen to report
>>> stolen memory to the dom0 OS as reserved.
>>>
>>> Opinions?
>> You are right, these functions are not exposed to a regular guest.
>>
>> I think for dom0 (which is a special Xen guest, with additional
>> privileges) we may be able to add a reserved e820 region for host
>> memory that is not assigned to dom0. Let me try it on Monday (I am out
>> until then).
>
> One thing I realized while looking at solution for Xen dom0 is that this
> patch may cause problems for memory hotplug.

Good point. My assumption was that when you got a BIOS which can handle 
memory hotplug then you also got a BIOS which doesn't need this fixup. 
But I've never validated that assumption.

> What happens if new memory
> is added to the system and we have everything above current memory set
> to MMIO?

In theory the BIOS would search for address space and wouldn't find 
anything, so the hotplug operation should fail even before it reaches 
the kernel in the first place.

In practice I think that nobody ever tested if that works correctly. So 
I'm pretty sure the system would just crash.

How about the attached patch? It limits the newly added MMIO space to 
the upper 256GB of the address space. That should still be enough for 
most devices, and we avoid both the issues with Xen dom0 and, most 
likely, problems with memory hotplug as well.

Christian.

>
> -boris
>


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-x86-PCI-limit-the-size-of-the-64bit-BAR-to-256GB.patch --]
[-- Type: text/x-patch; name="0001-x86-PCI-limit-the-size-of-the-64bit-BAR-to-256GB.patch", Size: 1156 bytes --]

From 586bd9d67ebb6ca48bd0a6b1bd9203e94093cc8e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Tue, 28 Nov 2017 10:02:35 +0100
Subject: [PATCH] x86/PCI: limit the size of the 64bit BAR to 256GB
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This avoids problems with Xen which hides some memory resources from the
OS and potentially also allows memory hotplug while this fixup is
enabled.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 arch/x86/pci/fixup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 56c39a7bd080..6dffdee8a2de 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -690,7 +690,7 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 	res->name = "PCI Bus 0000:00";
 	res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
 		IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
-	res->start = 0x100000000ull;
+	res->start = 0xbd00000000ull;
 	res->end = 0xfd00000000ull - 1;
 
 	/* Just grab the free area behind system memory for this */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-28  9:12                           ` Christian König
@ 2017-11-28  9:46                             ` Jan Beulich
  2017-11-28 10:17                               ` Christian König
  2017-11-28 18:55                             ` Boris Ostrovsky
  1 sibling, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2017-11-28  9:46 UTC (permalink / raw)
  To: Christian König
  Cc: helgaas, amd-gfx, dri-devel, xen-devel, Boris Ostrovsky,
	linux-kernel, linux-pci

>>> On 28.11.17 at 10:12, <christian.koenig@amd.com> wrote:
> In theory the BIOS would search for address space and won't find 
> anything, so the hotplug operation should fail even before it reaches 
> the kernel in the first place.

How would the BIOS know what the OS does or plans to do? I think
it's the other way around - the OS needs to avoid using any regions
for MMIO which are marked as hotpluggable in SRAT. Since there is
no vNUMA yet for Xen Dom0, that would need special handling.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-28  9:46                             ` [Xen-devel] " Jan Beulich
@ 2017-11-28 10:17                               ` Christian König
  2017-11-28 10:53                                 ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2017-11-28 10:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: helgaas, amd-gfx, dri-devel, xen-devel, Boris Ostrovsky,
	linux-kernel, linux-pci

Am 28.11.2017 um 10:46 schrieb Jan Beulich:
>>>> On 28.11.17 at 10:12, <christian.koenig@amd.com> wrote:
>> In theory the BIOS would search for address space and won't find
>> anything, so the hotplug operation should fail even before it reaches
>> the kernel in the first place.
> How would the BIOS know what the OS does or plans to do?

As far as I know the ACPI BIOS should work directly with the register 
content.

So when we update the register content to enable the MMIO decoding the 
BIOS should know that as well.

> I think
> it's the other way around - the OS needs to avoid using any regions
> for MMIO which are marked as hotpluggable in SRAT.

I was under the impression that this is exactly what 
acpi_numa_memory_affinity_init() does.

> Since there is
> no vNUMA yet for Xen Dom0, that would need special handling.

I think that the problem is rather that SRAT is NUMA specific and if I'm 
not totally mistaken the content is ignored when NUMA support isn't 
compiled into the kernel.

When Xen steals some memory from Dom0 by hooking itself into the e820 
call, I would say the cleanest way is to report this memory in e820 as 
reserved as well. But take that with a grain of salt, I'm seriously not 
a Xen expert.

Regards,
Christian.

>
> Jan
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-28 10:17                               ` Christian König
@ 2017-11-28 10:53                                 ` Jan Beulich
  2017-11-28 11:59                                   ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2017-11-28 10:53 UTC (permalink / raw)
  To: Christian König
  Cc: helgaas, amd-gfx, dri-devel, xen-devel, Boris Ostrovsky,
	linux-kernel, linux-pci

>>> On 28.11.17 at 11:17, <christian.koenig@amd.com> wrote:
> Am 28.11.2017 um 10:46 schrieb Jan Beulich:
>>>>> On 28.11.17 at 10:12, <christian.koenig@amd.com> wrote:
>>> In theory the BIOS would search for address space and won't find
>>> anything, so the hotplug operation should fail even before it reaches
>>> the kernel in the first place.
>> How would the BIOS know what the OS does or plans to do?
> 
> As far as I know the ACPI BIOS should work directly with the register 
> content.
> 
> So when we update the register content to enable the MMIO decoding the 
> BIOS should know that as well.

I'm afraid I don't follow: During memory hotplug, surely you don't
expect the BIOS to do a PCI bus scan? Plus even if it did, it would
be racy - some device could, at this very moment, have memory
decoding disabled, just for the OS to re-enable it a millisecond
later. Yet looking at BAR values is meaningless when memory
decode of a device is disabled.

>> I think
>> it's the other way around - the OS needs to avoid using any regions
>> for MMIO which are marked as hotpluggable in SRAT.
> 
> I was under the impression that this is exactly what 
> acpi_numa_memory_affinity_init() does.

Perhaps, except that (when I last looked) insufficient state is
(was) being recorded to have that information readily available
at the time MMIO space above 4Gb needs to be allocated for
some device.

>> Since there is
>> no vNUMA yet for Xen Dom0, that would need special handling.
> 
> I think that the problem is rather that SRAT is NUMA specific and if I'm 
> not totally mistaken the content is ignored when NUMA support isn't 
> compiled into the kernel.
> 
>> When Xen steals some memory from Dom0 by hooking itself into the e820
>> call, I would say the cleanest way is to report this memory in e820 as
>> reserved as well. But take that with a grain of salt, I'm seriously not
>> a Xen expert.

The E820 handling in PV Linux is all fake anyway - there's a single
chunk of memory given to a PV guest (including Dom0), contiguous
in what PV guests know as "physical address space" (not to be
mixed up with "machine address space", which is where MMIO
needs to be allocated from). Xen code in the kernel then mimics
an E820 matching the host one, moving around pieces of memory
in physical address space if necessary.

Since Dom0 knows the machine E820, MMIO allocation shouldn't
need to be much different there from the non-Xen case.

Jan

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-28 10:53                                 ` Jan Beulich
@ 2017-11-28 11:59                                   ` Christian König
  0 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2017-11-28 11:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: helgaas, amd-gfx, dri-devel, xen-devel, Boris Ostrovsky,
	linux-kernel, linux-pci

Am 28.11.2017 um 11:53 schrieb Jan Beulich:
>>>> On 28.11.17 at 11:17, <christian.koenig@amd.com> wrote:
>> Am 28.11.2017 um 10:46 schrieb Jan Beulich:
>>>>>> On 28.11.17 at 10:12, <christian.koenig@amd.com> wrote:
>>>> In theory the BIOS would search for address space and won't find
>>>> anything, so the hotplug operation should fail even before it reaches
>>>> the kernel in the first place.
>>> How would the BIOS know what the OS does or plans to do?
>> As far as I know the ACPI BIOS should work directly with the register
>> content.
>>
>> So when we update the register content to enable the MMIO decoding the
>> BIOS should know that as well.
> I'm afraid I don't follow: During memory hotplug, surely you don't
> expect the BIOS to do a PCI bus scan? Plus even if it did, it would
> be racy - some device could, at this very moment, have memory
> decoding disabled, just for the OS to re-enable it a millisecond
> later. Yet looking at BAR values is meaningless when memory
> decode of a device is disabled.

No, sorry you misunderstood me. The PCI bus is not even involved here.

In AMD family CPUs you have four main types of address space routed by 
the NB (northbridge):
1.  Memory space targeting system DRAM.
2.  Memory space targeting IO (MMIO).
3.  IO space.
4.  Configuration space.

See section "2.8.2 NB Routing" in the BIOS and Kernel Developer’s Guide 
(https://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf).

Long story short, you have fixed addresses for configuration and legacy IO 
(VGA for example) and then configurable memory space for DRAM and MMIO.

What the ACPI BIOS does (or at least should do) is take a look at the 
registers to find space during memory hotplug.

Now in theory the MMIO space should be configurable through similar ACPI 
BIOS functions, but unfortunately most BIOSes don't enable that function 
because it can break some older Windows versions.

So what we do here is just what the BIOS should have provided in the 
first place.
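
To make that a bit more concrete, here is a simplified sketch of what such a 
fixup can look like (this is not the code from the actual patch; the D18F1 
register offsets and the 0x141b device ID below are my reading of the BKDG and 
purely illustrative, and programming the routing registers themselves is 
omitted):

#include <linux/ioport.h>
#include <linux/pci.h>
#include <linux/slab.h>

static void sketch_amd_enable_64bit_window(struct pci_dev *dev)
{
        struct resource *res;
        unsigned int i;
        u32 base, high;

        /* Look for an unused MMIO base/limit routing pair on D18F1. */
        for (i = 0; i < 8; i++) {
                pci_read_config_dword(dev, 0x80 + i * 0x8, &base);
                pci_read_config_dword(dev, 0x180 + i * 0x4, &high);

                /* Free means: neither read/write enable nor high bits set. */
                if (!(base & 0x3) && !high)
                        break;
        }
        if (i == 8)
                return; /* all routing entries are already in use */

        res = kzalloc(sizeof(*res), GFP_KERNEL);
        if (!res)
                return;

        res->name = "PCI Bus 0000:00";
        res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
                IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
        res->start = 0xbd00000000ull;
        res->end = 0xfd00000000ull - 1;

        /* Just grab the free area behind system memory for this */
        if (request_resource(&iomem_resource, res)) {
                kfree(res);
                return;
        }

        /* Make the area usable as an additional window of the root bus. */
        pci_bus_add_resource(dev->bus, res, 0);

        /* Writing base/limit/high of the free routing pair would follow here. */
}

/* Hooked up as a PCI fixup on the D18F1 northbridge function, e.g.: */
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x141b, sketch_amd_enable_64bit_window);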

>>> I think
>>> it's the other way around - the OS needs to avoid using any regions
>>> for MMIO which are marked as hotpluggable in SRAT.
>> I was under the impression that this is exactly what
>> acpi_numa_memory_affinity_init() does.
> Perhaps, except that (when I last looked) insufficient state is
> (was) being recorded to have that information readily available
> at the time MMIO space above 4Gb needs to be allocated for
> some device.

That was also my concern, but in the most recent version I'm 
intentionally doing this fixup very late after all the PCI setup is 
already done.

This way the extra address space is only available for devices which are 
added by PCI hotplug or for resizing BARs on the fly (which is the use 
case I'm interested in).
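
For completeness, this is roughly how a driver then makes use of such a window 
with the resize API from this series (again just a simplified sketch from 
memory, not the actual amdgpu code; the BAR number and the size encoding are 
only illustrative):

#include <linux/log2.h>
#include <linux/pci.h>

/* Sketch: grow BAR0 to cover vram_size bytes, then let the core reassign it. */
static int sketch_resize_vram_bar(struct pci_dev *pdev, u64 vram_size)
{
        /* Resizable BAR sizes are encoded as log2 of the size in MB (0 == 1MB). */
        int rbar_size = order_base_2(vram_size >> 20);
        int r;

        /* Release BAR0 so the PCI core is allowed to move and regrow it. */
        pci_release_resource(pdev, 0);

        r = pci_resize_resource(pdev, 0, rbar_size);

        /*
         * Reassign unconditionally so BAR0 gets an address again, even when
         * the resize failed (-ENOSPC: no large enough window available,
         * -ENOTSUPP: no resizable BAR capability on this device).
         */
        pci_assign_unassigned_bus_resources(pdev->bus);

        return r == -ENOTSUPP ? 0 : r;
}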

>>> Since there is
>>> no vNUMA yet for Xen Dom0, that would need special handling.
>> I think that the problem is rather that SRAT is NUMA specific and if I'm
>> not totally mistaken the content is ignored when NUMA support isn't
>> compiled into the kernel.
>>
>> When Xen steals some memory from Dom0 by hooking itself into the e820
>> call, I would say the cleanest way is to report this memory in e820 as
>> reserved as well. But take that with a grain of salt, I'm seriously not
>> a Xen expert.
> The E820 handling in PV Linux is all fake anyway - there's a single
> chunk of memory given to a PV guest (including Dom0), contiguous
> in what PV guests know as "physical address space" (not to be
> mixed up with "machine address space", which is where MMIO
> needs to be allocated from). Xen code in the kernel then mimics
> an E820 matching the host one, moving around pieces of memory
> in physical address space if necessary.

Good to know.

> Since Dom0 knows the machine E820, MMIO allocation shouldn't
> need to be much different there from the non-Xen case.

Yes, completely agree.

I think that even if we don't do the MMIO allocation with this fixup, 
letting the kernel in Dom0 know which memory/address space regions are in 
use is still a good idea.

Otherwise we will run into exactly the same problem when we do the MMIO 
allocation with an ACPI method, and that is certainly going to come in 
the next BIOS generations because Microsoft is pushing for it.

Regards,
Christian.

>
> Jan
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
  2017-11-28  9:12                           ` Christian König
  2017-11-28  9:46                             ` [Xen-devel] " Jan Beulich
@ 2017-11-28 18:55                             ` Boris Ostrovsky
  1 sibling, 0 replies; 27+ messages in thread
From: Boris Ostrovsky @ 2017-11-28 18:55 UTC (permalink / raw)
  To: Christian König, helgaas, linux-pci, dri-devel,
	linux-kernel, amd-gfx, xen-devel

On 11/28/2017 04:12 AM, Christian König wrote:
>
>
> How about the attached patch? It limits the newly added MMIO space to
> the upper 256GB of the address space. That should still be enough for
> most devices, but it avoids both the issues with Xen dom0 and, most
> likely, the problems with memory hotplug as well.

It certainly makes the problem less likely, so I guess we could do
this for now.

Thanks.
-boris

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-11-28 18:56 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-18 13:58 Resizable PCI BAR support V9 Christian König
2017-10-18 13:58 ` [PATCH v9 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
2017-10-18 13:58 ` [PATCH v9 2/5] PCI: add resizeable BAR infrastructure v5 Christian König
2017-10-18 13:58 ` [PATCH v9 3/5] PCI: add functionality for resizing resources v7 Christian König
2017-10-18 13:58 ` [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5 Christian König
2017-11-02 16:43   ` Alex Deucher
2017-11-20 15:51   ` Boris Ostrovsky
2017-11-20 16:07     ` Christian König
2017-11-20 16:33       ` Boris Ostrovsky
2017-11-21 13:34         ` Christian König
2017-11-21 22:26           ` Boris Ostrovsky
2017-11-22 10:09             ` Christian König
2017-11-22 16:24               ` Boris Ostrovsky
2017-11-22 16:54                 ` Christian König
2017-11-22 17:27                   ` Boris Ostrovsky
2017-11-23  8:11                     ` Christian König
2017-11-23 14:12                       ` Boris Ostrovsky
2017-11-27 18:30                         ` Boris Ostrovsky
2017-11-28  9:12                           ` Christian König
2017-11-28  9:46                             ` [Xen-devel] " Jan Beulich
2017-11-28 10:17                               ` Christian König
2017-11-28 10:53                                 ` Jan Beulich
2017-11-28 11:59                                   ` Christian König
2017-11-28 18:55                             ` Boris Ostrovsky
2017-10-18 13:58 ` [PATCH v9 5/5] drm/amdgpu: resize VRAM BAR for CPU access v5 Christian König
2017-10-24 19:44 ` Resizable PCI BAR support V9 Bjorn Helgaas
2017-10-25 11:27   ` Christian König

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).