All of lore.kernel.org
 help / color / mirror / Atom feed
* Resizeable PCI BAR support v6
@ 2017-05-09 16:49 Christian König
  2017-05-09 16:49 ` [PATCH v6 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Christian König @ 2017-05-09 16:49 UTC (permalink / raw)
  To: linux-pci, platform-driver-x86, helgaas

Hi everyone,

is is the sixth incarnation of this set of patches. It enables device
drivers to resize and most likely also relocate the PCI BAR of devices
they manage to allow the CPU to access all of the device local memory at once.

This is very useful for GFX device drivers where the default PCI BAR is only
about 256MB in size for compatibility reasons, but the device easily have
multiple gigabyte of local memory.

Noteable changed compared to v5:
1. Added some rb.
2. Followed Andy's suggestion on patch #3 to use a next
   pointer to make it more readable.

Mostly every comment addressed now, so next question is how we want to
merge this upstream?

I would suggest patches #1-#4 through some PCI branch and we send #5 through the drm
maintainer as soon as the dependencies are accepted?

Thanks for all the help,
Christian.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v6 1/5] PCI: add a define for the PCI resource type mask v2
  2017-05-09 16:49 Resizeable PCI BAR support v6 Christian König
@ 2017-05-09 16:49 ` Christian König
  2017-05-09 16:49 ` [PATCH v6 2/5] PCI: add resizeable BAR infrastructure v5 Christian König
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-05-09 16:49 UTC (permalink / raw)
  To: linux-pci, platform-driver-x86, helgaas

From: Christian König <christian.koenig@amd.com>

We use this mask multiple times in the bus setup.

v2: fix some style nit picks

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 drivers/pci/pci.h       |  4 ++++
 drivers/pci/setup-bus.c | 12 +++---------
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4518562..56ad106 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -6,6 +6,10 @@
 
 #define PCI_FIND_CAP_TTL	48
 
+#define PCI_RES_TYPE_MASK \
+	(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
+	 IORESOURCE_MEM_64)
+
 extern const unsigned char pcie_link_speed[];
 
 bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index f30ca75..efebd70 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1532,8 +1532,6 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 {
 	struct pci_dev *dev = bus->self;
 	struct resource *r;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 	unsigned old_flags = 0;
 	struct resource *b_res;
 	int idx = 1;
@@ -1576,7 +1574,7 @@ static void pci_bridge_release_resources(struct pci_bus *bus,
 	 */
 	release_child_resources(r);
 	if (!release_resource(r)) {
-		type = old_flags = r->flags & type_mask;
+		type = old_flags = r->flags & PCI_RES_TYPE_MASK;
 		dev_printk(KERN_DEBUG, &dev->dev, "resource %d %pR released\n",
 					PCI_BRIDGE_RESOURCES + idx, r);
 		/* keep the old size */
@@ -1767,8 +1765,6 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	enum release_type rel_type = leaf_only;
 	LIST_HEAD(fail_head);
 	struct pci_dev_resource *fail_res;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 	int pci_try_num = 1;
 	enum enable_type enable_local;
 
@@ -1827,7 +1823,7 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	 */
 	list_for_each_entry(fail_res, &fail_head, list)
 		pci_bus_release_bridge_resources(fail_res->dev->bus,
-						 fail_res->flags & type_mask,
+						 fail_res->flags & PCI_RES_TYPE_MASK,
 						 rel_type);
 
 	/* restore size and flags */
@@ -1871,8 +1867,6 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 	LIST_HEAD(fail_head);
 	struct pci_dev_resource *fail_res;
 	int retval;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH | IORESOURCE_MEM_64;
 
 again:
 	__pci_bus_size_bridges(parent, &add_list);
@@ -1898,7 +1892,7 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 	 */
 	list_for_each_entry(fail_res, &fail_head, list)
 		pci_bus_release_bridge_resources(fail_res->dev->bus,
-						 fail_res->flags & type_mask,
+						 fail_res->flags & PCI_RES_TYPE_MASK,
 						 whole_subtree);
 
 	/* restore size and flags */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 2/5] PCI: add resizeable BAR infrastructure v5
  2017-05-09 16:49 Resizeable PCI BAR support v6 Christian König
  2017-05-09 16:49 ` [PATCH v6 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
@ 2017-05-09 16:49 ` Christian König
  2017-05-09 16:49 ` [PATCH v6 3/5] PCI: add functionality for resizing resources v5 Christian König
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-05-09 16:49 UTC (permalink / raw)
  To: linux-pci, platform-driver-x86, helgaas

From: Christian König <christian.koenig@amd.com>

Just the defines and helper functions to read the possible sizes of a BAR and
update it's size.

See https://pcisig.com/sites/default/files/specification_documents/ECN_Resizable-BAR_24Apr2008.pdf
and PCIe r3.1, sec 7.22.

This is useful for hardware with large local storage (mostly GFX) which only
expose 256MB BARs initially to be compatible with 32bit systems.

v2: provide read helper as well
v3: improve function names, use unsigned values, add better comments.
v4: move definition, improve commit message, s/bar/BAR/
v5: split out helper to find ctrl reg pos, style fixes, comment fixes,
    add pci_rbar_size_to_bytes as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 drivers/pci/pci.c             | 104 ++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci.h             |   8 ++++
 include/uapi/linux/pci_regs.h |  11 ++++-
 3 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ba34907..0cbf4a6 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2944,6 +2944,110 @@ bool pci_acs_path_enabled(struct pci_dev *start,
 }
 
 /**
+ * pci_rbar_find_pos - find position of resize ctrl reg for BAR
+ * @dev: PCI device
+ * @bar: BAR to find
+ *
+ * Helper to find the postion of the ctrl register for a BAR.
+ * Returns -ENOTSUPP of resizeable BARs are not supported at all.
+ * Returns -ENOENT if not ctrl register for the BAR could be found.
+ */
+static int pci_rbar_find_pos(struct pci_dev *pdev, int bar)
+{
+	unsigned int pos, nbars;
+	unsigned int i;
+	u32 ctrl;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_REBAR);
+	if (!pos)
+		return -ENOTSUPP;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+	nbars = (ctrl & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
+
+	for (i = 0; i < nbars; ++i, pos += 8) {
+		int bar_idx;
+
+		pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+		bar_idx = (ctrl & PCI_REBAR_CTRL_BAR_IDX_MASK) >>
+				PCI_REBAR_CTRL_BAR_IDX_SHIFT;
+		if (bar_idx == bar)
+			return pos;
+	}
+
+	return -ENOENT;
+}
+
+/**
+ * pci_rbar_get_possible_sizes - get possible sizes for BAR
+ * @dev: PCI device
+ * @bar: BAR to query
+ *
+ * Get the possible sizes of a resizeable BAR as bitmask defined in the spec
+ * (bit 0=1MB, bit 19=512GB). Returns 0 if BAR isn't resizeable.
+ */
+u32 pci_rbar_get_possible_sizes(struct pci_dev *pdev, int bar)
+{
+	u32 cap;
+	int pos;
+
+	pos = pci_rbar_find_pos(pdev, bar);
+	if (pos < 0)
+		return 0;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CAP, &cap);
+	return (cap & PCI_REBAR_CTRL_SIZES_MASK) >>
+		PCI_REBAR_CTRL_SIZES_SHIFT;
+}
+
+/**
+ * pci_rbar_get_current_size - get the current size of a BAR
+ * @dev: PCI device
+ * @bar: BAR to set size to
+ *
+ * Read the size of a BAR from the resizeable BAR config.
+ * Returns size if found or negative error code.
+ */
+int pci_rbar_get_current_size(struct pci_dev *pdev, int bar)
+{
+	u32 ctrl;
+	int pos;
+
+	pos = pci_rbar_find_pos(pdev, bar);
+	if (pos < 0)
+		return pos;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+	return (ctrl & PCI_REBAR_CTRL_BAR_SIZE_MASK) >>
+		PCI_REBAR_CTRL_BAR_SIZE_SHIFT;
+}
+
+/**
+ * pci_rbar_set_size - set a new size for a BAR
+ * @dev: PCI device
+ * @bar: BAR to set size to
+ * @size: new size as defined in the spec (0=1MB, 19=512GB)
+ *
+ * Set the new size of a BAR as defined in the spec.
+ * Returns zero if resizing was successful, error code otherwise.
+ */
+int pci_rbar_set_size(struct pci_dev *pdev, int bar, int size)
+{
+	u32 ctrl;
+	int pos;
+
+	pos = pci_rbar_find_pos(pdev, bar);
+	if (pos < 0)
+		return pos;
+
+	pci_read_config_dword(pdev, pos + PCI_REBAR_CTRL, &ctrl);
+	ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE_MASK;
+	ctrl |= size << PCI_REBAR_CTRL_BAR_SIZE_SHIFT;
+	pci_write_config_dword(pdev, pos + PCI_REBAR_CTRL, ctrl);
+	return 0;
+}
+
+/**
  * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
  * @dev: the PCI device
  * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 56ad106..a37fc9c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -360,4 +360,12 @@ static inline int pci_dev_specific_reset(struct pci_dev *dev, int probe)
 }
 #endif
 
+u32 pci_rbar_get_possible_sizes(struct pci_dev *pdev, int bar);
+int pci_rbar_get_current_size(struct pci_dev *pdev, int bar);
+int pci_rbar_set_size(struct pci_dev *pdev, int bar, int size);
+static inline u64 pci_rbar_size_to_bytes(int size)
+{
+	return 1ULL << (size + 20);
+}
+
 #endif /* DRIVERS_PCI_H */
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e5a2e68..a75429e 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -932,9 +932,16 @@
 #define PCI_SATA_SIZEOF_LONG	16
 
 /* Resizable BARs */
+#define PCI_REBAR_CAP		4	/* capability register */
+#define  PCI_REBAR_CTRL_SIZES_MASK	(0xFFFFF << 4)	/* mask for sizes */
+#define  PCI_REBAR_CTRL_SIZES_SHIFT	4	/* shift for sizes */
 #define PCI_REBAR_CTRL		8	/* control register */
-#define  PCI_REBAR_CTRL_NBAR_MASK	(7 << 5)	/* mask for # bars */
-#define  PCI_REBAR_CTRL_NBAR_SHIFT	5	/* shift for # bars */
+#define  PCI_REBAR_CTRL_BAR_IDX_MASK	(7 << 0)	/* mask for BAR index */
+#define  PCI_REBAR_CTRL_BAR_IDX_SHIFT	0	/* shift for BAR index */
+#define  PCI_REBAR_CTRL_NBAR_MASK	(7 << 5)	/* mask for # BARs */
+#define  PCI_REBAR_CTRL_NBAR_SHIFT	5	/* shift for # BARs */
+#define  PCI_REBAR_CTRL_BAR_SIZE_MASK	(0x1F << 8)	/* mask for BAR size */
+#define  PCI_REBAR_CTRL_BAR_SIZE_SHIFT	8	/* shift for BAR size */
 
 /* Dynamic Power Allocation */
 #define PCI_DPA_CAP		4	/* capability register */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 3/5] PCI: add functionality for resizing resources v5
  2017-05-09 16:49 Resizeable PCI BAR support v6 Christian König
  2017-05-09 16:49 ` [PATCH v6 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
  2017-05-09 16:49 ` [PATCH v6 2/5] PCI: add resizeable BAR infrastructure v5 Christian König
@ 2017-05-09 16:49 ` Christian König
  2017-05-11 11:56     ` Andy Shevchenko
  2017-05-09 16:49 ` [PATCH v6 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v3 Christian König
  2017-05-09 16:49 ` [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2 Christian König
  4 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-05-09 16:49 UTC (permalink / raw)
  To: linux-pci, platform-driver-x86, helgaas

From: Christian König <christian.koenig@amd.com>

This allows device drivers to request resizing their BARs.

The function only tries to reprogram the windows of the bridge directly above
the requesting device and only the BAR of the same type (usually mem, 64bit,
prefetchable). This is done to make sure not to disturb other drivers by
changing the BARs of their devices.

If reprogramming the bridge BAR fails the old status is restored and -ENOSPC
returned to the calling device driver.

v2: rebase on changes in rbar support
v3: style cleanups, fail if memory decoding is enabled or resources
    still allocated, resize all unused bridge BARs,
    drop calling pci_reenable_device
v4: print resources before releasing them, style cleanups,
    use pci_rbar_size_to_bytes, use PCI_RES_TYPE_MASK
v5: use next pointer to simplify loop

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/pci/setup-bus.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/setup-res.c | 59 +++++++++++++++++++++++++++++
 include/linux/pci.h     |  3 ++
 3 files changed, 160 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index efebd70..f2e3812 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1917,6 +1917,104 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
+int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
+{
+	struct pci_dev_resource *dev_res;
+	struct pci_dev *next;
+	LIST_HEAD(saved);
+	LIST_HEAD(added);
+	LIST_HEAD(failed);
+	unsigned int i;
+	int ret;
+
+	/* Walk to the root hub, releasing bridge BARs when possible */
+	next = bridge;
+	do {
+		bridge = next;
+		for (i = PCI_BRIDGE_RESOURCES; i < PCI_BRIDGE_RESOURCE_END;
+		     i++) {
+			struct resource *res = &bridge->resource[i];
+
+			if ((res->flags ^ type) & PCI_RES_TYPE_MASK)
+				continue;
+
+			/* Ignore BARs which are still in use */
+			if (res->child)
+				continue;
+
+			ret = add_to_list(&saved, bridge, res, 0, 0);
+			if (ret)
+				goto cleanup;
+
+			dev_info(&bridge->dev, "BAR %d: releasing %pR\n",
+				 i, res);
+
+			if (res->parent)
+				release_resource(res);
+			res->start = 0;
+			res->end = 0;
+			break;
+		}
+		if (i == PCI_BRIDGE_RESOURCE_END)
+			break;
+
+		next = bridge->bus ? bridge->bus->self : NULL;
+	} while (next);
+
+	if (list_empty(&saved))
+		return -ENOENT;
+
+	__pci_bus_size_bridges(bridge->subordinate, &added);
+	__pci_bridge_assign_resources(bridge, &added, &failed);
+	BUG_ON(!list_empty(&added));
+
+	if (!list_empty(&failed)) {
+		ret = -ENOSPC;
+		goto cleanup;
+	}
+
+	list_for_each_entry(dev_res, &saved, list) {
+		/* Skip the bridge we just assigned resources for. */
+		if (bridge == dev_res->dev)
+			continue;
+
+		bridge = dev_res->dev;
+		pci_setup_bridge(bridge->subordinate);
+	}
+
+	free_list(&saved);
+	return 0;
+
+cleanup:
+	/* restore size and flags */
+	list_for_each_entry(dev_res, &failed, list) {
+		struct resource *res = dev_res->res;
+
+		res->start = dev_res->start;
+		res->end = dev_res->end;
+		res->flags = dev_res->flags;
+	}
+	free_list(&failed);
+
+	/* Revert to the old configuration */
+	list_for_each_entry(dev_res, &saved, list) {
+		struct resource *res = dev_res->res;
+
+		bridge = dev_res->dev;
+		i = res - bridge->resource;
+
+		res->start = dev_res->start;
+		res->end = dev_res->end;
+		res->flags = dev_res->flags;
+
+		pci_claim_resource(bridge, i);
+		pci_setup_bridge(bridge->subordinate);
+	}
+	free_list(&saved);
+
+	return ret;
+}
+
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
 {
 	struct pci_dev *dev;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 9526e34..9d3453c 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -363,6 +363,65 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz
 	return 0;
 }
 
+void pci_release_resource(struct pci_dev *dev, int resno)
+{
+	struct resource *res = dev->resource + resno;
+
+	dev_info(&dev->dev, "BAR %d: releasing %pR\n", resno, res);
+	release_resource(res);
+	res->end = resource_size(res) - 1;
+	res->start = 0;
+	res->flags |= IORESOURCE_UNSET;
+}
+EXPORT_SYMBOL(pci_release_resource);
+
+int pci_resize_resource(struct pci_dev *dev, int resno, int size)
+{
+	struct resource *res = dev->resource + resno;
+	int old, ret;
+	u32 sizes;
+	u16 cmd;
+
+	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+	if (cmd & PCI_COMMAND_MEMORY)
+		return -EBUSY;
+
+	sizes = pci_rbar_get_possible_sizes(dev, resno);
+	if (!sizes)
+		return -ENOTSUPP;
+
+	if (!(sizes & BIT(size)))
+		return -EINVAL;
+
+	old = pci_rbar_get_current_size(dev, resno);
+	if (old < 0)
+		return old;
+
+	/* Make sure the resource isn't assigned before making it larger. */
+	pci_release_resource(dev, resno);
+
+	ret = pci_rbar_set_size(dev, resno, size);
+	if (ret)
+		goto error_reassign;
+
+	res->end = res->start + pci_rbar_size_to_bytes(size) - 1;
+
+	ret = pci_reassign_bridge_resources(dev->bus->self, res->flags);
+	if (ret)
+		goto error_resize;
+
+	return 0;
+
+error_resize:
+	pci_rbar_set_size(dev, resno, old);
+	res->end = res->start + pci_rbar_size_to_bytes(old) - 1;
+
+error_reassign:
+	pci_assign_unassigned_bus_resources(dev->bus);
+	return ret;
+}
+EXPORT_SYMBOL(pci_resize_resource);
+
 int pci_enable_resources(struct pci_dev *dev, int mask)
 {
 	u16 cmd, old_cmd;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a38772a..f0a630a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1055,6 +1055,8 @@ void pci_reset_bridge_secondary_bus(struct pci_dev *dev);
 void pci_update_resource(struct pci_dev *dev, int resno);
 int __must_check pci_assign_resource(struct pci_dev *dev, int i);
 int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
+void pci_release_resource(struct pci_dev *dev, int resno);
+int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
 bool pci_device_is_present(struct pci_dev *pdev);
 void pci_ignore_hotplug(struct pci_dev *dev);
@@ -1135,6 +1137,7 @@ void pci_assign_unassigned_resources(void);
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
 void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
+int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type);
 void pdev_enable_device(struct pci_dev *);
 int pci_enable_resources(struct pci_dev *, int mask);
 void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v3
  2017-05-09 16:49 Resizeable PCI BAR support v6 Christian König
                   ` (2 preceding siblings ...)
  2017-05-09 16:49 ` [PATCH v6 3/5] PCI: add functionality for resizing resources v5 Christian König
@ 2017-05-09 16:49 ` Christian König
  2017-05-09 16:49 ` [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2 Christian König
  4 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-05-09 16:49 UTC (permalink / raw)
  To: linux-pci, platform-driver-x86, helgaas

From: Christian König <christian.koenig@amd.com>

Most BIOS don't enable this because of compatibility reasons.

Manually enable a 64bit BAR of 64GB size so that we have
enough room for PCI devices.

v2: style cleanups, increase size, add resource name, set correct flags,
    print message that windows was added
v3: add defines for all the magic numbers, style cleanups

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 arch/x86/pci/fixup.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 6d52b94..1c36ed6 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -571,3 +571,72 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6f60, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fa0, pci_invalid_bar);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, pci_invalid_bar);
+
+#define AMD_141b_MMIO_BASE(x)	(0x80 + (x) * 0x8)
+#define AMD_141b_MMIO_BASE_RE_MASK		BIT(0)
+#define AMD_141b_MMIO_BASE_WE_MASK		BIT(1)
+#define AMD_141b_MMIO_BASE_MMIOBASE_MASK	GENMASK(31,8)
+
+#define AMD_141b_MMIO_LIMIT(x)	(0x84 + (x) * 0x8)
+#define AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK	GENMASK(31,8)
+
+#define AMD_141b_MMIO_HIGH(x)	(0x180 + (x) * 0x4)
+#define AMD_141b_MMIO_HIGH_MMIOBASE_MASK	GENMASK(7,0)
+#define AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT	16
+#define AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK	GENMASK(23,16)
+
+static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
+{
+	struct resource *res, *conflict;
+	u32 base, limit, high;
+	unsigned i;
+
+	for (i = 0; i < 8; ++i) {
+		pci_read_config_dword(dev, AMD_141b_MMIO_BASE(i), &base);
+		pci_read_config_dword(dev, AMD_141b_MMIO_HIGH(i), &high);
+
+		/* Is this slot free? */
+		if (!(base & (AMD_141b_MMIO_BASE_RE_MASK |
+			      AMD_141b_MMIO_BASE_WE_MASK)))
+			break;
+
+		base >>= 8;
+		base |= high << 24;
+
+		/* Abort if a slot already configures a 64bit BAR. */
+		if (base > 0x10000)
+			return;
+	}
+	if (i == 8)
+		return;
+
+	res = kzalloc(sizeof(*res), GFP_KERNEL);
+	if (!res)
+		return;
+
+	res->name = "PCI Bus 0000:00";
+	res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
+		IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
+	res->start = 0x100000000ull;
+	res->end = 0xfd00000000ull - 1;
+
+	/* Just grab the free area behind system memory for this */
+	while ((conflict = request_resource_conflict(&iomem_resource, res)))
+		res->start = conflict->end + 1;
+
+	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
+
+	base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
+		AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
+	limit = ((res->end + 1) >> 8) & AMD_141b_MMIO_LIMIT_MMIOLIMIT_MASK;
+	high = ((res->start >> 40) & AMD_141b_MMIO_HIGH_MMIOBASE_MASK) |
+		((((res->end + 1) >> 40) << AMD_141b_MMIO_HIGH_MMIOLIMIT_SHIFT)
+		 & AMD_141b_MMIO_HIGH_MMIOLIMIT_MASK);
+
+	pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
+	pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
+	pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
+
+	pci_bus_add_resource(dev->bus, res, 0);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-05-09 16:49 Resizeable PCI BAR support v6 Christian König
                   ` (3 preceding siblings ...)
  2017-05-09 16:49 ` [PATCH v6 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v3 Christian König
@ 2017-05-09 16:49 ` Christian König
  2017-05-11 12:00     ` Andy Shevchenko
  2017-06-01 20:14     ` Bjorn Helgaas
  4 siblings, 2 replies; 23+ messages in thread
From: Christian König @ 2017-05-09 16:49 UTC (permalink / raw)
  To: linux-pci, platform-driver-x86, helgaas

From: Christian König <christian.koenig@amd.com>

Try to resize BAR0 to let CPU access all of VRAM.

v2: rebased, style cleanups, disable mem decode before resize,
    handle gmc_v9 as well, round size up to power of two.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c      |  8 ++++---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c      |  8 ++++---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 10 ++++----
 5 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 5310781..d6f5286 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1884,6 +1884,7 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
 				 struct ttm_mem_reg *mem);
 void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64 base);
 void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
+void amdgpu_resize_bar0(struct amdgpu_device *adev);
 void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
 int amdgpu_ttm_init(struct amdgpu_device *adev);
 void amdgpu_ttm_fini(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2719c02..4e83a56 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -694,6 +694,43 @@ void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
 			mc->gtt_size >> 20, mc->gtt_start, mc->gtt_end);
 }
 
+/**
+ * amdgpu_resize_bar0 - try to resize BAR0
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Try to resize BAR0 to make all VRAM CPU accessible.
+ */
+void amdgpu_resize_bar0(struct amdgpu_device *adev)
+{
+	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
+	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
+	u16 cmd;
+	int r;
+
+	/* Free the doorbell mapping, it most likely needs to move as well */
+	amdgpu_doorbell_fini(adev);
+	pci_release_resource(adev->pdev, 2);
+
+	/* Disable memory decoding while we change the BAR addresses and size */
+	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
+	pci_write_config_word(adev->pdev, PCI_COMMAND,
+			      cmd & ~PCI_COMMAND_MEMORY);
+
+	r = pci_resize_resource(adev->pdev, 0, rbar_size);
+	if (r == -ENOSPC)
+		DRM_INFO("Not enough PCI address space for a large BAR.");
+	else if (r && r != -ENOTSUPP)
+		DRM_ERROR("Problem resizing BAR0 (%d).", r);
+
+	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
+
+	/* When the doorbell BAR isn't available we have no chance of
+	 * using the device.
+	 */
+	BUG_ON(amdgpu_doorbell_init(adev));
+}
+
 /*
  * GPU helpers function.
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index a9083a1..ae71524 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -372,13 +372,15 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
 		}
 		adev->mc.vram_width = numchan * chansize;
 	}
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 
+	if (!(adev->flags & AMD_IS_APU))
+		amdgpu_resize_bar0(adev);
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
+
 #ifdef CONFIG_X86_64
 	if (adev->flags & AMD_IS_APU) {
 		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 4ac9978..1655de2 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -534,13 +534,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
 		}
 		adev->mc.vram_width = numchan * chansize;
 	}
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 
+	if (!(adev->flags & AMD_IS_APU))
+		amdgpu_resize_bar0(adev);
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
+
 #ifdef CONFIG_X86_64
 	if (adev->flags & AMD_IS_APU) {
 		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index b9f11fa..d65728a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -469,16 +469,18 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
 	}
 	adev->mc.vram_width = numchan * chansize;
 
-	/* Could aper size report 0 ? */
-	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
-	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 	/* size in MB on si */
 	adev->mc.mc_vram_size =
 		nbio_v6_1_get_memsize(adev) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = adev->mc.mc_vram_size;
-	adev->mc.visible_vram_size = adev->mc.aper_size;
+
+	if (!(adev->flags & AMD_IS_APU))
+		amdgpu_resize_bar0(adev);
+	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
+	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
 
 	/* In case the PCI BAR is larger than the actual amount of vram */
+	adev->mc.visible_vram_size = adev->mc.aper_size;
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 3/5] PCI: add functionality for resizing resources v5
  2017-05-09 16:49 ` [PATCH v6 3/5] PCI: add functionality for resizing resources v5 Christian König
@ 2017-05-11 11:56     ` Andy Shevchenko
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Shevchenko @ 2017-05-11 11:56 UTC (permalink / raw)
  To: Christian König; +Cc: linux-pci, Platform Driver, Bjorn Helgaas

On Tue, May 9, 2017 at 7:49 PM, Christian König <deathsimple@vodafone.de> wrote:
> From: Christian König <christian.koenig@amd.com>
>
> This allows device drivers to request resizing their BARs.
>
> The function only tries to reprogram the windows of the bridge directly above
> the requesting device and only the BAR of the same type (usually mem, 64bit,
> prefetchable). This is done to make sure not to disturb other drivers by
> changing the BARs of their devices.
>
> If reprogramming the bridge BAR fails the old status is restored and -ENOSPC
> returned to the calling device driver.
>
> v2: rebase on changes in rbar support
> v3: style cleanups, fail if memory decoding is enabled or resources
>     still allocated, resize all unused bridge BARs,
>     drop calling pci_reenable_device
> v4: print resources before releasing them, style cleanups,
>     use pci_rbar_size_to_bytes, use PCI_RES_TYPE_MASK
> v5: use next pointer to simplify loop

My main point has been addressed, so, to avoid bikeshedding FWIW:
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>

P.S. I didn't check deep any PCI protocol stuff since I have no specs,
no nothing to do such. I hope some one who is more familiar can do
that.

>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/pci/setup-bus.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/pci/setup-res.c | 59 +++++++++++++++++++++++++++++
>  include/linux/pci.h     |  3 ++
>  3 files changed, 160 insertions(+)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index efebd70..f2e3812 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -1917,6 +1917,104 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>  }
>  EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
>
> +int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
> +{
> +       struct pci_dev_resource *dev_res;
> +       struct pci_dev *next;
> +       LIST_HEAD(saved);
> +       LIST_HEAD(added);
> +       LIST_HEAD(failed);
> +       unsigned int i;
> +       int ret;
> +
> +       /* Walk to the root hub, releasing bridge BARs when possible */
> +       next = bridge;
> +       do {
> +               bridge = next;
> +               for (i = PCI_BRIDGE_RESOURCES; i < PCI_BRIDGE_RESOURCE_END;
> +                    i++) {
> +                       struct resource *res = &bridge->resource[i];
> +
> +                       if ((res->flags ^ type) & PCI_RES_TYPE_MASK)
> +                               continue;
> +
> +                       /* Ignore BARs which are still in use */
> +                       if (res->child)
> +                               continue;
> +
> +                       ret = add_to_list(&saved, bridge, res, 0, 0);
> +                       if (ret)
> +                               goto cleanup;
> +
> +                       dev_info(&bridge->dev, "BAR %d: releasing %pR\n",
> +                                i, res);
> +
> +                       if (res->parent)
> +                               release_resource(res);
> +                       res->start = 0;
> +                       res->end = 0;
> +                       break;
> +               }
> +               if (i == PCI_BRIDGE_RESOURCE_END)
> +                       break;
> +
> +               next = bridge->bus ? bridge->bus->self : NULL;
> +       } while (next);
> +
> +       if (list_empty(&saved))
> +               return -ENOENT;
> +
> +       __pci_bus_size_bridges(bridge->subordinate, &added);
> +       __pci_bridge_assign_resources(bridge, &added, &failed);
> +       BUG_ON(!list_empty(&added));
> +
> +       if (!list_empty(&failed)) {
> +               ret = -ENOSPC;
> +               goto cleanup;
> +       }
> +
> +       list_for_each_entry(dev_res, &saved, list) {
> +               /* Skip the bridge we just assigned resources for. */
> +               if (bridge == dev_res->dev)
> +                       continue;
> +
> +               bridge = dev_res->dev;
> +               pci_setup_bridge(bridge->subordinate);
> +       }
> +
> +       free_list(&saved);
> +       return 0;
> +
> +cleanup:
> +       /* restore size and flags */
> +       list_for_each_entry(dev_res, &failed, list) {
> +               struct resource *res = dev_res->res;
> +
> +               res->start = dev_res->start;
> +               res->end = dev_res->end;
> +               res->flags = dev_res->flags;
> +       }
> +       free_list(&failed);
> +
> +       /* Revert to the old configuration */
> +       list_for_each_entry(dev_res, &saved, list) {
> +               struct resource *res = dev_res->res;
> +
> +               bridge = dev_res->dev;
> +               i = res - bridge->resource;
> +
> +               res->start = dev_res->start;
> +               res->end = dev_res->end;
> +               res->flags = dev_res->flags;
> +
> +               pci_claim_resource(bridge, i);
> +               pci_setup_bridge(bridge->subordinate);
> +       }
> +       free_list(&saved);
> +
> +       return ret;
> +}
> +
>  void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
>  {
>         struct pci_dev *dev;
> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
> index 9526e34..9d3453c 100644
> --- a/drivers/pci/setup-res.c
> +++ b/drivers/pci/setup-res.c
> @@ -363,6 +363,65 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz
>         return 0;
>  }
>
> +void pci_release_resource(struct pci_dev *dev, int resno)
> +{
> +       struct resource *res = dev->resource + resno;
> +
> +       dev_info(&dev->dev, "BAR %d: releasing %pR\n", resno, res);
> +       release_resource(res);
> +       res->end = resource_size(res) - 1;
> +       res->start = 0;
> +       res->flags |= IORESOURCE_UNSET;
> +}
> +EXPORT_SYMBOL(pci_release_resource);
> +
> +int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> +{
> +       struct resource *res = dev->resource + resno;
> +       int old, ret;
> +       u32 sizes;
> +       u16 cmd;
> +
> +       pci_read_config_word(dev, PCI_COMMAND, &cmd);
> +       if (cmd & PCI_COMMAND_MEMORY)
> +               return -EBUSY;
> +
> +       sizes = pci_rbar_get_possible_sizes(dev, resno);
> +       if (!sizes)
> +               return -ENOTSUPP;
> +
> +       if (!(sizes & BIT(size)))
> +               return -EINVAL;
> +
> +       old = pci_rbar_get_current_size(dev, resno);
> +       if (old < 0)
> +               return old;
> +
> +       /* Make sure the resource isn't assigned before making it larger. */
> +       pci_release_resource(dev, resno);
> +
> +       ret = pci_rbar_set_size(dev, resno, size);
> +       if (ret)
> +               goto error_reassign;
> +
> +       res->end = res->start + pci_rbar_size_to_bytes(size) - 1;
> +
> +       ret = pci_reassign_bridge_resources(dev->bus->self, res->flags);
> +       if (ret)
> +               goto error_resize;
> +
> +       return 0;
> +
> +error_resize:
> +       pci_rbar_set_size(dev, resno, old);
> +       res->end = res->start + pci_rbar_size_to_bytes(old) - 1;
> +
> +error_reassign:
> +       pci_assign_unassigned_bus_resources(dev->bus);
> +       return ret;
> +}
> +EXPORT_SYMBOL(pci_resize_resource);
> +
>  int pci_enable_resources(struct pci_dev *dev, int mask)
>  {
>         u16 cmd, old_cmd;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index a38772a..f0a630a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1055,6 +1055,8 @@ void pci_reset_bridge_secondary_bus(struct pci_dev *dev);
>  void pci_update_resource(struct pci_dev *dev, int resno);
>  int __must_check pci_assign_resource(struct pci_dev *dev, int i);
>  int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
> +void pci_release_resource(struct pci_dev *dev, int resno);
> +int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
>  int pci_select_bars(struct pci_dev *dev, unsigned long flags);
>  bool pci_device_is_present(struct pci_dev *pdev);
>  void pci_ignore_hotplug(struct pci_dev *dev);
> @@ -1135,6 +1137,7 @@ void pci_assign_unassigned_resources(void);
>  void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
>  void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
>  void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
> +int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type);
>  void pdev_enable_device(struct pci_dev *);
>  int pci_enable_resources(struct pci_dev *, int mask);
>  void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),
> --
> 2.7.4
>



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 3/5] PCI: add functionality for resizing resources v5
@ 2017-05-11 11:56     ` Andy Shevchenko
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Shevchenko @ 2017-05-11 11:56 UTC (permalink / raw)
  To: Christian König; +Cc: linux-pci, Platform Driver, Bjorn Helgaas

On Tue, May 9, 2017 at 7:49 PM, Christian K=C3=B6nig <deathsimple@vodafone.=
de> wrote:
> From: Christian K=C3=B6nig <christian.koenig@amd.com>
>
> This allows device drivers to request resizing their BARs.
>
> The function only tries to reprogram the windows of the bridge directly a=
bove
> the requesting device and only the BAR of the same type (usually mem, 64b=
it,
> prefetchable). This is done to make sure not to disturb other drivers by
> changing the BARs of their devices.
>
> If reprogramming the bridge BAR fails the old status is restored and -ENO=
SPC
> returned to the calling device driver.
>
> v2: rebase on changes in rbar support
> v3: style cleanups, fail if memory decoding is enabled or resources
>     still allocated, resize all unused bridge BARs,
>     drop calling pci_reenable_device
> v4: print resources before releasing them, style cleanups,
>     use pci_rbar_size_to_bytes, use PCI_RES_TYPE_MASK
> v5: use next pointer to simplify loop

My main point has been addressed, so, to avoid bikeshedding FWIW:
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>

P.S. I didn't check deep any PCI protocol stuff since I have no specs,
no nothing to do such. I hope some one who is more familiar can do
that.

>
> Signed-off-by: Christian K=C3=B6nig <christian.koenig@amd.com>
> ---
>  drivers/pci/setup-bus.c | 98 +++++++++++++++++++++++++++++++++++++++++++=
++++++
>  drivers/pci/setup-res.c | 59 +++++++++++++++++++++++++++++
>  include/linux/pci.h     |  3 ++
>  3 files changed, 160 insertions(+)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index efebd70..f2e3812 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -1917,6 +1917,104 @@ void pci_assign_unassigned_bridge_resources(struc=
t pci_dev *bridge)
>  }
>  EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
>
> +int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long =
type)
> +{
> +       struct pci_dev_resource *dev_res;
> +       struct pci_dev *next;
> +       LIST_HEAD(saved);
> +       LIST_HEAD(added);
> +       LIST_HEAD(failed);
> +       unsigned int i;
> +       int ret;
> +
> +       /* Walk to the root hub, releasing bridge BARs when possible */
> +       next =3D bridge;
> +       do {
> +               bridge =3D next;
> +               for (i =3D PCI_BRIDGE_RESOURCES; i < PCI_BRIDGE_RESOURCE_=
END;
> +                    i++) {
> +                       struct resource *res =3D &bridge->resource[i];
> +
> +                       if ((res->flags ^ type) & PCI_RES_TYPE_MASK)
> +                               continue;
> +
> +                       /* Ignore BARs which are still in use */
> +                       if (res->child)
> +                               continue;
> +
> +                       ret =3D add_to_list(&saved, bridge, res, 0, 0);
> +                       if (ret)
> +                               goto cleanup;
> +
> +                       dev_info(&bridge->dev, "BAR %d: releasing %pR\n",
> +                                i, res);
> +
> +                       if (res->parent)
> +                               release_resource(res);
> +                       res->start =3D 0;
> +                       res->end =3D 0;
> +                       break;
> +               }
> +               if (i =3D=3D PCI_BRIDGE_RESOURCE_END)
> +                       break;
> +
> +               next =3D bridge->bus ? bridge->bus->self : NULL;
> +       } while (next);
> +
> +       if (list_empty(&saved))
> +               return -ENOENT;
> +
> +       __pci_bus_size_bridges(bridge->subordinate, &added);
> +       __pci_bridge_assign_resources(bridge, &added, &failed);
> +       BUG_ON(!list_empty(&added));
> +
> +       if (!list_empty(&failed)) {
> +               ret =3D -ENOSPC;
> +               goto cleanup;
> +       }
> +
> +       list_for_each_entry(dev_res, &saved, list) {
> +               /* Skip the bridge we just assigned resources for. */
> +               if (bridge =3D=3D dev_res->dev)
> +                       continue;
> +
> +               bridge =3D dev_res->dev;
> +               pci_setup_bridge(bridge->subordinate);
> +       }
> +
> +       free_list(&saved);
> +       return 0;
> +
> +cleanup:
> +       /* restore size and flags */
> +       list_for_each_entry(dev_res, &failed, list) {
> +               struct resource *res =3D dev_res->res;
> +
> +               res->start =3D dev_res->start;
> +               res->end =3D dev_res->end;
> +               res->flags =3D dev_res->flags;
> +       }
> +       free_list(&failed);
> +
> +       /* Revert to the old configuration */
> +       list_for_each_entry(dev_res, &saved, list) {
> +               struct resource *res =3D dev_res->res;
> +
> +               bridge =3D dev_res->dev;
> +               i =3D res - bridge->resource;
> +
> +               res->start =3D dev_res->start;
> +               res->end =3D dev_res->end;
> +               res->flags =3D dev_res->flags;
> +
> +               pci_claim_resource(bridge, i);
> +               pci_setup_bridge(bridge->subordinate);
> +       }
> +       free_list(&saved);
> +
> +       return ret;
> +}
> +
>  void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
>  {
>         struct pci_dev *dev;
> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
> index 9526e34..9d3453c 100644
> --- a/drivers/pci/setup-res.c
> +++ b/drivers/pci/setup-res.c
> @@ -363,6 +363,65 @@ int pci_reassign_resource(struct pci_dev *dev, int r=
esno, resource_size_t addsiz
>         return 0;
>  }
>
> +void pci_release_resource(struct pci_dev *dev, int resno)
> +{
> +       struct resource *res =3D dev->resource + resno;
> +
> +       dev_info(&dev->dev, "BAR %d: releasing %pR\n", resno, res);
> +       release_resource(res);
> +       res->end =3D resource_size(res) - 1;
> +       res->start =3D 0;
> +       res->flags |=3D IORESOURCE_UNSET;
> +}
> +EXPORT_SYMBOL(pci_release_resource);
> +
> +int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> +{
> +       struct resource *res =3D dev->resource + resno;
> +       int old, ret;
> +       u32 sizes;
> +       u16 cmd;
> +
> +       pci_read_config_word(dev, PCI_COMMAND, &cmd);
> +       if (cmd & PCI_COMMAND_MEMORY)
> +               return -EBUSY;
> +
> +       sizes =3D pci_rbar_get_possible_sizes(dev, resno);
> +       if (!sizes)
> +               return -ENOTSUPP;
> +
> +       if (!(sizes & BIT(size)))
> +               return -EINVAL;
> +
> +       old =3D pci_rbar_get_current_size(dev, resno);
> +       if (old < 0)
> +               return old;
> +
> +       /* Make sure the resource isn't assigned before making it larger.=
 */
> +       pci_release_resource(dev, resno);
> +
> +       ret =3D pci_rbar_set_size(dev, resno, size);
> +       if (ret)
> +               goto error_reassign;
> +
> +       res->end =3D res->start + pci_rbar_size_to_bytes(size) - 1;
> +
> +       ret =3D pci_reassign_bridge_resources(dev->bus->self, res->flags)=
;
> +       if (ret)
> +               goto error_resize;
> +
> +       return 0;
> +
> +error_resize:
> +       pci_rbar_set_size(dev, resno, old);
> +       res->end =3D res->start + pci_rbar_size_to_bytes(old) - 1;
> +
> +error_reassign:
> +       pci_assign_unassigned_bus_resources(dev->bus);
> +       return ret;
> +}
> +EXPORT_SYMBOL(pci_resize_resource);
> +
>  int pci_enable_resources(struct pci_dev *dev, int mask)
>  {
>         u16 cmd, old_cmd;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index a38772a..f0a630a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1055,6 +1055,8 @@ void pci_reset_bridge_secondary_bus(struct pci_dev =
*dev);
>  void pci_update_resource(struct pci_dev *dev, int resno);
>  int __must_check pci_assign_resource(struct pci_dev *dev, int i);
>  int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resou=
rce_size_t add_size, resource_size_t align);
> +void pci_release_resource(struct pci_dev *dev, int resno);
> +int __must_check pci_resize_resource(struct pci_dev *dev, int i, int siz=
e);
>  int pci_select_bars(struct pci_dev *dev, unsigned long flags);
>  bool pci_device_is_present(struct pci_dev *pdev);
>  void pci_ignore_hotplug(struct pci_dev *dev);
> @@ -1135,6 +1137,7 @@ void pci_assign_unassigned_resources(void);
>  void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
>  void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
>  void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
> +int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long =
type);
>  void pdev_enable_device(struct pci_dev *);
>  int pci_enable_resources(struct pci_dev *, int mask);
>  void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),
> --
> 2.7.4
>



--=20
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-05-09 16:49 ` [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2 Christian König
@ 2017-05-11 12:00     ` Andy Shevchenko
  2017-06-01 20:14     ` Bjorn Helgaas
  1 sibling, 0 replies; 23+ messages in thread
From: Andy Shevchenko @ 2017-05-11 12:00 UTC (permalink / raw)
  To: Christian König; +Cc: linux-pci, Platform Driver, Bjorn Helgaas

On Tue, May 9, 2017 at 7:49 PM, Christian König <deathsimple@vodafone.de> wrote:
> From: Christian König <christian.koenig@amd.com>
>
> Try to resize BAR0 to let CPU access all of VRAM.

> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -372,13 +372,15 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
>                 }
>                 adev->mc.vram_width = numchan * chansize;
>         }
> -       /* Could aper size report 0 ? */
> -       adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -       adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>         /* size in MB on si */
>         adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>         adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>

> +       if (!(adev->flags & AMD_IS_APU))
> +               amdgpu_resize_bar0(adev);
> +       adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +       adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
> +

Looks to me like a candidate to be a helper. (3x occurrences)

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
@ 2017-05-11 12:00     ` Andy Shevchenko
  0 siblings, 0 replies; 23+ messages in thread
From: Andy Shevchenko @ 2017-05-11 12:00 UTC (permalink / raw)
  To: Christian König; +Cc: linux-pci, Platform Driver, Bjorn Helgaas

On Tue, May 9, 2017 at 7:49 PM, Christian K=C3=B6nig <deathsimple@vodafone.=
de> wrote:
> From: Christian K=C3=B6nig <christian.koenig@amd.com>
>
> Try to resize BAR0 to let CPU access all of VRAM.

> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -372,13 +372,15 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *a=
dev)
>                 }
>                 adev->mc.vram_width =3D numchan * chansize;
>         }
> -       /* Could aper size report 0 ? */
> -       adev->mc.aper_base =3D pci_resource_start(adev->pdev, 0);
> -       adev->mc.aper_size =3D pci_resource_len(adev->pdev, 0);
>         /* size in MB on si */
>         adev->mc.mc_vram_size =3D RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 10=
24ULL;
>         adev->mc.real_vram_size =3D RREG32(mmCONFIG_MEMSIZE) * 1024ULL * =
1024ULL;
>

> +       if (!(adev->flags & AMD_IS_APU))
> +               amdgpu_resize_bar0(adev);
> +       adev->mc.aper_base =3D pci_resource_start(adev->pdev, 0);
> +       adev->mc.aper_size =3D pci_resource_len(adev->pdev, 0);
> +

Looks to me like a candidate to be a helper. (3x occurrences)

--=20
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-05-09 16:49 ` [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2 Christian König
@ 2017-06-01 20:14     ` Bjorn Helgaas
  2017-06-01 20:14     ` Bjorn Helgaas
  1 sibling, 0 replies; 23+ messages in thread
From: Bjorn Helgaas @ 2017-06-01 20:14 UTC (permalink / raw)
  To: Christian König
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

[+cc ADMGPU, DRM folks]

On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
> From: Christian König <christian.koenig@amd.com>
> 
> Try to resize BAR0 to let CPU access all of VRAM.
> 
> v2: rebased, style cleanups, disable mem decode before resize,
>     handle gmc_v9 as well, round size up to power of two.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c      |  8 ++++---
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c      |  8 ++++---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 10 ++++----
>  5 files changed, 54 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 5310781..d6f5286 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1884,6 +1884,7 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
>  				 struct ttm_mem_reg *mem);
>  void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64 base);
>  void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
> +void amdgpu_resize_bar0(struct amdgpu_device *adev);
>  void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
>  int amdgpu_ttm_init(struct amdgpu_device *adev);
>  void amdgpu_ttm_fini(struct amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2719c02..4e83a56 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -694,6 +694,43 @@ void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
>  			mc->gtt_size >> 20, mc->gtt_start, mc->gtt_end);
>  }
>  
> +/**
> + * amdgpu_resize_bar0 - try to resize BAR0
> + *
> + * @adev: amdgpu_device pointer
> + *
> + * Try to resize BAR0 to make all VRAM CPU accessible.
> + */
> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
> +{
> +	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
> +	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
> +	u16 cmd;
> +	int r;
> +
> +	/* Free the doorbell mapping, it most likely needs to move as well */
> +	amdgpu_doorbell_fini(adev);
> +	pci_release_resource(adev->pdev, 2);
> +
> +	/* Disable memory decoding while we change the BAR addresses and size */
> +	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
> +	pci_write_config_word(adev->pdev, PCI_COMMAND,
> +			      cmd & ~PCI_COMMAND_MEMORY);
> +
> +	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> +	if (r == -ENOSPC)
> +		DRM_INFO("Not enough PCI address space for a large BAR.");
> +	else if (r && r != -ENOTSUPP)
> +		DRM_ERROR("Problem resizing BAR0 (%d).", r);
> +
> +	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
> +
> +	/* When the doorbell BAR isn't available we have no chance of
> +	 * using the device.
> +	 */
> +	BUG_ON(amdgpu_doorbell_init(adev));

This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
"adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
here.

This is the call graph:

  amdgpu_device_init
    adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for < BONAIRE
    adev->rmmio = ioremap(adev->rmmio_base, ...)
    DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base)
    if (adev->asic_type >= CHIP_BONAIRE) {
      amdgpu_doorbell_init
	adev->doorbell.base = pci_resource_start(adev->pdev, 2)
	adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
    }
    amdgpu_init
      gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
	gmc_v7_0_mc_init
  +         amdgpu_resize_bar0
  +           amdgpu_doorbell_fini
  +           pci_release_resource(adev->pdev, 2)
  +           pci_resize_resource(adev->pdev, 0, size)
  +           amdgpu_doorbell_init
	  adev->mc.aper_base = pci_resource_start(adev->pdev, 0)

If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
amdgpu_device_init(), then we released the resource here and never
updated the ioremap.

From the PCI core perspective, it would be much cleaner to do the BAR
resize before the driver calls pci_enable_device().  If that could be
done, there would be no need for this sort of shutdown/reinit stuff
and we wouldn't have to worry about issues like these.  The amdgpu
init path is pretty complicated, so I don't know whether this is
possible.

I would also like to simplify the driver usage model and get the
PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
Ideally, the driver would do something like this:

  pci_resize_resource(adev->pdev, 0, rbar_size);
  pci_enable_device(adev->pdev);

And the PCI core would be something along these lines:

  int pci_resize_resource(dev, bar, size)
  {
    if (pci_is_enabled(dev))
      return -EBUSY;

    pci_disable_decoding(dev);          # turn off MEM, IO decoding
    pci_release_resources(dev);         # (all of them)
    err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
    if (err)
      return err;

    pci_assign_resources(dev);          # reassign all "dev" resources
    return 0;
  }

> +}
> +
>  /*
>   * GPU helpers function.
>   */
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index a9083a1..ae71524 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -372,13 +372,15 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
>  		}
>  		adev->mc.vram_width = numchan * chansize;
>  	}
> -	/* Could aper size report 0 ? */
> -	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  	/* size in MB on si */
>  	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  
> +	if (!(adev->flags & AMD_IS_APU))
> +		amdgpu_resize_bar0(adev);
> +	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
> +
>  #ifdef CONFIG_X86_64
>  	if (adev->flags & AMD_IS_APU) {
>  		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index 4ac9978..1655de2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -534,13 +534,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
>  		}
>  		adev->mc.vram_width = numchan * chansize;
>  	}
> -	/* Could aper size report 0 ? */
> -	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  	/* size in MB on si */
>  	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  
> +	if (!(adev->flags & AMD_IS_APU))
> +		amdgpu_resize_bar0(adev);
> +	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
> +
>  #ifdef CONFIG_X86_64
>  	if (adev->flags & AMD_IS_APU) {
>  		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index b9f11fa..d65728a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -469,16 +469,18 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
>  	}
>  	adev->mc.vram_width = numchan * chansize;
>  
> -	/* Could aper size report 0 ? */
> -	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  	/* size in MB on si */
>  	adev->mc.mc_vram_size =
>  		nbio_v6_1_get_memsize(adev) * 1024ULL * 1024ULL;
>  	adev->mc.real_vram_size = adev->mc.mc_vram_size;
> -	adev->mc.visible_vram_size = adev->mc.aper_size;
> +
> +	if (!(adev->flags & AMD_IS_APU))
> +		amdgpu_resize_bar0(adev);
> +	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  
>  	/* In case the PCI BAR is larger than the actual amount of vram */
> +	adev->mc.visible_vram_size = adev->mc.aper_size;
>  	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
>  		adev->mc.visible_vram_size = adev->mc.real_vram_size;
>  
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
@ 2017-06-01 20:14     ` Bjorn Helgaas
  0 siblings, 0 replies; 23+ messages in thread
From: Bjorn Helgaas @ 2017-06-01 20:14 UTC (permalink / raw)
  To: Christian König
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

[+cc ADMGPU, DRM folks]

On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
> From: Christian König <christian.koenig@amd.com>
> 
> Try to resize BAR0 to let CPU access all of VRAM.
> 
> v2: rebased, style cleanups, disable mem decode before resize,
>     handle gmc_v9 as well, round size up to power of two.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++++++++++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c      |  8 ++++---
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c      |  8 ++++---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      | 10 ++++----
>  5 files changed, 54 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 5310781..d6f5286 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1884,6 +1884,7 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
>  				 struct ttm_mem_reg *mem);
>  void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64 base);
>  void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
> +void amdgpu_resize_bar0(struct amdgpu_device *adev);
>  void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
>  int amdgpu_ttm_init(struct amdgpu_device *adev);
>  void amdgpu_ttm_fini(struct amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2719c02..4e83a56 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -694,6 +694,43 @@ void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
>  			mc->gtt_size >> 20, mc->gtt_start, mc->gtt_end);
>  }
>  
> +/**
> + * amdgpu_resize_bar0 - try to resize BAR0
> + *
> + * @adev: amdgpu_device pointer
> + *
> + * Try to resize BAR0 to make all VRAM CPU accessible.
> + */
> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
> +{
> +	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
> +	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
> +	u16 cmd;
> +	int r;
> +
> +	/* Free the doorbell mapping, it most likely needs to move as well */
> +	amdgpu_doorbell_fini(adev);
> +	pci_release_resource(adev->pdev, 2);
> +
> +	/* Disable memory decoding while we change the BAR addresses and size */
> +	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
> +	pci_write_config_word(adev->pdev, PCI_COMMAND,
> +			      cmd & ~PCI_COMMAND_MEMORY);
> +
> +	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> +	if (r == -ENOSPC)
> +		DRM_INFO("Not enough PCI address space for a large BAR.");
> +	else if (r && r != -ENOTSUPP)
> +		DRM_ERROR("Problem resizing BAR0 (%d).", r);
> +
> +	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
> +
> +	/* When the doorbell BAR isn't available we have no chance of
> +	 * using the device.
> +	 */
> +	BUG_ON(amdgpu_doorbell_init(adev));

This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
"adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
here.

This is the call graph:

  amdgpu_device_init
    adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for < BONAIRE
    adev->rmmio = ioremap(adev->rmmio_base, ...)
    DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base)
    if (adev->asic_type >= CHIP_BONAIRE) {
      amdgpu_doorbell_init
	adev->doorbell.base = pci_resource_start(adev->pdev, 2)
	adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
    }
    amdgpu_init
      gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
	gmc_v7_0_mc_init
  +         amdgpu_resize_bar0
  +           amdgpu_doorbell_fini
  +           pci_release_resource(adev->pdev, 2)
  +           pci_resize_resource(adev->pdev, 0, size)
  +           amdgpu_doorbell_init
	  adev->mc.aper_base = pci_resource_start(adev->pdev, 0)

If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
amdgpu_device_init(), then we released the resource here and never
updated the ioremap.

>From the PCI core perspective, it would be much cleaner to do the BAR
resize before the driver calls pci_enable_device().  If that could be
done, there would be no need for this sort of shutdown/reinit stuff
and we wouldn't have to worry about issues like these.  The amdgpu
init path is pretty complicated, so I don't know whether this is
possible.

I would also like to simplify the driver usage model and get the
PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
Ideally, the driver would do something like this:

  pci_resize_resource(adev->pdev, 0, rbar_size);
  pci_enable_device(adev->pdev);

And the PCI core would be something along these lines:

  int pci_resize_resource(dev, bar, size)
  {
    if (pci_is_enabled(dev))
      return -EBUSY;

    pci_disable_decoding(dev);          # turn off MEM, IO decoding
    pci_release_resources(dev);         # (all of them)
    err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
    if (err)
      return err;

    pci_assign_resources(dev);          # reassign all "dev" resources
    return 0;
  }

> +}
> +
>  /*
>   * GPU helpers function.
>   */
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index a9083a1..ae71524 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -372,13 +372,15 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
>  		}
>  		adev->mc.vram_width = numchan * chansize;
>  	}
> -	/* Could aper size report 0 ? */
> -	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  	/* size in MB on si */
>  	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  
> +	if (!(adev->flags & AMD_IS_APU))
> +		amdgpu_resize_bar0(adev);
> +	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
> +
>  #ifdef CONFIG_X86_64
>  	if (adev->flags & AMD_IS_APU) {
>  		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index 4ac9978..1655de2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -534,13 +534,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
>  		}
>  		adev->mc.vram_width = numchan * chansize;
>  	}
> -	/* Could aper size report 0 ? */
> -	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  	/* size in MB on si */
>  	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>  
> +	if (!(adev->flags & AMD_IS_APU))
> +		amdgpu_resize_bar0(adev);
> +	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
> +
>  #ifdef CONFIG_X86_64
>  	if (adev->flags & AMD_IS_APU) {
>  		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index b9f11fa..d65728a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -469,16 +469,18 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
>  	}
>  	adev->mc.vram_width = numchan * chansize;
>  
> -	/* Could aper size report 0 ? */
> -	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> -	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  	/* size in MB on si */
>  	adev->mc.mc_vram_size =
>  		nbio_v6_1_get_memsize(adev) * 1024ULL * 1024ULL;
>  	adev->mc.real_vram_size = adev->mc.mc_vram_size;
> -	adev->mc.visible_vram_size = adev->mc.aper_size;
> +
> +	if (!(adev->flags & AMD_IS_APU))
> +		amdgpu_resize_bar0(adev);
> +	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
> +	adev->mc.aper_size = pci_resource_len(adev->pdev, 0);
>  
>  	/* In case the PCI BAR is larger than the actual amount of vram */
> +	adev->mc.visible_vram_size = adev->mc.aper_size;
>  	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
>  		adev->mc.visible_vram_size = adev->mc.real_vram_size;
>  
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-01 20:14     ` Bjorn Helgaas
  (?)
@ 2017-06-02  9:32     ` Christian König
  2017-06-02 20:26       ` Bjorn Helgaas
  -1 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-06-02  9:32 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

Hi Bjorn,

sorry for not responding earlier and thanks for picking this thread up 
again.

Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
> [+cc ADMGPU, DRM folks]
>
> On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
>> [SNIP]
>> +/**
>> + * amdgpu_resize_bar0 - try to resize BAR0
>> + *
>> + * @adev: amdgpu_device pointer
>> + *
>> + * Try to resize BAR0 to make all VRAM CPU accessible.
>> + */
>> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
>> +{
>> +	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
>> +	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
>> +	u16 cmd;
>> +	int r;
>> +
>> +	/* Free the doorbell mapping, it most likely needs to move as well */
>> +	amdgpu_doorbell_fini(adev);
>> +	pci_release_resource(adev->pdev, 2);
>> +
>> +	/* Disable memory decoding while we change the BAR addresses and size */
>> +	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
>> +	pci_write_config_word(adev->pdev, PCI_COMMAND,
>> +			      cmd & ~PCI_COMMAND_MEMORY);
>> +
>> +	r = pci_resize_resource(adev->pdev, 0, rbar_size);
>> +	if (r == -ENOSPC)
>> +		DRM_INFO("Not enough PCI address space for a large BAR.");
>> +	else if (r && r != -ENOTSUPP)
>> +		DRM_ERROR("Problem resizing BAR0 (%d).", r);
>> +
>> +	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
>> +
>> +	/* When the doorbell BAR isn't available we have no chance of
>> +	 * using the device.
>> +	 */
>> +	BUG_ON(amdgpu_doorbell_init(adev));
> This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
> right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
> "adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
> here.
>
> This is the call graph:
>
>    amdgpu_device_init
>      adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for < BONAIRE
>      adev->rmmio = ioremap(adev->rmmio_base, ...)
>      DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base)
>      if (adev->asic_type >= CHIP_BONAIRE) {
>        amdgpu_doorbell_init
> 	adev->doorbell.base = pci_resource_start(adev->pdev, 2)
> 	adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
>      }
>      amdgpu_init
>        gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
> 	gmc_v7_0_mc_init
>    +         amdgpu_resize_bar0
>    +           amdgpu_doorbell_fini
>    +           pci_release_resource(adev->pdev, 2)
>    +           pci_resize_resource(adev->pdev, 0, size)
>    +           amdgpu_doorbell_init
> 	  adev->mc.aper_base = pci_resource_start(adev->pdev, 0)
>
> If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
> amdgpu_device_init(), then we released the resource here and never
> updated the ioremap.

The first hardware with a resizeable BAR I could find is a Tonga, and 
that is even a generation later than Bonaire.

So we are never going to call this code on earlier hardware generations.

But I think I will just move the asic_type check into the function to be 
absolute sure.

>  From the PCI core perspective, it would be much cleaner to do the BAR
> resize before the driver calls pci_enable_device().  If that could be
> done, there would be no need for this sort of shutdown/reinit stuff
> and we wouldn't have to worry about issues like these.  The amdgpu
> init path is pretty complicated, so I don't know whether this is
> possible.

I completely agree on this and it is actually the approach I tried first.

There are just two problems with this approach:
1. When the amdgpu driver is loaded there can already be the VGA 
console, Vesa or EFI driver active for the device and displaying the 
splash screen.

When we resize and most likely relocate the BAR while those drivers are 
active it will certainly cause problems.

What amdgpu does before trying to resize the BAR is kicking out other 
driver and making sure it has exclusive access to the hardware.

2. Without taking a look at the registers you don't know how much memory 
there actually is on the board.

We could always resize it to the maximum supported, but that would mean 
we could easily waste 128GB of address space while the hardware only has 
8GB of VRAM.

That would not necessarily hurt us when we have enough address space, 
but at least kind of sucks.

> I would also like to simplify the driver usage model and get the
> PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
> Ideally, the driver would do something like this:
>
>    pci_resize_resource(adev->pdev, 0, rbar_size);
>    pci_enable_device(adev->pdev);
>
> And the PCI core would be something along these lines:
>
>    int pci_resize_resource(dev, bar, size)
>    {
>      if (pci_is_enabled(dev))
>        return -EBUSY;
>
>      pci_disable_decoding(dev);          # turn off MEM, IO decoding
>      pci_release_resources(dev);         # (all of them)
>      err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
>      if (err)
>        return err;
>
>      pci_assign_resources(dev);          # reassign all "dev" resources
>      return 0;
>    }

I already tried the approach with releasing all resources, but it didn't 
worked so well.

When resizing fails because we don't have enough address space then we 
at least want to get back to a working config.

Releasing everything makes that rather tricky, since I would then need 
to keep a backup of the old config and try to restore it.

Additional to that I'm not sure if releasing the register BAR and 
relocating it works with amdgpu.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-02  9:32     ` Christian König
@ 2017-06-02 20:26       ` Bjorn Helgaas
       [not found]         ` <20170602202631.GA1452-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2017-06-02 20:26 UTC (permalink / raw)
  To: Christian König
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian König wrote:
> Hi Bjorn,
> 
> sorry for not responding earlier and thanks for picking this thread
> up again.
> 
> Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
> >[+cc ADMGPU, DRM folks]
> >
> >On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
> >>[SNIP]
> >>+/**
> >>+ * amdgpu_resize_bar0 - try to resize BAR0
> >>+ *
> >>+ * @adev: amdgpu_device pointer
> >>+ *
> >>+ * Try to resize BAR0 to make all VRAM CPU accessible.
> >>+ */
> >>+void amdgpu_resize_bar0(struct amdgpu_device *adev)
> >>+{
> >>+	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
> >>+	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
> >>+	u16 cmd;
> >>+	int r;
> >>+
> >>+	/* Free the doorbell mapping, it most likely needs to move as well */
> >>+	amdgpu_doorbell_fini(adev);
> >>+	pci_release_resource(adev->pdev, 2);
> >>+
> >>+	/* Disable memory decoding while we change the BAR addresses and size */
> >>+	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
> >>+	pci_write_config_word(adev->pdev, PCI_COMMAND,
> >>+			      cmd & ~PCI_COMMAND_MEMORY);
> >>+
> >>+	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> >>+	if (r == -ENOSPC)
> >>+		DRM_INFO("Not enough PCI address space for a large BAR.");
> >>+	else if (r && r != -ENOTSUPP)
> >>+		DRM_ERROR("Problem resizing BAR0 (%d).", r);
> >>+
> >>+	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
> >>+
> >>+	/* When the doorbell BAR isn't available we have no chance of
> >>+	 * using the device.
> >>+	 */
> >>+	BUG_ON(amdgpu_doorbell_init(adev));
> >This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
> >right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
> >"adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
> >here.
> >
> >This is the call graph:
> >
> >   amdgpu_device_init
> >     adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for < BONAIRE
> >     adev->rmmio = ioremap(adev->rmmio_base, ...)
> >     DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base)
> >     if (adev->asic_type >= CHIP_BONAIRE) {
> >       amdgpu_doorbell_init
> >	adev->doorbell.base = pci_resource_start(adev->pdev, 2)
> >	adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
> >     }
> >     amdgpu_init
> >       gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
> >	gmc_v7_0_mc_init
> >   +         amdgpu_resize_bar0
> >   +           amdgpu_doorbell_fini
> >   +           pci_release_resource(adev->pdev, 2)
> >   +           pci_resize_resource(adev->pdev, 0, size)
> >   +           amdgpu_doorbell_init
> >	  adev->mc.aper_base = pci_resource_start(adev->pdev, 0)
> >
> >If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
> >amdgpu_device_init(), then we released the resource here and never
> >updated the ioremap.
> 
> The first hardware with a resizeable BAR I could find is a Tonga,
> and that is even a generation later than Bonaire.
> 
> So we are never going to call this code on earlier hardware generations.

The problem with that is that it's impossible for a code reader to
verify that.  So adding a check is ugly but I think makes it more
readable.

> But I think I will just move the asic_type check into the function
> to be absolute sure.
> 
> > From the PCI core perspective, it would be much cleaner to do the BAR
> >resize before the driver calls pci_enable_device().  If that could be
> >done, there would be no need for this sort of shutdown/reinit stuff
> >and we wouldn't have to worry about issues like these.  The amdgpu
> >init path is pretty complicated, so I don't know whether this is
> >possible.
> 
> I completely agree on this and it is actually the approach I tried first.
> 
> There are just two problems with this approach:
> 1. When the amdgpu driver is loaded there can already be the VGA
> console, Vesa or EFI driver active for the device and displaying the
> splash screen.
> 
> When we resize and most likely relocate the BAR while those drivers
> are active it will certainly cause problems.
> 
> What amdgpu does before trying to resize the BAR is kicking out
> other driver and making sure it has exclusive access to the
> hardware.

I don't understand the problem here yet.  If you need to enable the
device, then disable it, resize, and re-enable it, that's fine.

The important thing I'm looking for is that the resize happens before
a pci_enable_device(), because pci_enable_device() is the sync point
where the PCI core enables resources and makes them available to the
driver.  Drivers know that they can't look at the resources before
that point.  There's a little bit of text about this in [1].

> 2. Without taking a look at the registers you don't know how much
> memory there actually is on the board.
> 
> We could always resize it to the maximum supported, but that would
> mean we could easily waste 128GB of address space while the hardware
> only has 8GB of VRAM.
> 
> That would not necessarily hurt us when we have enough address
> space, but at least kind of sucks.

Enable, read regs, disable, kick out other drivers, resize, enable.
Does that solve this problem?

> >I would also like to simplify the driver usage model and get the
> >PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
> >Ideally, the driver would do something like this:
> >
> >   pci_resize_resource(adev->pdev, 0, rbar_size);
> >   pci_enable_device(adev->pdev);
> >
> >And the PCI core would be something along these lines:
> >
> >   int pci_resize_resource(dev, bar, size)
> >   {
> >     if (pci_is_enabled(dev))
> >       return -EBUSY;
> >
> >     pci_disable_decoding(dev);          # turn off MEM, IO decoding
> >     pci_release_resources(dev);         # (all of them)
> >     err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
> >     if (err)
> >       return err;
> >
> >     pci_assign_resources(dev);          # reassign all "dev" resources
> >     return 0;
> >   }
> 
> I already tried the approach with releasing all resources, but it
> didn't worked so well.
> 
> When resizing fails because we don't have enough address space then
> we at least want to get back to a working config.
> 
> Releasing everything makes that rather tricky, since I would then
> need to keep a backup of the old config and try to restore it.

If resizing fails because of lack of address space, I would expect the
PCI core to at least restore to the previous state.  If it doesn't, I
think that would be a defect.

Having the driver specify the BARs it thinks might cause issues feels
like a crutch.

> Additional to that I'm not sure if releasing the register BAR and
> relocating it works with amdgpu.

If the BAR can't be relocated, that sounds like a hardware defect.  If
that's really the case, you could mark it IORESOURCE_PCI_FIXED so we
don't move it.  Or if it's an amdgpu defect, e.g., if amdgpu doesn't
re-read the resource addresses after pci_enable_device(), you should
fix amdgpu.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/PCI/pci.txt#n255

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-02 20:26       ` Bjorn Helgaas
@ 2017-06-06 11:51             ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-06-06 11:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: David Airlie, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	platform-driver-x86-u79uwXL29TY76Z2rM5mHXA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher

Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
> On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian König wrote:
>> Hi Bjorn,
>>
>> sorry for not responding earlier and thanks for picking this thread
>> up again.
>>
>> Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
>>> [+cc ADMGPU, DRM folks]
>>>
>>> On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
>>>> [SNIP]
>>>> +/**
>>>> + * amdgpu_resize_bar0 - try to resize BAR0
>>>> + *
>>>> + * @adev: amdgpu_device pointer
>>>> + *
>>>> + * Try to resize BAR0 to make all VRAM CPU accessible.
>>>> + */
>>>> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
>>>> +{
>>>> +	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
>>>> +	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
>>>> +	u16 cmd;
>>>> +	int r;
>>>> +
>>>> +	/* Free the doorbell mapping, it most likely needs to move as well */
>>>> +	amdgpu_doorbell_fini(adev);
>>>> +	pci_release_resource(adev->pdev, 2);
>>>> +
>>>> +	/* Disable memory decoding while we change the BAR addresses and size */
>>>> +	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
>>>> +	pci_write_config_word(adev->pdev, PCI_COMMAND,
>>>> +			      cmd & ~PCI_COMMAND_MEMORY);
>>>> +
>>>> +	r = pci_resize_resource(adev->pdev, 0, rbar_size);
>>>> +	if (r == -ENOSPC)
>>>> +		DRM_INFO("Not enough PCI address space for a large BAR.");
>>>> +	else if (r && r != -ENOTSUPP)
>>>> +		DRM_ERROR("Problem resizing BAR0 (%d).", r);
>>>> +
>>>> +	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
>>>> +
>>>> +	/* When the doorbell BAR isn't available we have no chance of
>>>> +	 * using the device.
>>>> +	 */
>>>> +	BUG_ON(amdgpu_doorbell_init(adev));
>>> This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
>>> right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
>>> "adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
>>> here.
>>>
>>> This is the call graph:
>>>
>>>    amdgpu_device_init
>>>      adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for < BONAIRE
>>>      adev->rmmio = ioremap(adev->rmmio_base, ...)
>>>      DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base)
>>>      if (adev->asic_type >= CHIP_BONAIRE) {
>>>        amdgpu_doorbell_init
>>> 	adev->doorbell.base = pci_resource_start(adev->pdev, 2)
>>> 	adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
>>>      }
>>>      amdgpu_init
>>>        gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
>>> 	gmc_v7_0_mc_init
>>>    +         amdgpu_resize_bar0
>>>    +           amdgpu_doorbell_fini
>>>    +           pci_release_resource(adev->pdev, 2)
>>>    +           pci_resize_resource(adev->pdev, 0, size)
>>>    +           amdgpu_doorbell_init
>>> 	  adev->mc.aper_base = pci_resource_start(adev->pdev, 0)
>>>
>>> If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
>>> amdgpu_device_init(), then we released the resource here and never
>>> updated the ioremap.
>> The first hardware with a resizeable BAR I could find is a Tonga,
>> and that is even a generation later than Bonaire.
>>
>> So we are never going to call this code on earlier hardware generations.
> The problem with that is that it's impossible for a code reader to
> verify that.  So adding a check is ugly but I think makes it more
> readable.

Good point. I will just move the check into the function itself, that 
should handle all such cases.

>>>  From the PCI core perspective, it would be much cleaner to do the BAR
>>> resize before the driver calls pci_enable_device().  If that could be
>>> done, there would be no need for this sort of shutdown/reinit stuff
>>> and we wouldn't have to worry about issues like these.  The amdgpu
>>> init path is pretty complicated, so I don't know whether this is
>>> possible.
>> I completely agree on this and it is actually the approach I tried first.
>>
>> There are just two problems with this approach:
>> 1. When the amdgpu driver is loaded there can already be the VGA
>> console, Vesa or EFI driver active for the device and displaying the
>> splash screen.
>>
>> When we resize and most likely relocate the BAR while those drivers
>> are active it will certainly cause problems.
>>
>> What amdgpu does before trying to resize the BAR is kicking out
>> other driver and making sure it has exclusive access to the
>> hardware.
> I don't understand the problem here yet.  If you need to enable the
> device, then disable it, resize, and re-enable it, that's fine.

The issue is we never enable the device ourself in amdgpu, except for 
some rare cases during resume.

In most of the cases we have to handle this is the primary display 
device which is enabled by either the BIOS, VGA console, VesaFB or 
EFIFB. Amdgpu just kicks out whatever driver was responsible for the 
device previously and takes over.

I could of course do the disable/resize/reenable dance, but I would 
rather want to avoid that.

The hardware is most likely already displaying a boot splash and we want 
to transit to the desktop without any flickering (at least that's the 
long term goal). Completely disabling the device to do this doesn't 
sounds like a good idea if we want that.

> The important thing I'm looking for is that the resize happens before
> a pci_enable_device(), because pci_enable_device() is the sync point
> where the PCI core enables resources and makes them available to the
> driver.  Drivers know that they can't look at the resources before
> that point.  There's a little bit of text about this in [1].

Yeah, I understand that. But wouldn't it be sufficient to just disable 
memory decoding during the resize?

I can easily guarantee that the CPU isn't accessing the BAR during the 
time (we need to do this for changing the memory clocks as well), but I 
have a bad gut feeling completely turning of the device while we are 
still displaying stuff.

>> 2. Without taking a look at the registers you don't know how much
>> memory there actually is on the board.
>>
>> We could always resize it to the maximum supported, but that would
>> mean we could easily waste 128GB of address space while the hardware
>> only has 8GB of VRAM.
>>
>> That would not necessarily hurt us when we have enough address
>> space, but at least kind of sucks.
> Enable, read regs, disable, kick out other drivers, resize, enable.
> Does that solve this problem?

Yeah, that sounds like it should do it. I will try and take a look if 
that works or not.

>>> I would also like to simplify the driver usage model and get the
>>> PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
>>> Ideally, the driver would do something like this:
>>>
>>>    pci_resize_resource(adev->pdev, 0, rbar_size);
>>>    pci_enable_device(adev->pdev);
>>>
>>> And the PCI core would be something along these lines:
>>>
>>>    int pci_resize_resource(dev, bar, size)
>>>    {
>>>      if (pci_is_enabled(dev))
>>>        return -EBUSY;
>>>
>>>      pci_disable_decoding(dev);          # turn off MEM, IO decoding
>>>      pci_release_resources(dev);         # (all of them)
>>>      err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
>>>      if (err)
>>>        return err;
>>>
>>>      pci_assign_resources(dev);          # reassign all "dev" resources
>>>      return 0;
>>>    }
>> I already tried the approach with releasing all resources, but it
>> didn't worked so well.
>>
>> When resizing fails because we don't have enough address space then
>> we at least want to get back to a working config.
>>
>> Releasing everything makes that rather tricky, since I would then
>> need to keep a backup of the old config and try to restore it.
> If resizing fails because of lack of address space, I would expect the
> PCI core to at least restore to the previous state.  If it doesn't, I
> think that would be a defect.

I completely agree, but this is unfortunately what happens.

The allocation and alignment functions in the PCI core sometimes doesn't 
seem to be able to restore the old config.

I mean just try pci=realloc. The core sometimes doesn't seem to be able 
to come up with a valid config with that.

> Having the driver specify the BARs it thinks might cause issues feels
> like a crutch.
Why? The driver just releases what is necessary to move the BAR.

>
>> Additional to that I'm not sure if releasing the register BAR and
>> relocating it works with amdgpu.
> If the BAR can't be relocated, that sounds like a hardware defect.  If
> that's really the case, you could mark it IORESOURCE_PCI_FIXED so we
> don't move it.  Or if it's an amdgpu defect, e.g., if amdgpu doesn't
> re-read the resource addresses after pci_enable_device(), you should
> fix amdgpu.

No, the issue is not the hardware but rather the fact that the I/O BAR 
might be used by more than one driver in the system.

We also have an alsa audio driver using this to program it's DMA for 
HDMI/DP audio and there is also the kicked out driver which might should 
take over again if amdgpu ever unloads.

I need to double check how those two work.

Regards,
Christian.

>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/PCI/pci.txt#n255


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
@ 2017-06-06 11:51             ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-06-06 11:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
> On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian König wrote:
>> Hi Bjorn,
>>
>> sorry for not responding earlier and thanks for picking this thread
>> up again.
>>
>> Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
>>> [+cc ADMGPU, DRM folks]
>>>
>>> On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
>>>> [SNIP]
>>>> +/**
>>>> + * amdgpu_resize_bar0 - try to resize BAR0
>>>> + *
>>>> + * @adev: amdgpu_device pointer
>>>> + *
>>>> + * Try to resize BAR0 to make all VRAM CPU accessible.
>>>> + */
>>>> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
>>>> +{
>>>> +	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
>>>> +	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
>>>> +	u16 cmd;
>>>> +	int r;
>>>> +
>>>> +	/* Free the doorbell mapping, it most likely needs to move as well */
>>>> +	amdgpu_doorbell_fini(adev);
>>>> +	pci_release_resource(adev->pdev, 2);
>>>> +
>>>> +	/* Disable memory decoding while we change the BAR addresses and size */
>>>> +	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
>>>> +	pci_write_config_word(adev->pdev, PCI_COMMAND,
>>>> +			      cmd & ~PCI_COMMAND_MEMORY);
>>>> +
>>>> +	r = pci_resize_resource(adev->pdev, 0, rbar_size);
>>>> +	if (r == -ENOSPC)
>>>> +		DRM_INFO("Not enough PCI address space for a large BAR.");
>>>> +	else if (r && r != -ENOTSUPP)
>>>> +		DRM_ERROR("Problem resizing BAR0 (%d).", r);
>>>> +
>>>> +	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
>>>> +
>>>> +	/* When the doorbell BAR isn't available we have no chance of
>>>> +	 * using the device.
>>>> +	 */
>>>> +	BUG_ON(amdgpu_doorbell_init(adev));
>>> This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
>>> right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
>>> "adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
>>> here.
>>>
>>> This is the call graph:
>>>
>>>    amdgpu_device_init
>>>      adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for < BONAIRE
>>>      adev->rmmio = ioremap(adev->rmmio_base, ...)
>>>      DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base)
>>>      if (adev->asic_type >= CHIP_BONAIRE) {
>>>        amdgpu_doorbell_init
>>> 	adev->doorbell.base = pci_resource_start(adev->pdev, 2)
>>> 	adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
>>>      }
>>>      amdgpu_init
>>>        gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
>>> 	gmc_v7_0_mc_init
>>>    +         amdgpu_resize_bar0
>>>    +           amdgpu_doorbell_fini
>>>    +           pci_release_resource(adev->pdev, 2)
>>>    +           pci_resize_resource(adev->pdev, 0, size)
>>>    +           amdgpu_doorbell_init
>>> 	  adev->mc.aper_base = pci_resource_start(adev->pdev, 0)
>>>
>>> If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
>>> amdgpu_device_init(), then we released the resource here and never
>>> updated the ioremap.
>> The first hardware with a resizeable BAR I could find is a Tonga,
>> and that is even a generation later than Bonaire.
>>
>> So we are never going to call this code on earlier hardware generations.
> The problem with that is that it's impossible for a code reader to
> verify that.  So adding a check is ugly but I think makes it more
> readable.

Good point. I will just move the check into the function itself, that 
should handle all such cases.

>>>  From the PCI core perspective, it would be much cleaner to do the BAR
>>> resize before the driver calls pci_enable_device().  If that could be
>>> done, there would be no need for this sort of shutdown/reinit stuff
>>> and we wouldn't have to worry about issues like these.  The amdgpu
>>> init path is pretty complicated, so I don't know whether this is
>>> possible.
>> I completely agree on this and it is actually the approach I tried first.
>>
>> There are just two problems with this approach:
>> 1. When the amdgpu driver is loaded there can already be the VGA
>> console, Vesa or EFI driver active for the device and displaying the
>> splash screen.
>>
>> When we resize and most likely relocate the BAR while those drivers
>> are active it will certainly cause problems.
>>
>> What amdgpu does before trying to resize the BAR is kicking out
>> other driver and making sure it has exclusive access to the
>> hardware.
> I don't understand the problem here yet.  If you need to enable the
> device, then disable it, resize, and re-enable it, that's fine.

The issue is we never enable the device ourself in amdgpu, except for 
some rare cases during resume.

In most of the cases we have to handle this is the primary display 
device which is enabled by either the BIOS, VGA console, VesaFB or 
EFIFB. Amdgpu just kicks out whatever driver was responsible for the 
device previously and takes over.

I could of course do the disable/resize/reenable dance, but I would 
rather want to avoid that.

The hardware is most likely already displaying a boot splash and we want 
to transit to the desktop without any flickering (at least that's the 
long term goal). Completely disabling the device to do this doesn't 
sounds like a good idea if we want that.

> The important thing I'm looking for is that the resize happens before
> a pci_enable_device(), because pci_enable_device() is the sync point
> where the PCI core enables resources and makes them available to the
> driver.  Drivers know that they can't look at the resources before
> that point.  There's a little bit of text about this in [1].

Yeah, I understand that. But wouldn't it be sufficient to just disable 
memory decoding during the resize?

I can easily guarantee that the CPU isn't accessing the BAR during the 
time (we need to do this for changing the memory clocks as well), but I 
have a bad gut feeling completely turning of the device while we are 
still displaying stuff.

>> 2. Without taking a look at the registers you don't know how much
>> memory there actually is on the board.
>>
>> We could always resize it to the maximum supported, but that would
>> mean we could easily waste 128GB of address space while the hardware
>> only has 8GB of VRAM.
>>
>> That would not necessarily hurt us when we have enough address
>> space, but at least kind of sucks.
> Enable, read regs, disable, kick out other drivers, resize, enable.
> Does that solve this problem?

Yeah, that sounds like it should do it. I will try and take a look if 
that works or not.

>>> I would also like to simplify the driver usage model and get the
>>> PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
>>> Ideally, the driver would do something like this:
>>>
>>>    pci_resize_resource(adev->pdev, 0, rbar_size);
>>>    pci_enable_device(adev->pdev);
>>>
>>> And the PCI core would be something along these lines:
>>>
>>>    int pci_resize_resource(dev, bar, size)
>>>    {
>>>      if (pci_is_enabled(dev))
>>>        return -EBUSY;
>>>
>>>      pci_disable_decoding(dev);          # turn off MEM, IO decoding
>>>      pci_release_resources(dev);         # (all of them)
>>>      err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
>>>      if (err)
>>>        return err;
>>>
>>>      pci_assign_resources(dev);          # reassign all "dev" resources
>>>      return 0;
>>>    }
>> I already tried the approach with releasing all resources, but it
>> didn't worked so well.
>>
>> When resizing fails because we don't have enough address space then
>> we at least want to get back to a working config.
>>
>> Releasing everything makes that rather tricky, since I would then
>> need to keep a backup of the old config and try to restore it.
> If resizing fails because of lack of address space, I would expect the
> PCI core to at least restore to the previous state.  If it doesn't, I
> think that would be a defect.

I completely agree, but this is unfortunately what happens.

The allocation and alignment functions in the PCI core sometimes doesn't 
seem to be able to restore the old config.

I mean just try pci=realloc. The core sometimes doesn't seem to be able 
to come up with a valid config with that.

> Having the driver specify the BARs it thinks might cause issues feels
> like a crutch.
Why? The driver just releases what is necessary to move the BAR.

>
>> Additional to that I'm not sure if releasing the register BAR and
>> relocating it works with amdgpu.
> If the BAR can't be relocated, that sounds like a hardware defect.  If
> that's really the case, you could mark it IORESOURCE_PCI_FIXED so we
> don't move it.  Or if it's an amdgpu defect, e.g., if amdgpu doesn't
> re-read the resource addresses after pci_enable_device(), you should
> fix amdgpu.

No, the issue is not the hardware but rather the fact that the I/O BAR 
might be used by more than one driver in the system.

We also have an alsa audio driver using this to program it's DMA for 
HDMI/DP audio and there is also the kicked out driver which might should 
take over again if amdgpu ever unloads.

I need to double check how those two work.

Regards,
Christian.

>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/PCI/pci.txt#n255

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-06 11:51             ` Christian König
@ 2017-06-06 13:53                 ` Alex Deucher
  -1 siblings, 0 replies; 23+ messages in thread
From: Alex Deucher @ 2017-06-06 13:53 UTC (permalink / raw)
  To: Christian König
  Cc: David Airlie, Linux PCI, amd-gfx list,
	platform-driver-x86-u79uwXL29TY76Z2rM5mHXA, Bjorn Helgaas,
	Maling list - DRI developers, Alex Deucher

On Tue, Jun 6, 2017 at 7:51 AM, Christian König <deathsimple@vodafone.de> wrote:
> Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
>>
>> On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian König wrote:
>>>
>>> Hi Bjorn,
>>>
>>> sorry for not responding earlier and thanks for picking this thread
>>> up again.
>>>
>>> Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
>>>>
>>>> [+cc ADMGPU, DRM folks]
>>>>
>>>> On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
>>>>>
>>>>> [SNIP]
>>>>> +/**
>>>>> + * amdgpu_resize_bar0 - try to resize BAR0
>>>>> + *
>>>>> + * @adev: amdgpu_device pointer
>>>>> + *
>>>>> + * Try to resize BAR0 to make all VRAM CPU accessible.
>>>>> + */
>>>>> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
>>>>> +{
>>>>> +       u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
>>>>> +       u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
>>>>> +       u16 cmd;
>>>>> +       int r;
>>>>> +
>>>>> +       /* Free the doorbell mapping, it most likely needs to move as
>>>>> well */
>>>>> +       amdgpu_doorbell_fini(adev);
>>>>> +       pci_release_resource(adev->pdev, 2);
>>>>> +
>>>>> +       /* Disable memory decoding while we change the BAR addresses
>>>>> and size */
>>>>> +       pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
>>>>> +       pci_write_config_word(adev->pdev, PCI_COMMAND,
>>>>> +                             cmd & ~PCI_COMMAND_MEMORY);
>>>>> +
>>>>> +       r = pci_resize_resource(adev->pdev, 0, rbar_size);
>>>>> +       if (r == -ENOSPC)
>>>>> +               DRM_INFO("Not enough PCI address space for a large
>>>>> BAR.");
>>>>> +       else if (r && r != -ENOTSUPP)
>>>>> +               DRM_ERROR("Problem resizing BAR0 (%d).", r);
>>>>> +
>>>>> +       pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
>>>>> +
>>>>> +       /* When the doorbell BAR isn't available we have no chance of
>>>>> +        * using the device.
>>>>> +        */
>>>>> +       BUG_ON(amdgpu_doorbell_init(adev));
>>>>
>>>> This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
>>>> right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
>>>> "adev->asic_type >= CHIP_BONAIRE", but we call it unconditionally
>>>> here.
>>>>
>>>> This is the call graph:
>>>>
>>>>    amdgpu_device_init
>>>>      adev->rmmio_base = pci_resource_start(adev->pdev, 5)   # 2 for <
>>>> BONAIRE
>>>>      adev->rmmio = ioremap(adev->rmmio_base, ...)
>>>>      DRM_INFO("register mmio base: 0x%08X\n",
>>>> (uint32_t)adev->rmmio_base)
>>>>      if (adev->asic_type >= CHIP_BONAIRE) {
>>>>        amdgpu_doorbell_init
>>>>         adev->doorbell.base = pci_resource_start(adev->pdev, 2)
>>>>         adev->doorbell.ptr = ioremap(adev->doorbell.base, ...)
>>>>      }
>>>>      amdgpu_init
>>>>        gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
>>>>         gmc_v7_0_mc_init
>>>>    +         amdgpu_resize_bar0
>>>>    +           amdgpu_doorbell_fini
>>>>    +           pci_release_resource(adev->pdev, 2)
>>>>    +           pci_resize_resource(adev->pdev, 0, size)
>>>>    +           amdgpu_doorbell_init
>>>>           adev->mc.aper_base = pci_resource_start(adev->pdev, 0)
>>>>
>>>> If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
>>>> amdgpu_device_init(), then we released the resource here and never
>>>> updated the ioremap.
>>>
>>> The first hardware with a resizeable BAR I could find is a Tonga,
>>> and that is even a generation later than Bonaire.
>>>
>>> So we are never going to call this code on earlier hardware generations.
>>
>> The problem with that is that it's impossible for a code reader to
>> verify that.  So adding a check is ugly but I think makes it more
>> readable.
>
>
> Good point. I will just move the check into the function itself, that should
> handle all such cases.
>
>>>>  From the PCI core perspective, it would be much cleaner to do the BAR
>>>> resize before the driver calls pci_enable_device().  If that could be
>>>> done, there would be no need for this sort of shutdown/reinit stuff
>>>> and we wouldn't have to worry about issues like these.  The amdgpu
>>>> init path is pretty complicated, so I don't know whether this is
>>>> possible.
>>>
>>> I completely agree on this and it is actually the approach I tried first.
>>>
>>> There are just two problems with this approach:
>>> 1. When the amdgpu driver is loaded there can already be the VGA
>>> console, Vesa or EFI driver active for the device and displaying the
>>> splash screen.
>>>
>>> When we resize and most likely relocate the BAR while those drivers
>>> are active it will certainly cause problems.
>>>
>>> What amdgpu does before trying to resize the BAR is kicking out
>>> other driver and making sure it has exclusive access to the
>>> hardware.
>>
>> I don't understand the problem here yet.  If you need to enable the
>> device, then disable it, resize, and re-enable it, that's fine.
>
>
> The issue is we never enable the device ourself in amdgpu, except for some
> rare cases during resume.
>
> In most of the cases we have to handle this is the primary display device
> which is enabled by either the BIOS, VGA console, VesaFB or EFIFB. Amdgpu
> just kicks out whatever driver was responsible for the device previously and
> takes over.
>
> I could of course do the disable/resize/reenable dance, but I would rather
> want to avoid that.
>
> The hardware is most likely already displaying a boot splash and we want to
> transit to the desktop without any flickering (at least that's the long term
> goal). Completely disabling the device to do this doesn't sounds like a good
> idea if we want that.
>
>> The important thing I'm looking for is that the resize happens before
>> a pci_enable_device(), because pci_enable_device() is the sync point
>> where the PCI core enables resources and makes them available to the
>> driver.  Drivers know that they can't look at the resources before
>> that point.  There's a little bit of text about this in [1].
>
>
> Yeah, I understand that. But wouldn't it be sufficient to just disable
> memory decoding during the resize?
>
> I can easily guarantee that the CPU isn't accessing the BAR during the time
> (we need to do this for changing the memory clocks as well), but I have a
> bad gut feeling completely turning of the device while we are still
> displaying stuff.
>
>>> 2. Without taking a look at the registers you don't know how much
>>> memory there actually is on the board.
>>>
>>> We could always resize it to the maximum supported, but that would
>>> mean we could easily waste 128GB of address space while the hardware
>>> only has 8GB of VRAM.
>>>
>>> That would not necessarily hurt us when we have enough address
>>> space, but at least kind of sucks.
>>
>> Enable, read regs, disable, kick out other drivers, resize, enable.
>> Does that solve this problem?
>
>
> Yeah, that sounds like it should do it. I will try and take a look if that
> works or not.
>
>
>>>> I would also like to simplify the driver usage model and get the
>>>> PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
>>>> Ideally, the driver would do something like this:
>>>>
>>>>    pci_resize_resource(adev->pdev, 0, rbar_size);
>>>>    pci_enable_device(adev->pdev);
>>>>
>>>> And the PCI core would be something along these lines:
>>>>
>>>>    int pci_resize_resource(dev, bar, size)
>>>>    {
>>>>      if (pci_is_enabled(dev))
>>>>        return -EBUSY;
>>>>
>>>>      pci_disable_decoding(dev);          # turn off MEM, IO decoding
>>>>      pci_release_resources(dev);         # (all of them)
>>>>      err = pci_resize_bar(dev, bar, size);     # change BAR size (only)
>>>>      if (err)
>>>>        return err;
>>>>
>>>>      pci_assign_resources(dev);          # reassign all "dev" resources
>>>>      return 0;
>>>>    }
>>>
>>> I already tried the approach with releasing all resources, but it
>>> didn't worked so well.
>>>
>>> When resizing fails because we don't have enough address space then
>>> we at least want to get back to a working config.
>>>
>>> Releasing everything makes that rather tricky, since I would then
>>> need to keep a backup of the old config and try to restore it.
>>
>> If resizing fails because of lack of address space, I would expect the
>> PCI core to at least restore to the previous state.  If it doesn't, I
>> think that would be a defect.
>
>
> I completely agree, but this is unfortunately what happens.
>
> The allocation and alignment functions in the PCI core sometimes doesn't
> seem to be able to restore the old config.
>
> I mean just try pci=realloc. The core sometimes doesn't seem to be able to
> come up with a valid config with that.
>
>> Having the driver specify the BARs it thinks might cause issues feels
>> like a crutch.
>
> Why? The driver just releases what is necessary to move the BAR.
>
>>
>>> Additional to that I'm not sure if releasing the register BAR and
>>> relocating it works with amdgpu.
>>
>> If the BAR can't be relocated, that sounds like a hardware defect.  If
>> that's really the case, you could mark it IORESOURCE_PCI_FIXED so we
>> don't move it.  Or if it's an amdgpu defect, e.g., if amdgpu doesn't
>> re-read the resource addresses after pci_enable_device(), you should
>> fix amdgpu.
>
>
> No, the issue is not the hardware but rather the fact that the I/O BAR might
> be used by more than one driver in the system.
>
> We also have an alsa audio driver using this to program it's DMA for HDMI/DP
> audio and there is also the kicked out driver which might should take over
> again if amdgpu ever unloads.

I think you'll only see this on chips with ACP (Carrizo and Stoney)
and I think the soc alsa driver gets loaded via a hotplug call from
amdgpu when it detects ACP so as long as we do the resize before
probing ACP we should be fine.  On RV, ACP moves to it's own pci
device and the HDA audio on other devices is exposed via it's own pci
device.

Alex


>
> I need to double check how those two work.
>
> Regards,
> Christian.
>
>
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/PCI/pci.txt#n255
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
@ 2017-06-06 13:53                 ` Alex Deucher
  0 siblings, 0 replies; 23+ messages in thread
From: Alex Deucher @ 2017-06-06 13:53 UTC (permalink / raw)
  To: Christian König
  Cc: Bjorn Helgaas, David Airlie, Linux PCI,
	Maling list - DRI developers, platform-driver-x86, amd-gfx list,
	Alex Deucher

On Tue, Jun 6, 2017 at 7:51 AM, Christian K=C3=B6nig <deathsimple@vodafone.=
de> wrote:
> Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
>>
>> On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian K=C3=B6nig wrote:
>>>
>>> Hi Bjorn,
>>>
>>> sorry for not responding earlier and thanks for picking this thread
>>> up again.
>>>
>>> Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
>>>>
>>>> [+cc ADMGPU, DRM folks]
>>>>
>>>> On Tue, May 09, 2017 at 06:49:07PM +0200, Christian K=C3=B6nig wrote:
>>>>>
>>>>> [SNIP]
>>>>> +/**
>>>>> + * amdgpu_resize_bar0 - try to resize BAR0
>>>>> + *
>>>>> + * @adev: amdgpu_device pointer
>>>>> + *
>>>>> + * Try to resize BAR0 to make all VRAM CPU accessible.
>>>>> + */
>>>>> +void amdgpu_resize_bar0(struct amdgpu_device *adev)
>>>>> +{
>>>>> +       u64 space_needed =3D roundup_pow_of_two(adev->mc.real_vram_si=
ze);
>>>>> +       u32 rbar_size =3D order_base_2(((space_needed >> 20) | 1)) -1=
;
>>>>> +       u16 cmd;
>>>>> +       int r;
>>>>> +
>>>>> +       /* Free the doorbell mapping, it most likely needs to move as
>>>>> well */
>>>>> +       amdgpu_doorbell_fini(adev);
>>>>> +       pci_release_resource(adev->pdev, 2);
>>>>> +
>>>>> +       /* Disable memory decoding while we change the BAR addresses
>>>>> and size */
>>>>> +       pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
>>>>> +       pci_write_config_word(adev->pdev, PCI_COMMAND,
>>>>> +                             cmd & ~PCI_COMMAND_MEMORY);
>>>>> +
>>>>> +       r =3D pci_resize_resource(adev->pdev, 0, rbar_size);
>>>>> +       if (r =3D=3D -ENOSPC)
>>>>> +               DRM_INFO("Not enough PCI address space for a large
>>>>> BAR.");
>>>>> +       else if (r && r !=3D -ENOTSUPP)
>>>>> +               DRM_ERROR("Problem resizing BAR0 (%d).", r);
>>>>> +
>>>>> +       pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
>>>>> +
>>>>> +       /* When the doorbell BAR isn't available we have no chance of
>>>>> +        * using the device.
>>>>> +        */
>>>>> +       BUG_ON(amdgpu_doorbell_init(adev));
>>>>
>>>> This amdgpu_doorbell_fini()/amdgpu_doorbell_init() thing doesn't look
>>>> right.  amdgpu_device_init() only calls amdgpu_doorbell_init() for
>>>> "adev->asic_type >=3D CHIP_BONAIRE", but we call it unconditionally
>>>> here.
>>>>
>>>> This is the call graph:
>>>>
>>>>    amdgpu_device_init
>>>>      adev->rmmio_base =3D pci_resource_start(adev->pdev, 5)   # 2 for =
<
>>>> BONAIRE
>>>>      adev->rmmio =3D ioremap(adev->rmmio_base, ...)
>>>>      DRM_INFO("register mmio base: 0x%08X\n",
>>>> (uint32_t)adev->rmmio_base)
>>>>      if (adev->asic_type >=3D CHIP_BONAIRE) {
>>>>        amdgpu_doorbell_init
>>>>         adev->doorbell.base =3D pci_resource_start(adev->pdev, 2)
>>>>         adev->doorbell.ptr =3D ioremap(adev->doorbell.base, ...)
>>>>      }
>>>>      amdgpu_init
>>>>        gmc_v7_0_sw_init              # gmc_v7_0_ip_funcs.sw_init
>>>>         gmc_v7_0_mc_init
>>>>    +         amdgpu_resize_bar0
>>>>    +           amdgpu_doorbell_fini
>>>>    +           pci_release_resource(adev->pdev, 2)
>>>>    +           pci_resize_resource(adev->pdev, 0, size)
>>>>    +           amdgpu_doorbell_init
>>>>           adev->mc.aper_base =3D pci_resource_start(adev->pdev, 0)
>>>>
>>>> If "asic_type < CHIP_BONAIRE", we ioremapped BAR 2 in
>>>> amdgpu_device_init(), then we released the resource here and never
>>>> updated the ioremap.
>>>
>>> The first hardware with a resizeable BAR I could find is a Tonga,
>>> and that is even a generation later than Bonaire.
>>>
>>> So we are never going to call this code on earlier hardware generations=
.
>>
>> The problem with that is that it's impossible for a code reader to
>> verify that.  So adding a check is ugly but I think makes it more
>> readable.
>
>
> Good point. I will just move the check into the function itself, that sho=
uld
> handle all such cases.
>
>>>>  From the PCI core perspective, it would be much cleaner to do the BAR
>>>> resize before the driver calls pci_enable_device().  If that could be
>>>> done, there would be no need for this sort of shutdown/reinit stuff
>>>> and we wouldn't have to worry about issues like these.  The amdgpu
>>>> init path is pretty complicated, so I don't know whether this is
>>>> possible.
>>>
>>> I completely agree on this and it is actually the approach I tried firs=
t.
>>>
>>> There are just two problems with this approach:
>>> 1. When the amdgpu driver is loaded there can already be the VGA
>>> console, Vesa or EFI driver active for the device and displaying the
>>> splash screen.
>>>
>>> When we resize and most likely relocate the BAR while those drivers
>>> are active it will certainly cause problems.
>>>
>>> What amdgpu does before trying to resize the BAR is kicking out
>>> other driver and making sure it has exclusive access to the
>>> hardware.
>>
>> I don't understand the problem here yet.  If you need to enable the
>> device, then disable it, resize, and re-enable it, that's fine.
>
>
> The issue is we never enable the device ourself in amdgpu, except for som=
e
> rare cases during resume.
>
> In most of the cases we have to handle this is the primary display device
> which is enabled by either the BIOS, VGA console, VesaFB or EFIFB. Amdgpu
> just kicks out whatever driver was responsible for the device previously =
and
> takes over.
>
> I could of course do the disable/resize/reenable dance, but I would rathe=
r
> want to avoid that.
>
> The hardware is most likely already displaying a boot splash and we want =
to
> transit to the desktop without any flickering (at least that's the long t=
erm
> goal). Completely disabling the device to do this doesn't sounds like a g=
ood
> idea if we want that.
>
>> The important thing I'm looking for is that the resize happens before
>> a pci_enable_device(), because pci_enable_device() is the sync point
>> where the PCI core enables resources and makes them available to the
>> driver.  Drivers know that they can't look at the resources before
>> that point.  There's a little bit of text about this in [1].
>
>
> Yeah, I understand that. But wouldn't it be sufficient to just disable
> memory decoding during the resize?
>
> I can easily guarantee that the CPU isn't accessing the BAR during the ti=
me
> (we need to do this for changing the memory clocks as well), but I have a
> bad gut feeling completely turning of the device while we are still
> displaying stuff.
>
>>> 2. Without taking a look at the registers you don't know how much
>>> memory there actually is on the board.
>>>
>>> We could always resize it to the maximum supported, but that would
>>> mean we could easily waste 128GB of address space while the hardware
>>> only has 8GB of VRAM.
>>>
>>> That would not necessarily hurt us when we have enough address
>>> space, but at least kind of sucks.
>>
>> Enable, read regs, disable, kick out other drivers, resize, enable.
>> Does that solve this problem?
>
>
> Yeah, that sounds like it should do it. I will try and take a look if tha=
t
> works or not.
>
>
>>>> I would also like to simplify the driver usage model and get the
>>>> PCI_COMMAND_MEMORY twiddling into the PCI core instead of the driver.
>>>> Ideally, the driver would do something like this:
>>>>
>>>>    pci_resize_resource(adev->pdev, 0, rbar_size);
>>>>    pci_enable_device(adev->pdev);
>>>>
>>>> And the PCI core would be something along these lines:
>>>>
>>>>    int pci_resize_resource(dev, bar, size)
>>>>    {
>>>>      if (pci_is_enabled(dev))
>>>>        return -EBUSY;
>>>>
>>>>      pci_disable_decoding(dev);          # turn off MEM, IO decoding
>>>>      pci_release_resources(dev);         # (all of them)
>>>>      err =3D pci_resize_bar(dev, bar, size);     # change BAR size (on=
ly)
>>>>      if (err)
>>>>        return err;
>>>>
>>>>      pci_assign_resources(dev);          # reassign all "dev" resource=
s
>>>>      return 0;
>>>>    }
>>>
>>> I already tried the approach with releasing all resources, but it
>>> didn't worked so well.
>>>
>>> When resizing fails because we don't have enough address space then
>>> we at least want to get back to a working config.
>>>
>>> Releasing everything makes that rather tricky, since I would then
>>> need to keep a backup of the old config and try to restore it.
>>
>> If resizing fails because of lack of address space, I would expect the
>> PCI core to at least restore to the previous state.  If it doesn't, I
>> think that would be a defect.
>
>
> I completely agree, but this is unfortunately what happens.
>
> The allocation and alignment functions in the PCI core sometimes doesn't
> seem to be able to restore the old config.
>
> I mean just try pci=3Drealloc. The core sometimes doesn't seem to be able=
 to
> come up with a valid config with that.
>
>> Having the driver specify the BARs it thinks might cause issues feels
>> like a crutch.
>
> Why? The driver just releases what is necessary to move the BAR.
>
>>
>>> Additional to that I'm not sure if releasing the register BAR and
>>> relocating it works with amdgpu.
>>
>> If the BAR can't be relocated, that sounds like a hardware defect.  If
>> that's really the case, you could mark it IORESOURCE_PCI_FIXED so we
>> don't move it.  Or if it's an amdgpu defect, e.g., if amdgpu doesn't
>> re-read the resource addresses after pci_enable_device(), you should
>> fix amdgpu.
>
>
> No, the issue is not the hardware but rather the fact that the I/O BAR mi=
ght
> be used by more than one driver in the system.
>
> We also have an alsa audio driver using this to program it's DMA for HDMI=
/DP
> audio and there is also the kicked out driver which might should take ove=
r
> again if amdgpu ever unloads.

I think you'll only see this on chips with ACP (Carrizo and Stoney)
and I think the soc alsa driver gets loaded via a hotplug call from
amdgpu when it detects ACP so as long as we do the resize before
probing ACP we should be fine.  On RV, ACP moves to it's own pci
device and the HDA audio on other devices is exposed via it's own pci
device.

Alex


>
> I need to double check how those two work.
>
> Regards,
> Christian.
>
>
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/=
Documentation/PCI/pci.txt#n255
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-06 11:51             ` Christian König
  (?)
  (?)
@ 2017-06-06 23:10             ` Bjorn Helgaas
  2017-06-07  3:50                 ` Deucher, Alexander
       [not found]               ` <20170606231019.GC12672-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
  -1 siblings, 2 replies; 23+ messages in thread
From: Bjorn Helgaas @ 2017-06-06 23:10 UTC (permalink / raw)
  To: Christian König
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

On Tue, Jun 06, 2017 at 01:51:11PM +0200, Christian König wrote:
> Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
> >On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian König wrote:
> >>Hi Bjorn,
> >>
> >>sorry for not responding earlier and thanks for picking this thread
> >>up again.
> >>
> >>Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
> >>>[+cc ADMGPU, DRM folks]
> >>>
> >>>On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
> >>>>[SNIP]
> >>>>+/**
> >>>>+ * amdgpu_resize_bar0 - try to resize BAR0
> >>>>+ *
> >>>>+ * @adev: amdgpu_device pointer
> >>>>+ *
> >>>>+ * Try to resize BAR0 to make all VRAM CPU accessible.
> >>>>+ */
> >>>>+void amdgpu_resize_bar0(struct amdgpu_device *adev)
> >>>>+{
> >>>>+	u64 space_needed = roundup_pow_of_two(adev->mc.real_vram_size);
> >>>>+	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -1;
> >>>>+	u16 cmd;
> >>>>+	int r;
> >>>>+
> >>>>+	/* Free the doorbell mapping, it most likely needs to move as well */
> >>>>+	amdgpu_doorbell_fini(adev);
> >>>>+	pci_release_resource(adev->pdev, 2);
> >>>>+
> >>>>+	/* Disable memory decoding while we change the BAR addresses and size */
> >>>>+	pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
> >>>>+	pci_write_config_word(adev->pdev, PCI_COMMAND,
> >>>>+			      cmd & ~PCI_COMMAND_MEMORY);
> >>>>+
> >>>>+	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> >>>>+	if (r == -ENOSPC)
> >>>>+		DRM_INFO("Not enough PCI address space for a large BAR.");
> >>>>+	else if (r && r != -ENOTSUPP)
> >>>>+		DRM_ERROR("Problem resizing BAR0 (%d).", r);
> >>>>+
> >>>>+	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
> >>>>+
> >>>>+	/* When the doorbell BAR isn't available we have no chance of
> >>>>+	 * using the device.
> >>>>+	 */
> >>>>+	BUG_ON(amdgpu_doorbell_init(adev));

> >>> From the PCI core perspective, it would be much cleaner to do the BAR
> >>>resize before the driver calls pci_enable_device().  If that could be
> >>>done, there would be no need for this sort of shutdown/reinit stuff
> >>>and we wouldn't have to worry about issues like these.  The amdgpu
> >>>init path is pretty complicated, so I don't know whether this is
> >>>possible.
> >>I completely agree on this and it is actually the approach I tried first.
> >>
> >>There are just two problems with this approach:
> >>1. When the amdgpu driver is loaded there can already be the VGA
> >>console, Vesa or EFI driver active for the device and displaying the
> >>splash screen.
> >>
> >>When we resize and most likely relocate the BAR while those drivers
> >>are active it will certainly cause problems.
> >>
> >>What amdgpu does before trying to resize the BAR is kicking out
> >>other driver and making sure it has exclusive access to the
> >>hardware.
> >I don't understand the problem here yet.  If you need to enable the
> >device, then disable it, resize, and re-enable it, that's fine.
> 
> The issue is we never enable the device ourself in amdgpu, except
> for some rare cases during resume.
> 
> In most of the cases we have to handle this is the primary display
> device which is enabled by either the BIOS, VGA console, VesaFB or
> EFIFB. Amdgpu just kicks out whatever driver was responsible for the
> device previously and takes over.
> 
> I could of course do the disable/resize/reenable dance, but I would
> rather want to avoid that.
> 
> The hardware is most likely already displaying a boot splash and we
> want to transit to the desktop without any flickering (at least
> that's the long term goal). Completely disabling the device to do
> this doesn't sounds like a good idea if we want that.
> 
> >The important thing I'm looking for is that the resize happens before
> >a pci_enable_device(), because pci_enable_device() is the sync point
> >where the PCI core enables resources and makes them available to the
> >driver.  Drivers know that they can't look at the resources before
> >that point.  There's a little bit of text about this in [1].
> 
> Yeah, I understand that. But wouldn't it be sufficient to just
> disable memory decoding during the resize?
> 
> I can easily guarantee that the CPU isn't accessing the BAR during
> the time (we need to do this for changing the memory clocks as
> well), but I have a bad gut feeling completely turning of the device
> while we are still displaying stuff.

pci_disable_device() doesn't turn off the device; it only disables bus
mastering (and some of the arch-specific pcibios_disable_device()
implementations do a little more).  But it's certainly the wrong
direction -- it disables DMA, which has nothing to do with the BAR
decoding we're interested in.

What if the driver did something like this:

  pci_disable_decoding(dev, IORESOURCE_MEM);
  pci_release_resource(dev, 2);
  pci_resize_bar(dev, bar, size);
  pci_assign_resources(dev);
  pci_enable_decoding(dev, IORESOURCE_MEM);

That would require adding pci_enable/disable_decoding() to the driver
API, along with the requirement that the driver discard and remap
some resources after pci_enable_decoding().  I think
pci_enable_decoding() would look much like the existing
pci_enable_resources() except taking a resource type instead of a
bitmask.

I would *prefer* if we released and reassigned all resources, because
then the core has complete flexibility to move things around, and it's
easy to document that pci_assign_resources() means you have to
reread/remap everything.

If the driver only releases specified BARs, the pci_assign_resources()
interface becomes "you need to reread/remap the BAR you resized, plus
any other BARs you released".  It's a little more complicated to
describe and more dependent on previous driver actions.

But releasing only the specified BAR will make your life easier and
help with the fact that multiple drivers might be using the same BAR
(I have to raise my eyebrows at that), so I think I'm OK with it.  And
it would also side-step the "can't restore previous state" problem.

It's an "interesting" asymmetry that pci_enable_device() turns on BAR
decoding but doesn't touch bus mastering, while pci_disable_device()
turns off bus mastering but doesn't touch BAR decoding.

Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-06 23:10             ` Bjorn Helgaas
@ 2017-06-07  3:50                 ` Deucher, Alexander
       [not found]               ` <20170606231019.GC12672-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
  1 sibling, 0 replies; 23+ messages in thread
From: Deucher, Alexander @ 2017-06-07  3:50 UTC (permalink / raw)
  To: 'Bjorn Helgaas', Christian König
  Cc: linux-pci, dri-devel, amd-gfx, platform-driver-x86

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: Tuesday, June 06, 2017 7:10 PM
> To: Christian König
> Cc: linux-pci@vger.kernel.org; platform-driver-x86@vger.kernel.org;
> Deucher, Alexander; David Airlie; amd-gfx@lists.freedesktop.org; dri-
> devel@lists.freedesktop.org
> Subject: Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access
> v2
> 
> On Tue, Jun 06, 2017 at 01:51:11PM +0200, Christian König wrote:
> > Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
> > >On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian König wrote:
> > >>Hi Bjorn,
> > >>
> > >>sorry for not responding earlier and thanks for picking this thread
> > >>up again.
> > >>
> > >>Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
> > >>>[+cc ADMGPU, DRM folks]
> > >>>
> > >>>On Tue, May 09, 2017 at 06:49:07PM +0200, Christian König wrote:
> > >>>>[SNIP]
> > >>>>+/**
> > >>>>+ * amdgpu_resize_bar0 - try to resize BAR0
> > >>>>+ *
> > >>>>+ * @adev: amdgpu_device pointer
> > >>>>+ *
> > >>>>+ * Try to resize BAR0 to make all VRAM CPU accessible.
> > >>>>+ */
> > >>>>+void amdgpu_resize_bar0(struct amdgpu_device *adev)
> > >>>>+{
> > >>>>+	u64 space_needed = roundup_pow_of_two(adev-
> >mc.real_vram_size);
> > >>>>+	u32 rbar_size = order_base_2(((space_needed >> 20) | 1)) -
> 1;
> > >>>>+	u16 cmd;
> > >>>>+	int r;
> > >>>>+
> > >>>>+	/* Free the doorbell mapping, it most likely needs to move as
> well */
> > >>>>+	amdgpu_doorbell_fini(adev);
> > >>>>+	pci_release_resource(adev->pdev, 2);
> > >>>>+
> > >>>>+	/* Disable memory decoding while we change the BAR
> addresses and size */
> > >>>>+	pci_read_config_word(adev->pdev, PCI_COMMAND,
> &cmd);
> > >>>>+	pci_write_config_word(adev->pdev, PCI_COMMAND,
> > >>>>+			      cmd & ~PCI_COMMAND_MEMORY);
> > >>>>+
> > >>>>+	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> > >>>>+	if (r == -ENOSPC)
> > >>>>+		DRM_INFO("Not enough PCI address space for a
> large BAR.");
> > >>>>+	else if (r && r != -ENOTSUPP)
> > >>>>+		DRM_ERROR("Problem resizing BAR0 (%d).", r);
> > >>>>+
> > >>>>+	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
> > >>>>+
> > >>>>+	/* When the doorbell BAR isn't available we have no chance
> of
> > >>>>+	 * using the device.
> > >>>>+	 */
> > >>>>+	BUG_ON(amdgpu_doorbell_init(adev));
> 
> > >>> From the PCI core perspective, it would be much cleaner to do the BAR
> > >>>resize before the driver calls pci_enable_device().  If that could be
> > >>>done, there would be no need for this sort of shutdown/reinit stuff
> > >>>and we wouldn't have to worry about issues like these.  The amdgpu
> > >>>init path is pretty complicated, so I don't know whether this is
> > >>>possible.
> > >>I completely agree on this and it is actually the approach I tried first.
> > >>
> > >>There are just two problems with this approach:
> > >>1. When the amdgpu driver is loaded there can already be the VGA
> > >>console, Vesa or EFI driver active for the device and displaying the
> > >>splash screen.
> > >>
> > >>When we resize and most likely relocate the BAR while those drivers
> > >>are active it will certainly cause problems.
> > >>
> > >>What amdgpu does before trying to resize the BAR is kicking out
> > >>other driver and making sure it has exclusive access to the
> > >>hardware.
> > >I don't understand the problem here yet.  If you need to enable the
> > >device, then disable it, resize, and re-enable it, that's fine.
> >
> > The issue is we never enable the device ourself in amdgpu, except
> > for some rare cases during resume.
> >
> > In most of the cases we have to handle this is the primary display
> > device which is enabled by either the BIOS, VGA console, VesaFB or
> > EFIFB. Amdgpu just kicks out whatever driver was responsible for the
> > device previously and takes over.
> >
> > I could of course do the disable/resize/reenable dance, but I would
> > rather want to avoid that.
> >
> > The hardware is most likely already displaying a boot splash and we
> > want to transit to the desktop without any flickering (at least
> > that's the long term goal). Completely disabling the device to do
> > this doesn't sounds like a good idea if we want that.
> >
> > >The important thing I'm looking for is that the resize happens before
> > >a pci_enable_device(), because pci_enable_device() is the sync point
> > >where the PCI core enables resources and makes them available to the
> > >driver.  Drivers know that they can't look at the resources before
> > >that point.  There's a little bit of text about this in [1].
> >
> > Yeah, I understand that. But wouldn't it be sufficient to just
> > disable memory decoding during the resize?
> >
> > I can easily guarantee that the CPU isn't accessing the BAR during
> > the time (we need to do this for changing the memory clocks as
> > well), but I have a bad gut feeling completely turning of the device
> > while we are still displaying stuff.
> 
> pci_disable_device() doesn't turn off the device; it only disables bus
> mastering (and some of the arch-specific pcibios_disable_device()
> implementations do a little more).  But it's certainly the wrong
> direction -- it disables DMA, which has nothing to do with the BAR
> decoding we're interested in.
> 
> What if the driver did something like this:
> 
>   pci_disable_decoding(dev, IORESOURCE_MEM);
>   pci_release_resource(dev, 2);
>   pci_resize_bar(dev, bar, size);
>   pci_assign_resources(dev);
>   pci_enable_decoding(dev, IORESOURCE_MEM);
> 
> That would require adding pci_enable/disable_decoding() to the driver
> API, along with the requirement that the driver discard and remap
> some resources after pci_enable_decoding().  I think
> pci_enable_decoding() would look much like the existing
> pci_enable_resources() except taking a resource type instead of a
> bitmask.
> 
> I would *prefer* if we released and reassigned all resources, because
> then the core has complete flexibility to move things around, and it's
> easy to document that pci_assign_resources() means you have to
> reread/remap everything.
> 
> If the driver only releases specified BARs, the pci_assign_resources()
> interface becomes "you need to reread/remap the BAR you resized, plus
> any other BARs you released".  It's a little more complicated to
> describe and more dependent on previous driver actions.
> 
> But releasing only the specified BAR will make your life easier and
> help with the fact that multiple drivers might be using the same BAR
> (I have to raise my eyebrows at that), so I think I'm OK with it.  And

It's pretty standard on SoCs.  The kernel has the mfd infrastructure to support it.

Alex

> it would also side-step the "can't restore previous state" problem.
> 
> It's an "interesting" asymmetry that pci_enable_device() turns on BAR
> decoding but doesn't touch bus mastering, while pci_disable_device()
> turns off bus mastering but doesn't touch BAR decoding.
> 
> Bjorn
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
@ 2017-06-07  3:50                 ` Deucher, Alexander
  0 siblings, 0 replies; 23+ messages in thread
From: Deucher, Alexander @ 2017-06-07  3:50 UTC (permalink / raw)
  To: 'Bjorn Helgaas', Christian König
  Cc: linux-pci, platform-driver-x86, David Airlie, amd-gfx, dri-devel

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@kernel.org]
> Sent: Tuesday, June 06, 2017 7:10 PM
> To: Christian K=F6nig
> Cc: linux-pci@vger.kernel.org; platform-driver-x86@vger.kernel.org;
> Deucher, Alexander; David Airlie; amd-gfx@lists.freedesktop.org; dri-
> devel@lists.freedesktop.org
> Subject: Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access
> v2
>=20
> On Tue, Jun 06, 2017 at 01:51:11PM +0200, Christian K=F6nig wrote:
> > Am 02.06.2017 um 22:26 schrieb Bjorn Helgaas:
> > >On Fri, Jun 02, 2017 at 11:32:21AM +0200, Christian K=F6nig wrote:
> > >>Hi Bjorn,
> > >>
> > >>sorry for not responding earlier and thanks for picking this thread
> > >>up again.
> > >>
> > >>Am 01.06.2017 um 22:14 schrieb Bjorn Helgaas:
> > >>>[+cc ADMGPU, DRM folks]
> > >>>
> > >>>On Tue, May 09, 2017 at 06:49:07PM +0200, Christian K=F6nig wrote:
> > >>>>[SNIP]
> > >>>>+/**
> > >>>>+ * amdgpu_resize_bar0 - try to resize BAR0
> > >>>>+ *
> > >>>>+ * @adev: amdgpu_device pointer
> > >>>>+ *
> > >>>>+ * Try to resize BAR0 to make all VRAM CPU accessible.
> > >>>>+ */
> > >>>>+void amdgpu_resize_bar0(struct amdgpu_device *adev)
> > >>>>+{
> > >>>>+	u64 space_needed =3D roundup_pow_of_two(adev-
> >mc.real_vram_size);
> > >>>>+	u32 rbar_size =3D order_base_2(((space_needed >> 20) | 1)) -
> 1;
> > >>>>+	u16 cmd;
> > >>>>+	int r;
> > >>>>+
> > >>>>+	/* Free the doorbell mapping, it most likely needs to move as
> well */
> > >>>>+	amdgpu_doorbell_fini(adev);
> > >>>>+	pci_release_resource(adev->pdev, 2);
> > >>>>+
> > >>>>+	/* Disable memory decoding while we change the BAR
> addresses and size */
> > >>>>+	pci_read_config_word(adev->pdev, PCI_COMMAND,
> &cmd);
> > >>>>+	pci_write_config_word(adev->pdev, PCI_COMMAND,
> > >>>>+			      cmd & ~PCI_COMMAND_MEMORY);
> > >>>>+
> > >>>>+	r =3D pci_resize_resource(adev->pdev, 0, rbar_size);
> > >>>>+	if (r =3D=3D -ENOSPC)
> > >>>>+		DRM_INFO("Not enough PCI address space for a
> large BAR.");
> > >>>>+	else if (r && r !=3D -ENOTSUPP)
> > >>>>+		DRM_ERROR("Problem resizing BAR0 (%d).", r);
> > >>>>+
> > >>>>+	pci_write_config_word(adev->pdev, PCI_COMMAND, cmd);
> > >>>>+
> > >>>>+	/* When the doorbell BAR isn't available we have no chance
> of
> > >>>>+	 * using the device.
> > >>>>+	 */
> > >>>>+	BUG_ON(amdgpu_doorbell_init(adev));
>=20
> > >>> From the PCI core perspective, it would be much cleaner to do the B=
AR
> > >>>resize before the driver calls pci_enable_device().  If that could b=
e
> > >>>done, there would be no need for this sort of shutdown/reinit stuff
> > >>>and we wouldn't have to worry about issues like these.  The amdgpu
> > >>>init path is pretty complicated, so I don't know whether this is
> > >>>possible.
> > >>I completely agree on this and it is actually the approach I tried fi=
rst.
> > >>
> > >>There are just two problems with this approach:
> > >>1. When the amdgpu driver is loaded there can already be the VGA
> > >>console, Vesa or EFI driver active for the device and displaying the
> > >>splash screen.
> > >>
> > >>When we resize and most likely relocate the BAR while those drivers
> > >>are active it will certainly cause problems.
> > >>
> > >>What amdgpu does before trying to resize the BAR is kicking out
> > >>other driver and making sure it has exclusive access to the
> > >>hardware.
> > >I don't understand the problem here yet.  If you need to enable the
> > >device, then disable it, resize, and re-enable it, that's fine.
> >
> > The issue is we never enable the device ourself in amdgpu, except
> > for some rare cases during resume.
> >
> > In most of the cases we have to handle this is the primary display
> > device which is enabled by either the BIOS, VGA console, VesaFB or
> > EFIFB. Amdgpu just kicks out whatever driver was responsible for the
> > device previously and takes over.
> >
> > I could of course do the disable/resize/reenable dance, but I would
> > rather want to avoid that.
> >
> > The hardware is most likely already displaying a boot splash and we
> > want to transit to the desktop without any flickering (at least
> > that's the long term goal). Completely disabling the device to do
> > this doesn't sounds like a good idea if we want that.
> >
> > >The important thing I'm looking for is that the resize happens before
> > >a pci_enable_device(), because pci_enable_device() is the sync point
> > >where the PCI core enables resources and makes them available to the
> > >driver.  Drivers know that they can't look at the resources before
> > >that point.  There's a little bit of text about this in [1].
> >
> > Yeah, I understand that. But wouldn't it be sufficient to just
> > disable memory decoding during the resize?
> >
> > I can easily guarantee that the CPU isn't accessing the BAR during
> > the time (we need to do this for changing the memory clocks as
> > well), but I have a bad gut feeling completely turning of the device
> > while we are still displaying stuff.
>=20
> pci_disable_device() doesn't turn off the device; it only disables bus
> mastering (and some of the arch-specific pcibios_disable_device()
> implementations do a little more).  But it's certainly the wrong
> direction -- it disables DMA, which has nothing to do with the BAR
> decoding we're interested in.
>=20
> What if the driver did something like this:
>=20
>   pci_disable_decoding(dev, IORESOURCE_MEM);
>   pci_release_resource(dev, 2);
>   pci_resize_bar(dev, bar, size);
>   pci_assign_resources(dev);
>   pci_enable_decoding(dev, IORESOURCE_MEM);
>=20
> That would require adding pci_enable/disable_decoding() to the driver
> API, along with the requirement that the driver discard and remap
> some resources after pci_enable_decoding().  I think
> pci_enable_decoding() would look much like the existing
> pci_enable_resources() except taking a resource type instead of a
> bitmask.
>=20
> I would *prefer* if we released and reassigned all resources, because
> then the core has complete flexibility to move things around, and it's
> easy to document that pci_assign_resources() means you have to
> reread/remap everything.
>=20
> If the driver only releases specified BARs, the pci_assign_resources()
> interface becomes "you need to reread/remap the BAR you resized, plus
> any other BARs you released".  It's a little more complicated to
> describe and more dependent on previous driver actions.
>=20
> But releasing only the specified BAR will make your life easier and
> help with the fact that multiple drivers might be using the same BAR
> (I have to raise my eyebrows at that), so I think I'm OK with it.  And

It's pretty standard on SoCs.  The kernel has the mfd infrastructure to sup=
port it.

Alex

> it would also side-step the "can't restore previous state" problem.
>=20
> It's an "interesting" asymmetry that pci_enable_device() turns on BAR
> decoding but doesn't touch bus mastering, while pci_disable_device()
> turns off bus mastering but doesn't touch BAR decoding.
>=20
> Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
  2017-06-06 23:10             ` Bjorn Helgaas
@ 2017-06-09  8:59                   ` Christian König
       [not found]               ` <20170606231019.GC12672-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
  1 sibling, 0 replies; 23+ messages in thread
From: Christian König @ 2017-06-09  8:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: David Airlie, linux-pci-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	platform-driver-x86-u79uwXL29TY76Z2rM5mHXA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Alex Deucher

Am 07.06.2017 um 01:10 schrieb Bjorn Helgaas:
> [SNIP]
> What if the driver did something like this:
>
>    pci_disable_decoding(dev, IORESOURCE_MEM);
>    pci_release_resource(dev, 2);
>    pci_resize_bar(dev, bar, size);
>    pci_assign_resources(dev);
>    pci_enable_decoding(dev, IORESOURCE_MEM);
>
> That would require adding pci_enable/disable_decoding() to the driver
> API, along with the requirement that the driver discard and remap
> some resources after pci_enable_decoding().  I think
> pci_enable_decoding() would look much like the existing
> pci_enable_resources() except taking a resource type instead of a
> bitmask.

This is pretty close to what we already do. I'm going to send out an 
updated version of the patch in a minute.

One difference that I still have in this patch is
> +       pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
> +       pci_write_config_word(adev->pdev, PCI_COMMAND,
> +                             cmd & ~PCI_COMMAND_MEMORY);
in the driver instead of "pci_disable_decoding(dev, IORESOURCE_MEM);"

I thought that introducing a new interface for this two lines would be 
overkill, but if you find it cleaner to add the new interface I can 
change that.

> I would *prefer* if we released and reassigned all resources, because
> then the core has complete flexibility to move things around, and it's
> easy to document that pci_assign_resources() means you have to
> reread/remap everything.
I've tried this and it doesn't work at all, surprisingly the I/O ports 
turned out to be not the first problem I've ran into.

Releasing all resources means that we also try to release the 
0xa000-0xbffff (VGA) and 0xc0000-0xdffff (VBIOS) ranges.

They can be reassigned, but somehow that seems to cause problems later on.

> If the driver only releases specified BARs, the pci_assign_resources()
> interface becomes "you need to reread/remap the BAR you resized, plus
> any other BARs you released".  It's a little more complicated to
> describe and more dependent on previous driver actions.
How about the driver releases all resources it wants to move, including 
the resized and reallocates them after the resize is done?

>
> But releasing only the specified BAR will make your life easier and
> help with the fact that multiple drivers might be using the same BAR
> (I have to raise my eyebrows at that), so I think I'm OK with it.  And
> it would also side-step the "can't restore previous state" problem.
I agree that it is a bit more documentation work to describe how the 
interface works, but it is clearly less problematic during runtime.

Please take a look at the new version of the patches and let me know 
what you think.

Regards,
Christian.

>
> It's an "interesting" asymmetry that pci_enable_device() turns on BAR
> decoding but doesn't touch bus mastering, while pci_disable_device()
> turns off bus mastering but doesn't touch BAR decoding.
>
> Bjorn


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2
@ 2017-06-09  8:59                   ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-06-09  8:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, platform-driver-x86, Alex Deucher, David Airlie,
	amd-gfx, dri-devel

Am 07.06.2017 um 01:10 schrieb Bjorn Helgaas:
> [SNIP]
> What if the driver did something like this:
>
>    pci_disable_decoding(dev, IORESOURCE_MEM);
>    pci_release_resource(dev, 2);
>    pci_resize_bar(dev, bar, size);
>    pci_assign_resources(dev);
>    pci_enable_decoding(dev, IORESOURCE_MEM);
>
> That would require adding pci_enable/disable_decoding() to the driver
> API, along with the requirement that the driver discard and remap
> some resources after pci_enable_decoding().  I think
> pci_enable_decoding() would look much like the existing
> pci_enable_resources() except taking a resource type instead of a
> bitmask.

This is pretty close to what we already do. I'm going to send out an 
updated version of the patch in a minute.

One difference that I still have in this patch is
> +       pci_read_config_word(adev->pdev, PCI_COMMAND, &cmd);
> +       pci_write_config_word(adev->pdev, PCI_COMMAND,
> +                             cmd & ~PCI_COMMAND_MEMORY);
in the driver instead of "pci_disable_decoding(dev, IORESOURCE_MEM);"

I thought that introducing a new interface for this two lines would be 
overkill, but if you find it cleaner to add the new interface I can 
change that.

> I would *prefer* if we released and reassigned all resources, because
> then the core has complete flexibility to move things around, and it's
> easy to document that pci_assign_resources() means you have to
> reread/remap everything.
I've tried this and it doesn't work at all, surprisingly the I/O ports 
turned out to be not the first problem I've ran into.

Releasing all resources means that we also try to release the 
0xa000-0xbffff (VGA) and 0xc0000-0xdffff (VBIOS) ranges.

They can be reassigned, but somehow that seems to cause problems later on.

> If the driver only releases specified BARs, the pci_assign_resources()
> interface becomes "you need to reread/remap the BAR you resized, plus
> any other BARs you released".  It's a little more complicated to
> describe and more dependent on previous driver actions.
How about the driver releases all resources it wants to move, including 
the resized and reallocates them after the resize is done?

>
> But releasing only the specified BAR will make your life easier and
> help with the fact that multiple drivers might be using the same BAR
> (I have to raise my eyebrows at that), so I think I'm OK with it.  And
> it would also side-step the "can't restore previous state" problem.
I agree that it is a bit more documentation work to describe how the 
interface works, but it is clearly less problematic during runtime.

Please take a look at the new version of the patches and let me know 
what you think.

Regards,
Christian.

>
> It's an "interesting" asymmetry that pci_enable_device() turns on BAR
> decoding but doesn't touch bus mastering, while pci_disable_device()
> turns off bus mastering but doesn't touch BAR decoding.
>
> Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-06-09  8:59 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-09 16:49 Resizeable PCI BAR support v6 Christian König
2017-05-09 16:49 ` [PATCH v6 1/5] PCI: add a define for the PCI resource type mask v2 Christian König
2017-05-09 16:49 ` [PATCH v6 2/5] PCI: add resizeable BAR infrastructure v5 Christian König
2017-05-09 16:49 ` [PATCH v6 3/5] PCI: add functionality for resizing resources v5 Christian König
2017-05-11 11:56   ` Andy Shevchenko
2017-05-11 11:56     ` Andy Shevchenko
2017-05-09 16:49 ` [PATCH v6 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v3 Christian König
2017-05-09 16:49 ` [PATCH v6 5/5] drm/amdgpu: resize VRAM BAR for CPU access v2 Christian König
2017-05-11 12:00   ` Andy Shevchenko
2017-05-11 12:00     ` Andy Shevchenko
2017-06-01 20:14   ` Bjorn Helgaas
2017-06-01 20:14     ` Bjorn Helgaas
2017-06-02  9:32     ` Christian König
2017-06-02 20:26       ` Bjorn Helgaas
     [not found]         ` <20170602202631.GA1452-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
2017-06-06 11:51           ` Christian König
2017-06-06 11:51             ` Christian König
     [not found]             ` <779a883a-265a-ae04-b0cd-8cb3599f0dc0-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-06-06 13:53               ` Alex Deucher
2017-06-06 13:53                 ` Alex Deucher
2017-06-06 23:10             ` Bjorn Helgaas
2017-06-07  3:50               ` Deucher, Alexander
2017-06-07  3:50                 ` Deucher, Alexander
     [not found]               ` <20170606231019.GC12672-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
2017-06-09  8:59                 ` Christian König
2017-06-09  8:59                   ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.