All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE
@ 2017-04-17 21:36 Bjorn Helgaas
  2017-04-17 21:36 ` [PATCH v11 1/7] PCI: Ignore requested alignment for IOV BARs Bjorn Helgaas
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:36 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

This is another iteration on Yongji's series to force MMIO BARs not to
share pages, which allows direct passthrough instead of QEMU emulation.

The only things I really changed were to add some comments and factor out
the code that requests resize.  I also pulled in the IOV patch, which
touches the same code but was posted separately.

These are all on my pci/resource branch, and if all goes well, I plan to
merge these for v4.12.

Changelog v11:
- Include patch to ignore IOV alignment
- Add patches to factor out realignment
- Add comment about "resize" usage
---

Bjorn Helgaas (2):
      PCI: Factor pci_reassigndev_resource_alignment()
      PCI: Don't reassign resources that are already aligned

Yongji Xie (5):
      PCI: Ignore requested alignment for IOV BARs
      PCI: A fix for caculating bridge window's size and alignment
      PCI: Add pcibios_default_alignment() for arch-specific alignment control
      powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
      PCI: Don't resize resources when realigning all devices in system


 arch/powerpc/include/asm/machdep.h        |    2 +
 arch/powerpc/kernel/pci-common.c          |    8 ++
 arch/powerpc/platforms/powernv/pci-ioda.c |    7 ++
 drivers/pci/pci.c                         |  106 ++++++++++++++++++++++-------
 drivers/pci/setup-bus.c                   |    4 +
 5 files changed, 100 insertions(+), 27 deletions(-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v11 1/7] PCI: Ignore requested alignment for IOV BARs
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
@ 2017-04-17 21:36 ` Bjorn Helgaas
  2017-04-17 21:36 ` [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment Bjorn Helgaas
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:36 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

From: Yongji Xie <xyjxie@linux.vnet.ibm.com>

We would call pci_reassigndev_resource_alignment() before
pci_init_capabilities().  So the requested alignment would never work for
IOV BARs.

Furthermore, it's meaningless to request additional alignment for IOV BARs,
the IOV BAR alignment is only determined by the VF BAR size.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 drivers/pci/pci.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7904d02ffdb9..679af2a253ad 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5084,7 +5084,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 	command &= ~PCI_COMMAND_MEMORY;
 	pci_write_config_word(dev, PCI_COMMAND, command);
 
-	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++) {
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
 		r = &dev->resource[i];
 		if (!(r->flags & IORESOURCE_MEM))
 			continue;

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
  2017-04-17 21:36 ` [PATCH v11 1/7] PCI: Ignore requested alignment for IOV BARs Bjorn Helgaas
@ 2017-04-17 21:36 ` Bjorn Helgaas
  2017-04-17 21:45   ` Yinghai Lu
  2017-04-17 21:36 ` [PATCH v11 3/7] PCI: Add pcibios_default_alignment() for arch-specific alignment control Bjorn Helgaas
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:36 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

From: Yongji Xie <elohimes@gmail.com>

In case that one device's alignment is greater than its size,
we may get an incorrect size and alignment for its bus's memory
window in pbus_size_mem(). This patch fixes this case.

Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/setup-bus.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index cb389277df41..958da7db9033 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1066,10 +1066,10 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 				r->flags = 0;
 				continue;
 			}
-			size += r_size;
+			size += max(r_size, align);
 			/* Exclude ranges with size > align from
 			   calculation of the alignment. */
-			if (r_size == align)
+			if (r_size <= align)
 				aligns[order] += align;
 			if (order > max_order)
 				max_order = order;

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 3/7] PCI: Add pcibios_default_alignment() for arch-specific alignment control
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
  2017-04-17 21:36 ` [PATCH v11 1/7] PCI: Ignore requested alignment for IOV BARs Bjorn Helgaas
  2017-04-17 21:36 ` [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment Bjorn Helgaas
@ 2017-04-17 21:36 ` Bjorn Helgaas
  2017-04-17 21:36 ` [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned Bjorn Helgaas
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:36 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

From: Yongji Xie <elohimes@gmail.com>

When VFIO passes through a PCI device to a guest, it does not allow
the guest to mmap BARs that are smaller than PAGE_SIZE unless it
can reserve the rest of the page (see vfio_pci_probe_mmaps()). This
is because a page might contain several small BARs for unrelated
devices and a guest should not be able to access all of them.

VFIO emulates guest accesses to non-mappable BARs, which is functional
but slow. On systems with large page sizes, e.g., PowerNV with 64K pages,
BARs are more likely to share a page and performance is more likely to
be a problem.

Add a weak function to set default alignment for all PCI devices.
An arch can override it to force the PCI core to place memory BARs on
their own pages.

Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 679af2a253ad..a2d1f144c94f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4947,6 +4947,11 @@ void pci_ignore_hotplug(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_ignore_hotplug);
 
+resource_size_t __weak pcibios_default_alignment(struct pci_dev *dev)
+{
+	return 0;
+}
+
 #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
 static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
 static DEFINE_SPINLOCK(resource_alignment_lock);
@@ -4962,14 +4967,15 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 {
 	int seg, bus, slot, func, align_order, count;
 	unsigned short vendor, device, subsystem_vendor, subsystem_device;
-	resource_size_t align = 0;
+	resource_size_t align = pcibios_default_alignment(dev);
 	char *p;
 
 	spin_lock(&resource_alignment_lock);
 	p = resource_alignment_param;
-	if (!*p)
+	if (!*p && !align)
 		goto out;
 	if (pci_has_flag(PCI_PROBE_ONLY)) {
+		align = 0;
 		pr_info_once("PCI: Ignoring requested alignments (PCI_PROBE_ONLY)\n");
 		goto out;
 	}

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2017-04-17 21:36 ` [PATCH v11 3/7] PCI: Add pcibios_default_alignment() for arch-specific alignment control Bjorn Helgaas
@ 2017-04-17 21:36 ` Bjorn Helgaas
  2017-04-18  5:30   ` Michael Ellerman
  2017-04-18 13:53   ` Bjorn Helgaas
  2017-04-17 21:37 ` [PATCH v11 5/7] PCI: Factor pci_reassigndev_resource_alignment() Bjorn Helgaas
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:36 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

From: Yongji Xie <elohimes@gmail.com>

This overrides pcibios_default_alignment() to set default alignment
to PAGE_SIZE for all PCI devices on PowerNV platform. Thus sub-page
BARs would not share a page and could be mapped into guest when VFIO
passthrough them.

Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 arch/powerpc/include/asm/machdep.h        |    2 ++
 arch/powerpc/kernel/pci-common.c          |    8 ++++++++
 arch/powerpc/platforms/powernv/pci-ioda.c |    7 +++++++
 3 files changed, 17 insertions(+)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 5011b69107a7..a82c1925ff43 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -173,6 +173,8 @@ struct machdep_calls {
 	/* Called after scan and before resource survey */
 	void (*pcibios_fixup_phb)(struct pci_controller *hose);
 
+	resource_size_t (*pcibios_default_alignment)(struct pci_dev *);
+
 #ifdef CONFIG_PCI_IOV
 	void (*pcibios_fixup_sriov)(struct pci_dev *pdev);
 	resource_size_t (*pcibios_iov_resource_alignment)(struct pci_dev *, int resno);
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index ffda24a38dda..ceda57484e8e 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -233,6 +233,14 @@ void pcibios_reset_secondary_bus(struct pci_dev *dev)
 	pci_reset_secondary_bus(dev);
 }
 
+resource_size_t pcibios_default_alignment(struct pci_dev *pdev)
+{
+	if (ppc_md.pcibios_default_alignment)
+		return ppc_md.pcibios_default_alignment(pdev);
+
+	return 0;
+}
+
 #ifdef CONFIG_PCI_IOV
 resource_size_t pcibios_iov_resource_alignment(struct pci_dev *pdev, int resno)
 {
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 6901a06da2f9..b724487cbd0f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3287,6 +3287,11 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
 	}
 }
 
+static resource_size_t pnv_pci_default_alignment(struct pci_dev *pdev)
+{
+	return PAGE_SIZE;
+}
+
 #ifdef CONFIG_PCI_IOV
 static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
@@ -3820,6 +3825,8 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 		hose->controller_ops = pnv_pci_ioda_controller_ops;
 	}
 
+	ppc_md.pcibios_default_alignment = pnv_pci_default_alignment;
+
 #ifdef CONFIG_PCI_IOV
 	ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov_resources;
 	ppc_md.pcibios_iov_resource_alignment = pnv_pci_iov_resource_alignment;

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 5/7] PCI: Factor pci_reassigndev_resource_alignment()
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2017-04-17 21:36 ` [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned Bjorn Helgaas
@ 2017-04-17 21:37 ` Bjorn Helgaas
  2017-04-17 21:37 ` [PATCH v11 6/7] PCI: Don't reassign resources that are already aligned Bjorn Helgaas
  2017-04-17 21:37 ` [PATCH v11 7/7] PCI: Don't resize resources when realigning all devices in system Bjorn Helgaas
  6 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:37 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

Pull the BAR size adjustment out into a new function,
pci_request_resource_alignment(), and add a comment about how and why we
increase the resource size and alignment.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   67 ++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a2d1f144c94f..17746f778ca2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5049,6 +5049,48 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 	return align;
 }
 
+static void pci_request_resource_alignment(struct pci_dev *dev, int bar,
+					   resource_size_t align)
+{
+	struct resource *r = &dev->resource[bar];
+	resource_size_t size;
+
+	if (!(r->flags & IORESOURCE_MEM))
+		return;
+
+	if (r->flags & IORESOURCE_PCI_FIXED) {
+		dev_info(&dev->dev, "BAR%d %pR: ignoring requested alignment %#llx\n",
+			 bar, r, (unsigned long long)align);
+		return;
+	}
+
+	size = resource_size(r);
+	if (size < align) {
+
+		/*
+		 * Increase the size of the resource.  BARs are aligned on
+		 * their size, so when we reallocate space for this
+		 * resource, we'll allocate it with the larger alignment.
+		 * It also prevents assignment of any other BARs inside the
+		 * size.  If we're requesting page alignment, this means no
+		 * other BARs will share the page.
+		 *
+		 * This makes the resource larger than the hardware BAR,
+		 * which may break drivers that compute things based on the
+		 * resource size, e.g., to find registers at a fixed offset
+		 * before the end of the BAR.  We hope users don't request
+		 * alignment for such devices.
+		 */
+		size = align;
+		dev_info(&dev->dev, "BAR%d %pR: requesting alignment to %#llx\n",
+			 bar, r, (unsigned long long)align);
+
+	}
+	r->flags |= IORESOURCE_UNSET;
+	r->end = size - 1;
+	r->start = 0;
+}
+
 /*
  * This function disables memory decoding and releases memory resources
  * of the device specified by kernel's boot parameter 'pci=resource_alignment='.
@@ -5090,28 +5132,11 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 	command &= ~PCI_COMMAND_MEMORY;
 	pci_write_config_word(dev, PCI_COMMAND, command);
 
-	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
-		r = &dev->resource[i];
-		if (!(r->flags & IORESOURCE_MEM))
-			continue;
-		if (r->flags & IORESOURCE_PCI_FIXED) {
-			dev_info(&dev->dev, "Ignoring requested alignment for BAR%d: %pR\n",
-				i, r);
-			continue;
-		}
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++)
+		pci_request_resource_alignment(dev, i, align);
 
-		size = resource_size(r);
-		if (size < align) {
-			size = align;
-			dev_info(&dev->dev,
-				"Rounding up size of resource #%d to %#llx.\n",
-				i, (unsigned long long)size);
-		}
-		r->flags |= IORESOURCE_UNSET;
-		r->end = size - 1;
-		r->start = 0;
-	}
-	/* Need to disable bridge's resource window,
+	/*
+	 * Need to disable bridge's resource window,
 	 * to enable the kernel to reassign new resource
 	 * window later on.
 	 */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 6/7] PCI: Don't reassign resources that are already aligned
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
                   ` (4 preceding siblings ...)
  2017-04-17 21:37 ` [PATCH v11 5/7] PCI: Factor pci_reassigndev_resource_alignment() Bjorn Helgaas
@ 2017-04-17 21:37 ` Bjorn Helgaas
  2017-04-17 21:37 ` [PATCH v11 7/7] PCI: Don't resize resources when realigning all devices in system Bjorn Helgaas
  6 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:37 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

The "pci=resource_alignment=" kernel argument designates devices for which
we want alignment greater than is required by the PCI specs.  Previously we
set IORESOURCE_UNSET for every MEM resource of those devices, even if the
resource was *already* sufficiently aligned.

If a resource is already sufficiently aligned, leave it alone and don't try
to reassign it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   40 +++++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 17746f778ca2..d92b80837cfa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5065,30 +5065,28 @@ static void pci_request_resource_alignment(struct pci_dev *dev, int bar,
 	}
 
 	size = resource_size(r);
-	if (size < align) {
+	if (size >= align)
+		return;
 
-		/*
-		 * Increase the size of the resource.  BARs are aligned on
-		 * their size, so when we reallocate space for this
-		 * resource, we'll allocate it with the larger alignment.
-		 * It also prevents assignment of any other BARs inside the
-		 * size.  If we're requesting page alignment, this means no
-		 * other BARs will share the page.
-		 *
-		 * This makes the resource larger than the hardware BAR,
-		 * which may break drivers that compute things based on the
-		 * resource size, e.g., to find registers at a fixed offset
-		 * before the end of the BAR.  We hope users don't request
-		 * alignment for such devices.
-		 */
-		size = align;
-		dev_info(&dev->dev, "BAR%d %pR: requesting alignment to %#llx\n",
-			 bar, r, (unsigned long long)align);
+	/*
+	 * Increase the size of the resource.  BARs are aligned on their
+	 * size, so when we reallocate space for this resource, we'll
+	 * allocate it with the larger alignment.  It also prevents
+	 * assignment of any other BARs inside the size.  If we're
+	 * requesting page alignment, this means no other BARs will share
+	 * the page.
+	 *
+	 * This makes the resource larger than the hardware BAR, which may
+	 * break drivers that compute things based on the resource size,
+	 * e.g., to find registers at a fixed offset before the end of the
+	 * BAR.  We hope users don't request alignment for such devices.
+	 */
+	dev_info(&dev->dev, "BAR%d %pR: requesting alignment to %#llx\n",
+		 bar, r, (unsigned long long)align);
 
-	}
-	r->flags |= IORESOURCE_UNSET;
-	r->end = size - 1;
 	r->start = 0;
+	r->end = align - 1;
+	r->flags |= IORESOURCE_UNSET;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v11 7/7] PCI: Don't resize resources when realigning all devices in system
  2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
                   ` (5 preceding siblings ...)
  2017-04-17 21:37 ` [PATCH v11 6/7] PCI: Don't reassign resources that are already aligned Bjorn Helgaas
@ 2017-04-17 21:37 ` Bjorn Helgaas
  6 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-17 21:37 UTC (permalink / raw)
  To: Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

From: Yongji Xie <elohimes@gmail.com>

The "pci=resource_alignment" argument aligns BARs of designated devices by
artificially increasing their size.  Increasing the size increases the
alignment and prevents other resources from being assigned in the same
alignment region, e.g., in the same page, but it can break drivers that use
the BAR size to locate things, e.g., ilo_map_device() does this:

  off = pci_resource_len(pdev, bar) - 0x2000;

The new pcibios_default_alignment() interface allows an arch to request
that *all* BARs in the system be aligned to a larger size.  In this case,
we don't need to artificially increase the resource size because we know
every BAR of every device will be realigned, so nothing will share the same
alignment region.

Use IORESOURCE_STARTALIGN to request realignment of PCI BARs when we know
we're realigning all BARs in the system.

[bhelgaas: comment, changelog]
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c |   59 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d92b80837cfa..bd31ecdb2257 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4959,11 +4959,13 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
 /**
  * pci_specified_resource_alignment - get resource alignment specified by user.
  * @dev: the PCI device to get
+ * @resize: whether or not to change resources' size when reassigning alignment
  *
  * RETURNS: Resource alignment if it is specified.
  *          Zero if it is not specified.
  */
-static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
+static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
+							bool *resize)
 {
 	int seg, bus, slot, func, align_order, count;
 	unsigned short vendor, device, subsystem_vendor, subsystem_device;
@@ -5005,6 +5007,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 				(!device || (device == dev->device)) &&
 				(!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) &&
 				(!subsystem_device || (subsystem_device == dev->subsystem_device))) {
+				*resize = true;
 				if (align_order == -1)
 					align = PAGE_SIZE;
 				else
@@ -5030,6 +5033,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 				bus == dev->bus->number &&
 				slot == PCI_SLOT(dev->devfn) &&
 				func == PCI_FUNC(dev->devfn)) {
+				*resize = true;
 				if (align_order == -1)
 					align = PAGE_SIZE;
 				else
@@ -5050,7 +5054,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 }
 
 static void pci_request_resource_alignment(struct pci_dev *dev, int bar,
-					   resource_size_t align)
+					   resource_size_t align, bool resize)
 {
 	struct resource *r = &dev->resource[bar];
 	resource_size_t size;
@@ -5069,23 +5073,45 @@ static void pci_request_resource_alignment(struct pci_dev *dev, int bar,
 		return;
 
 	/*
-	 * Increase the size of the resource.  BARs are aligned on their
-	 * size, so when we reallocate space for this resource, we'll
-	 * allocate it with the larger alignment.  It also prevents
-	 * assignment of any other BARs inside the size.  If we're
-	 * requesting page alignment, this means no other BARs will share
-	 * the page.
+	 * Increase the alignment of the resource.  There are two ways we
+	 * can do this:
 	 *
-	 * This makes the resource larger than the hardware BAR, which may
-	 * break drivers that compute things based on the resource size,
-	 * e.g., to find registers at a fixed offset before the end of the
-	 * BAR.  We hope users don't request alignment for such devices.
+	 * 1) Increase the size of the resource.  BARs are aligned on their
+	 *    size, so when we reallocate space for this resource, we'll
+	 *    allocate it with the larger alignment.  This also prevents
+	 *    assignment of any other BARs inside the alignment region, so
+	 *    if we're requesting page alignment, this means no other BARs
+	 *    will share the page.
+	 *
+	 *    The disadvantage is that this makes the resource larger than
+	 *    the hardware BAR, which may break drivers that compute things
+	 *    based on the resource size, e.g., to find registers at a
+	 *    fixed offset before the end of the BAR.
+	 *
+	 * 2) Retain the resource size, but use IORESOURCE_STARTALIGN and
+	 *    set r->start to the desired alignment.  By itself this
+	 *    doesn't prevent other BARs being put inside the alignment
+	 *    region, but if we realign *every* resource of every device in
+	 *    the system, none of them will share an alignment region.
+	 *
+	 * When the user has requested alignment for only some devices via
+	 * the "pci=resource_alignment" argument, "resize" is true and we
+	 * use the first method.  Otherwise we assume we're aligning all
+	 * devices and we use the second.
 	 */
+
 	dev_info(&dev->dev, "BAR%d %pR: requesting alignment to %#llx\n",
 		 bar, r, (unsigned long long)align);
 
-	r->start = 0;
-	r->end = align - 1;
+	if (resize) {
+		r->start = 0;
+		r->end = align - 1;
+	} else {
+		r->flags &= ~IORESOURCE_SIZEALIGN;
+		r->flags |= IORESOURCE_STARTALIGN;
+		r->start = align;
+		r->end = r->start + size - 1;
+	}
 	r->flags |= IORESOURCE_UNSET;
 }
 
@@ -5102,6 +5128,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 	struct resource *r;
 	resource_size_t align, size;
 	u16 command;
+	bool resize = false;
 
 	/*
 	 * VF BARs are read-only zero according to SR-IOV spec r1.1, sec
@@ -5113,7 +5140,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 		return;
 
 	/* check if specified PCI is target device to reassign */
-	align = pci_specified_resource_alignment(dev);
+	align = pci_specified_resource_alignment(dev, &resize);
 	if (!align)
 		return;
 
@@ -5131,7 +5158,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 	pci_write_config_word(dev, PCI_COMMAND, command);
 
 	for (i = 0; i <= PCI_ROM_RESOURCE; i++)
-		pci_request_resource_alignment(dev, i, align);
+		pci_request_resource_alignment(dev, i, align, resize);
 
 	/*
 	 * Need to disable bridge's resource window,

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment
  2017-04-17 21:36 ` [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment Bjorn Helgaas
@ 2017-04-17 21:45   ` Yinghai Lu
  2017-04-18  2:16     ` Yongji Xie
  0 siblings, 1 reply; 15+ messages in thread
From: Yinghai Lu @ 2017-04-17 21:45 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Yongji Xie, Li Zhong, Alexey Kardashevskiy, linux-pci,
	Gavin Shan, Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

On Mon, Apr 17, 2017 at 2:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> From: Yongji Xie <elohimes@gmail.com>
>
> In case that one device's alignment is greater than its size,
> we may get an incorrect size and alignment for its bus's memory
> window in pbus_size_mem(). This patch fixes this case.

In which case, that device alignment is not same as size?
or powerpc need small size, but alignment is PAGE_SIZE?

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment
  2017-04-17 21:45   ` Yinghai Lu
@ 2017-04-18  2:16     ` Yongji Xie
  0 siblings, 0 replies; 15+ messages in thread
From: Yongji Xie @ 2017-04-18  2:16 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Bjorn Helgaas, Li Zhong, Alexey Kardashevskiy, linux-pci,
	Gavin Shan, Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

Hi Yinghai,

On 18 April 2017 at 05:45, Yinghai Lu <yinghai@kernel.org> wrote:
> On Mon, Apr 17, 2017 at 2:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> From: Yongji Xie <elohimes@gmail.com>
>>
>> In case that one device's alignment is greater than its size,
>> we may get an incorrect size and alignment for its bus's memory
>> window in pbus_size_mem(). This patch fixes this case.
>
> In which case, that device alignment is not same as size?
> or powerpc need small size, but alignment is PAGE_SIZE?
>

Yes, powerpc may have some small size (smaller than PAGE_SIZE)
devices whose alignment would be enforced to be PAGE_SIZE
by pcibios_default_alignment() (in patch 4).

Thanks,
Yongji

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  2017-04-17 21:36 ` [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned Bjorn Helgaas
@ 2017-04-18  5:30   ` Michael Ellerman
  2017-04-18 13:53   ` Bjorn Helgaas
  1 sibling, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2017-04-18  5:30 UTC (permalink / raw)
  To: Bjorn Helgaas, Yongji Xie
  Cc: zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Benjamin Herrenschmidt,
	linuxppc-dev

Bjorn Helgaas <bhelgaas@google.com> writes:

> From: Yongji Xie <elohimes@gmail.com>
>
> This overrides pcibios_default_alignment() to set default alignment
> to PAGE_SIZE for all PCI devices on PowerNV platform. Thus sub-page
> BARs would not share a page and could be mapped into guest when VFIO
> passthrough them.
>
> Signed-off-by: Yongji Xie <elohimes@gmail.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  arch/powerpc/include/asm/machdep.h        |    2 ++
>  arch/powerpc/kernel/pci-common.c          |    8 ++++++++
>  arch/powerpc/platforms/powernv/pci-ioda.c |    7 +++++++

This looks fine to me, here's an ack, but don't feel you have to rebase
to add it:

Acked-by: Michael Ellerman <mpe@ellerman.id.au>


cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  2017-04-17 21:36 ` [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned Bjorn Helgaas
  2017-04-18  5:30   ` Michael Ellerman
@ 2017-04-18 13:53   ` Bjorn Helgaas
  2017-04-19  1:47     ` Michael Ellerman
  1 sibling, 1 reply; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-18 13:53 UTC (permalink / raw)
  To: Yongji Xie
  Cc: Li Zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Michael Ellerman,
	Benjamin Herrenschmidt, linuxppc-dev

On Mon, Apr 17, 2017 at 4:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> From: Yongji Xie <elohimes@gmail.com>
>
> This overrides pcibios_default_alignment() to set default alignment
> to PAGE_SIZE for all PCI devices on PowerNV platform. Thus sub-page
> BARs would not share a page and could be mapped into guest when VFIO
> passthrough them.
>
> Signed-off-by: Yongji Xie <elohimes@gmail.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  arch/powerpc/include/asm/machdep.h        |    2 ++
>  arch/powerpc/kernel/pci-common.c          |    8 ++++++++
>  arch/powerpc/platforms/powernv/pci-ioda.c |    7 +++++++
>  3 files changed, 17 insertions(+)

> +resource_size_t pcibios_default_alignment(struct pci_dev *pdev)
> +{
> +       if (ppc_md.pcibios_default_alignment)
> +               return ppc_md.pcibios_default_alignment(pdev);
> +
> +       return 0;
> +}
> +
>  #ifdef CONFIG_PCI_IOV
>  resource_size_t pcibios_iov_resource_alignment(struct pci_dev *pdev, int resno)
>  {
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 6901a06da2f9..b724487cbd0f 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3287,6 +3287,11 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>         }
>  }
>
> +static resource_size_t pnv_pci_default_alignment(struct pci_dev *pdev)
> +{
> +       return PAGE_SIZE;
> +}

Is it necessary that pcibios_default_alignment() take a pci_dev
pointer?  I'd like this better if it were:

  resource_size_t pcibios_default_alignment(void) { ... }

because the last patch relies on the assumption that all resources of
*all* devices will be realigned to the same alignment.

Bjorn

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  2017-04-18 13:53   ` Bjorn Helgaas
@ 2017-04-19  1:47     ` Michael Ellerman
  2017-04-19  2:12       ` Yongji Xie
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2017-04-19  1:47 UTC (permalink / raw)
  To: Bjorn Helgaas, Yongji Xie
  Cc: Li Zhong, Alexey Kardashevskiy, linux-pci, Gavin Shan,
	Alex Williamson, Paul Mackerras, Benjamin Herrenschmidt,
	linuxppc-dev

Bjorn Helgaas <bhelgaas@google.com> writes:

> On Mon, Apr 17, 2017 at 4:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> From: Yongji Xie <elohimes@gmail.com>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 6901a06da2f9..b724487cbd0f 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -3287,6 +3287,11 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>>         }
>>  }
>>
>> +static resource_size_t pnv_pci_default_alignment(struct pci_dev *pdev)
>> +{
>> +       return PAGE_SIZE;
>> +}
>
> Is it necessary that pcibios_default_alignment() take a pci_dev
> pointer?

It's not necessary given the current implementation, obviously.

But it did strike me as a good idea to pass it in case we ever want to
do anything device specific in future.

> I'd like this better if it were:
>
>   resource_size_t pcibios_default_alignment(void) { ... }
>
> because the last patch relies on the assumption that all resources of
> *all* devices will be realigned to the same alignment.

But I guess that precludes doing anything device specific, at least
without further changes. So in that case it would be better if the API
didn't include the pci_dev.

Hopefully Yongji can confirm that there were no plans to use the
pci_dev in future patches.

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  2017-04-19  1:47     ` Michael Ellerman
@ 2017-04-19  2:12       ` Yongji Xie
  2017-04-19 17:53         ` Bjorn Helgaas
  0 siblings, 1 reply; 15+ messages in thread
From: Yongji Xie @ 2017-04-19  2:12 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Bjorn Helgaas, Li Zhong, Alexey Kardashevskiy, linux-pci,
	Gavin Shan, Alex Williamson, Paul Mackerras,
	Benjamin Herrenschmidt, linuxppc-dev

On 19 April 2017 at 09:47, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Bjorn Helgaas <bhelgaas@google.com> writes:
>
>> On Mon, Apr 17, 2017 at 4:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> From: Yongji Xie <elohimes@gmail.com>
>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> index 6901a06da2f9..b724487cbd0f 100644
>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> @@ -3287,6 +3287,11 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>>>         }
>>>  }
>>>
>>> +static resource_size_t pnv_pci_default_alignment(struct pci_dev *pdev)
>>> +{
>>> +       return PAGE_SIZE;
>>> +}
>>
>> Is it necessary that pcibios_default_alignment() take a pci_dev
>> pointer?
>
> It's not necessary given the current implementation, obviously.
>
> But it did strike me as a good idea to pass it in case we ever want to
> do anything device specific in future.
>
>> I'd like this better if it were:
>>
>>   resource_size_t pcibios_default_alignment(void) { ... }
>>
>> because the last patch relies on the assumption that all resources of
>> *all* devices will be realigned to the same alignment.
>
> But I guess that precludes doing anything device specific, at least
> without further changes. So in that case it would be better if the API
> didn't include the pci_dev.
>
> Hopefully Yongji can confirm that there were no plans to use the
> pci_dev in future patches.
>

Yes, seems like pci_dev pointer doesn't match the assumption
that all resources will be realigned to the same alignment. It's OK
to me to remove it.

Thanks,
Yongji

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  2017-04-19  2:12       ` Yongji Xie
@ 2017-04-19 17:53         ` Bjorn Helgaas
  0 siblings, 0 replies; 15+ messages in thread
From: Bjorn Helgaas @ 2017-04-19 17:53 UTC (permalink / raw)
  To: Yongji Xie
  Cc: Michael Ellerman, Li Zhong, Alexey Kardashevskiy, linux-pci,
	Gavin Shan, Alex Williamson, Paul Mackerras,
	Benjamin Herrenschmidt, linuxppc-dev

On Tue, Apr 18, 2017 at 9:12 PM, Yongji Xie <elohimes@gmail.com> wrote:
> On 19 April 2017 at 09:47, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Bjorn Helgaas <bhelgaas@google.com> writes:
>>
>>> On Mon, Apr 17, 2017 at 4:36 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>> From: Yongji Xie <elohimes@gmail.com>
>>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>> index 6901a06da2f9..b724487cbd0f 100644
>>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>> @@ -3287,6 +3287,11 @@ static void pnv_pci_setup_bridge(struct pci_bus *bus, unsigned long type)
>>>>         }
>>>>  }
>>>>
>>>> +static resource_size_t pnv_pci_default_alignment(struct pci_dev *pdev)
>>>> +{
>>>> +       return PAGE_SIZE;
>>>> +}
>>>
>>> Is it necessary that pcibios_default_alignment() take a pci_dev
>>> pointer?
>>
>> It's not necessary given the current implementation, obviously.
>>
>> But it did strike me as a good idea to pass it in case we ever want to
>> do anything device specific in future.
>>
>>> I'd like this better if it were:
>>>
>>>   resource_size_t pcibios_default_alignment(void) { ... }
>>>
>>> because the last patch relies on the assumption that all resources of
>>> *all* devices will be realigned to the same alignment.
>>
>> But I guess that precludes doing anything device specific, at least
>> without further changes. So in that case it would be better if the API
>> didn't include the pci_dev.
>>
>> Hopefully Yongji can confirm that there were no plans to use the
>> pci_dev in future patches.
>>
>
> Yes, seems like pci_dev pointer doesn't match the assumption
> that all resources will be realigned to the same alignment. It's OK
> to me to remove it.

I made this change (removing the pci_dev parameter) on my pci/resource branch.

Bjorn

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-04-19 17:54 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-17 21:36 [PATCH v11 0/7] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Bjorn Helgaas
2017-04-17 21:36 ` [PATCH v11 1/7] PCI: Ignore requested alignment for IOV BARs Bjorn Helgaas
2017-04-17 21:36 ` [PATCH v11 2/7] PCI: A fix for caculating bridge window's size and alignment Bjorn Helgaas
2017-04-17 21:45   ` Yinghai Lu
2017-04-18  2:16     ` Yongji Xie
2017-04-17 21:36 ` [PATCH v11 3/7] PCI: Add pcibios_default_alignment() for arch-specific alignment control Bjorn Helgaas
2017-04-17 21:36 ` [PATCH v11 4/7] powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned Bjorn Helgaas
2017-04-18  5:30   ` Michael Ellerman
2017-04-18 13:53   ` Bjorn Helgaas
2017-04-19  1:47     ` Michael Ellerman
2017-04-19  2:12       ` Yongji Xie
2017-04-19 17:53         ` Bjorn Helgaas
2017-04-17 21:37 ` [PATCH v11 5/7] PCI: Factor pci_reassigndev_resource_alignment() Bjorn Helgaas
2017-04-17 21:37 ` [PATCH v11 6/7] PCI: Don't reassign resources that are already aligned Bjorn Helgaas
2017-04-17 21:37 ` [PATCH v11 7/7] PCI: Don't resize resources when realigning all devices in system Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.