* [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF
@ 2015-07-17  0:14 Gavin Shan
  2015-07-17  0:14 ` [PATCH 1/2] powerpc/powernv: Fix alignment for IOV BAR Gavin Shan
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Gavin Shan @ 2015-07-17  0:14 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: benh, paulus, aik, Gavin Shan

When the VF BAR size is 128MB or bigger, the IOV BAR is extended to cover
the maximal number of VFs supported by the PF, not 256. Also, one of the
PHB's M64 BARs is picked to cover the VF BARs of 4 contiguous VFs, but that
M64 BAR is configured as being owned by a single PE. As a result, those 4
VFs have 4 separate PEs from the perspective of PCI config or DMA, but a
single shared PE from the perspective of MMIO. Once we have a compound PE,
the 4 VFs included in it can't be passed to separate guests with the VFIO
infrastructure.

The above gate (128MB) was chosen based on the assumption that one IOV BAR
can consume 1/4 of the PHB's M64 window, which is 16GB. However, it can
consume as much as half of the window (32GB) when the PF sits behind the
root port. Accordingly, the gate can be doubled to 256MB in order to avoid
compound PEs as far as we can.
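
The arithmetic behind the two gates can be sketched as below. This is only
an illustration, not code from the series: vf_bar_gate() is a hypothetical
helper, and it assumes a 64GB M64 window, power-of-two VF BAR sizes, and an
IOV BAR expanded to 256 VFs.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: return the smallest
 * power-of-two VF BAR size that no longer fits when the IOV BAR is
 * expanded to 256 VFs and may use only m64_window / divisor bytes.
 */
uint64_t vf_bar_gate(uint64_t m64_window, unsigned int divisor)
{
	uint64_t budget = m64_window / divisor;	/* space one IOV BAR may use */
	uint64_t largest_fit = budget / 256;	/* biggest VF BAR that fits */

	return largest_fit * 2;			/* next power-of-two size */
}
```

With a 64GB window, a divisor of 4 gives a 16GB budget and a 128MB gate,
while a divisor of 2 gives a 32GB budget and a 256MB gate.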


Gavin Shan (2):
  powerpc/powernv: Fix alignment for IOV BAR
  powerpc/powernv: Double VF BAR size for compound PE

 arch/powerpc/platforms/powernv/pci-ioda.c | 56 +++++++++++++++++++++++++------
 1 file changed, 45 insertions(+), 11 deletions(-)

-- 
2.1.0


* [PATCH 1/2] powerpc/powernv: Fix alignment for IOV BAR
  2015-07-17  0:14 [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF Gavin Shan
@ 2015-07-17  0:14 ` Gavin Shan
  2015-07-17  0:14 ` [PATCH 2/2] powerpc/powernv: Double VF BAR size for compound PE Gavin Shan
  2015-07-29 23:06 ` [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF Gavin Shan
  2 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2015-07-17  0:14 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: benh, paulus, aik, Gavin Shan

The IOV BAR is extended to cover either 256 VFs or the number of supported
VFs, and its alignment is the IOV BAR size, which is usually huge and
bigger than the M64 segment size (256MB). That means the IOV BAR is
expected to be assigned to the beginning of the PHB's M64 window, prior to
the other M64 BARs of PCI devices hooked to the PCI bus behind the root
port. Those other M64 BARs actually need the M64 segment size, rather than
the huge IOV BAR size, as the required alignment.

The patch returns the M64 segment size if the IOV BAR size is bigger than
it when the PF sits behind the root port. Otherwise, the IOV BAR size is
returned as before. This saves a lot of M64 space, as much as 16GB in
some cases I observed.
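
For illustration only, the selection made in the diff below can be modelled
in plain C. iov_bar_alignment() and its parameters are hypothetical names
that mirror the min()/max() logic of the patch; they are not kernel
interfaces. Note that in the non-root-port case the diff returns the larger
of the two sizes rather than the bare IOV BAR size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the alignment choice: cap the alignment at the
 * M64 segment size when the PF sits directly behind the root port,
 * otherwise use whichever of the two sizes is larger.
 */
uint64_t iov_bar_alignment(uint64_t iov_bar_size, uint64_t m64_segsize,
			   bool pf_behind_root_port)
{
	if (pf_behind_root_port)
		return iov_bar_size < m64_segsize ? iov_bar_size : m64_segsize;

	return iov_bar_size > m64_segsize ? iov_bar_size : m64_segsize;
}
```

For a 16GB IOV BAR and a 256MB segment, the result is 256MB behind the root
port and 16GB elsewhere.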

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 35 +++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index fdafbac..6ec62b9 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2961,16 +2961,39 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
 {
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
 	struct pci_dn *pdn = pci_get_pdn(pdev);
-	resource_size_t align, iov_align;
+	resource_size_t align;
+	resource_size_t m64_segsz = phb->ioda.m64_segsize;
 
-	iov_align = resource_size(&pdev->resource[resno]);
-	if (iov_align)
-		return iov_align;
+	/*
+	 * When the PF is the only adapter under the PHB, the IOV BAR
+	 * is expected to be assigned prior to any other M64 BARs. Use
+	 * the M64 segment size, which is usually smaller than the IOV
+	 * BAR size, as the alignment to avoid wasting M64 space on
+	 * the alignment required by other M64 BARs.
+	 */
+	align = resource_size(&pdev->resource[resno]);
+	if (align) {
+		if (!pci_is_root_bus(pdev->bus) &&
+		    pci_is_root_bus(pdev->bus->self->bus))
+			align = min(align, m64_segsz);
+		else
+			align = max(align, m64_segsz);
+
+		return align;
+	}
 
 	align = pci_iov_resource_size(pdev, resno);
-	if (pdn->vfs_expanded)
-		return pdn->vfs_expanded * align;
+	if (pdn->vfs_expanded) {
+		align = pdn->vfs_expanded * align;
+		if (!pci_is_root_bus(pdev->bus) &&
+		    pci_is_root_bus(pdev->bus->self->bus))
+			align = min(align, m64_segsz);
+		else
+			align = max(align, m64_segsz);
+	}
 
 	return align;
 }
-- 
2.1.0


* [PATCH 2/2] powerpc/powernv: Double VF BAR size for compound PE
  2015-07-17  0:14 [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF Gavin Shan
  2015-07-17  0:14 ` [PATCH 1/2] powerpc/powernv: Fix alignment for IOV BAR Gavin Shan
@ 2015-07-17  0:14 ` Gavin Shan
  2015-07-17  0:28   ` Gavin Shan
  2015-07-29 23:06 ` [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF Gavin Shan
  2 siblings, 1 reply; 5+ messages in thread
From: Gavin Shan @ 2015-07-17  0:14 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: benh, paulus, aik, Gavin Shan

When the VF BAR size is 128MB or bigger, we extend the corresponding PF's
IOV BAR to cover the total number of VFs supported by the PF. Otherwise,
we extend the PF's IOV BAR to cover 256 VFs. In the former case, we have
to create compound PEs, each of which includes 4 VFs. The 4 VFs included
in a compound PE can't be passed through to different guests, which isn't
good.

The gate (128MB) was chosen based on the assumption that each PHB supports
64GB of M64 space and one PF's IOV BAR can be extended to be as large as
1/4 of that, which is 16GB. However, the IOV BAR can be extended to half
of the PHB's M64 window when the PF sits behind the root port. In that
case, the gate can be enlarged to 256MB to avoid compound PEs as far as
we can.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 6ec62b9..5b2e88f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2721,6 +2721,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	struct resource *res;
 	int i;
 	resource_size_t size;
+	resource_size_t limit;
 	struct pci_dn *pdn;
 	int mul, total_vfs;
 
@@ -2730,6 +2731,18 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	hose = pci_bus_to_host(pdev->bus);
 	phb = hose->private_data;
 
+	/*
+	 * When the PF sits behind the root port, the IOV BAR
+	 * can consume up to half of the PHB's M64 window.
+	 * Otherwise, it can consume at most 1/4 of the PHB's
+	 * M64 window.
+	 */
+	if (!pci_is_root_bus(pdev->bus) &&
+	    pci_is_root_bus(pdev->bus->self->bus))
+		limit = 128;
+	else
+		limit = 256;
+
 	pdn = pci_get_pdn(pdev);
 	pdn->vfs_expanded = 0;
 
@@ -2748,11 +2761,9 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		}
 
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
-
-		/* bigger than 64M */
-		if (size > (1 << 26)) {
-			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size is bigger than 64M, roundup power2\n",
-				 i, res);
+		if (size >= (limit * 0x100000)) {
+			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size is bigger than %lldMB, roundup power2\n",
+				 i, res, limit);
 			pdn->m64_per_iov = M64_PER_IOV;
 			mul = roundup_pow_of_two(total_vfs);
 			break;
-- 
2.1.0


* Re: [PATCH 2/2] powerpc/powernv: Double VF BAR size for compound PE
  2015-07-17  0:14 ` [PATCH 2/2] powerpc/powernv: Double VF BAR size for compound PE Gavin Shan
@ 2015-07-17  0:28   ` Gavin Shan
  0 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2015-07-17  0:28 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev, benh, paulus, aik

On Fri, Jul 17, 2015 at 10:14:43AM +1000, Gavin Shan wrote:
>When the VF BAR size is 128MB or bigger, we extend the corresponding PF's
>IOV BAR to cover the total number of VFs supported by the PF. Otherwise,
>we extend the PF's IOV BAR to cover 256 VFs. In the former case, we have
>to create compound PEs, each of which includes 4 VFs. The 4 VFs included
>in a compound PE can't be passed through to different guests, which isn't
>good.
>
>The gate (128MB) was chosen based on the assumption that each PHB supports
>64GB of M64 space and one PF's IOV BAR can be extended to be as large as
>1/4 of that, which is 16GB. However, the IOV BAR can be extended to half
>of the PHB's M64 window when the PF sits behind the root port. In that
>case, the gate can be enlarged to 256MB to avoid compound PEs as far as
>we can.
>
>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>---
> arch/powerpc/platforms/powernv/pci-ioda.c | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>index 6ec62b9..5b2e88f 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -2721,6 +2721,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
> 	struct resource *res;
> 	int i;
> 	resource_size_t size;
>+	resource_size_t limit;
> 	struct pci_dn *pdn;
> 	int mul, total_vfs;
>
>@@ -2730,6 +2731,18 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
> 	hose = pci_bus_to_host(pdev->bus);
> 	phb = hose->private_data;
>
>+	/*
>+	 * When the PF sits behind the root port, the IOV BAR
>+	 * can consume up to half of the PHB's M64 window.
>+	 * Otherwise, it can consume at most 1/4 of the PHB's
>+	 * M64 window.
>+	 */
>+	if (!pci_is_root_bus(pdev->bus) &&
>+	    pci_is_root_bus(pdev->bus->self->bus))
>+		limit = 128;
>+	else
>+		limit = 256;
>+

I sent it too fast. The limit should be reversed: 256 when the PF sits
behind the root port, and 128 otherwise. I will send a follow-up v2 after
waiting a couple of days in case there are comments on this revision.
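
A minimal sketch of the reversed selection, written as standalone C rather
than as the actual v2 hunk. vf_bar_limit_mb() and vf_bar_exceeds_gate() are
hypothetical names; the comparison mirrors the "size >= (limit * 0x100000)"
check quoted below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: gate in MB, 256 behind the root port, else 128. */
unsigned int vf_bar_limit_mb(bool pf_behind_root_port)
{
	return pf_behind_root_port ? 256 : 128;
}

/* Hypothetical helper: true when the VF BAR size reaches the gate. */
bool vf_bar_exceeds_gate(uint64_t vf_bar_size, bool pf_behind_root_port)
{
	uint64_t limit = vf_bar_limit_mb(pf_behind_root_port);

	return vf_bar_size >= limit * 0x100000;
}
```

Under this sketch, a 128MB VF BAR reaches the gate only when the PF is not
directly behind the root port.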

> 	pdn = pci_get_pdn(pdev);
> 	pdn->vfs_expanded = 0;
>
>@@ -2748,11 +2761,9 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
> 		}
>
> 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
>-
>-		/* bigger than 64M */
>-		if (size > (1 << 26)) {
>-			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size is bigger than 64M, roundup power2\n",
>-				 i, res);
>+		if (size >= (limit * 0x100000)) {
>+			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size is bigger than %lldMB, roundup power2\n",
>+				 i, res, limit);
> 			pdn->m64_per_iov = M64_PER_IOV;
> 			mul = roundup_pow_of_two(total_vfs);
> 			break;

Thanks,
Gavin


* Re: [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF
  2015-07-17  0:14 [PATCH 0/2] powerpc/powernv: Avoid compound PE for VF Gavin Shan
  2015-07-17  0:14 ` [PATCH 1/2] powerpc/powernv: Fix alignment for IOV BAR Gavin Shan
  2015-07-17  0:14 ` [PATCH 2/2] powerpc/powernv: Double VF BAR size for compound PE Gavin Shan
@ 2015-07-29 23:06 ` Gavin Shan
  2 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2015-07-29 23:06 UTC (permalink / raw)
  To: Gavin Shan; +Cc: linuxppc-dev, benh, paulus, aik

On Fri, Jul 17, 2015 at 10:14:41AM +1000, Gavin Shan wrote:
>When the VF BAR size is 128MB or bigger, the IOV BAR is extended to cover
>the maximal number of VFs supported by the PF, not 256. Also, one of the
>PHB's M64 BARs is picked to cover the VF BARs of 4 contiguous VFs, but that
>M64 BAR is configured as being owned by a single PE. As a result, those 4
>VFs have 4 separate PEs from the perspective of PCI config or DMA, but a
>single shared PE from the perspective of MMIO. Once we have a compound PE,
>the 4 VFs included in it can't be passed to separate guests with the VFIO
>infrastructure.
>
>The above gate (128MB) was chosen based on the assumption that one IOV BAR
>can consume 1/4 of the PHB's M64 window, which is 16GB. However, it can
>consume as much as half of the window (32GB) when the PF sits behind the
>root port. Accordingly, the gate can be doubled to 256MB in order to avoid
>compound PEs as far as we can.
>
>

Please ignore these two patches, as Richard has already sent a patch that
fixes this in a better way. Sorry for the noise!

Thanks,
Gavin

>Gavin Shan (2):
>  powerpc/powernv: Fix alignment for IOV BAR
>  powerpc/powernv: Double VF BAR size for compound PE
>
> arch/powerpc/platforms/powernv/pci-ioda.c | 56 +++++++++++++++++++++++++------
> 1 file changed, 45 insertions(+), 11 deletions(-)
>
>-- 
>2.1.0
>


