linuxppc-dev.lists.ozlabs.org archive mirror
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: linux-pci@vger.kernel.org, devicetree@vger.kernel.org,
	benh@kernel.crashing.org, bhelgaas@google.com, aik@ozlabs.ru,
	panto@antoniou-consulting.com, robherring2@gmail.com,
	grant.likely@linaro.org, Gavin Shan <gwshan@linux.vnet.ibm.com>
Subject: [PATCH v5 03/42] powerpc/powernv: M64 support improvement
Date: Thu,  4 Jun 2015 16:41:32 +1000
Message-ID: <1433400131-18429-4-git-send-email-gwshan@linux.vnet.ibm.com>
In-Reply-To: <1433400131-18429-1-git-send-email-gwshan@linux.vnet.ibm.com>

There is a limitation, enforced by hardware on PHB3 and by software
on P7IOC: M64 segment#x can only be assigned to PE#x, whereas IO and
M32 segments can be mapped to an arbitrary PE# via IODT and M32DT. So
the PE number must be x if M64 segment#x has been assigned to the PE.
Also, each PE owns at most one M64 segment. Currently, we reserve PE#
according to the root port's M64 window. That won't be reliable once
we extend the M64 window of the root port, or of the upstream port of
the PCIe switch behind the root port, to the PHB's M64 window in order
to support PCI hotplug in the future.

The patch reserves PE# for M64 segments according to the M64 resources
of the PCI devices (not bridges) contained in the PE. In addition, it
tracks the M64 segments consumed by each PE so that they can be
released at PCI unplug time.
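
As an illustration of the fixed mapping, here is a minimal standalone
sketch of how a device's M64 resource translates into a range of
segment numbers, and therefore PE numbers. It mirrors the pe_no/limit
arithmetic used in the patch, but it is not kernel code: EX_ALIGN,
struct ex_seg_range and ex_m64_seg_range() are hypothetical names made
up for this example, and the segment size is assumed to be a power of
two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define EX_ALIGN(x, a)	(((x) + ((a) - 1)) & ~((uint64_t)(a) - 1))

/* Half-open range [first, limit) of M64 segments covered by a resource. */
struct ex_seg_range {
	uint64_t first;	/* first segment touched, equal to the first PE# */
	uint64_t limit;	/* one past the last segment touched */
};

/*
 * Map a resource [start, end] (end inclusive, as in struct resource) that
 * lies inside the PHB's M64 window onto M64 segment numbers. Because
 * segment#x can only belong to PE#x, the same numbers are the PE#s that
 * have to be reserved.
 */
static struct ex_seg_range ex_m64_seg_range(uint64_t m64_base, uint64_t segsz,
					    uint64_t start, uint64_t end)
{
	struct ex_seg_range r;

	/* Assumptions of this sketch, not checks taken from the patch. */
	assert(segsz != 0 && (segsz & (segsz - 1)) == 0);
	assert(start >= m64_base && end > start);

	r.first = (start - m64_base) / segsz;
	r.limit = EX_ALIGN(end - m64_base, segsz) / segsz;
	return r;
}
```

For example, with a window base of 0x1000 and 0x100-byte segments, a
resource spanning 0x1200..0x13ff gives first = 2 and limit = 4, so
PE#2 and PE#3 would be reserved.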

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v5:
  * Made the changelog more descriptive about the fixed M64 seg# mapping
  * Dropped unnecessary comments and corrected others as pointed out by aik
  * Replaced "pe_bitsmap" with "pe_bitmap"
  * Fixed coding style issues reported by checkpatch.pl
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 189 ++++++++++++++++++------------
 arch/powerpc/platforms/powernv/pci.h      |  10 +-
 2 files changed, 121 insertions(+), 78 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 245ef81..71afb38 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -294,28 +294,78 @@ fail:
 	return -EIO;
 }
 
-static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb)
+/* We extend the M64 window of root port, or the upstream bridge port
+ * of the PCIE switch behind root port. So we shouldn't reserve PEs
+ * for M64 resources because there are no (normal) PCI devices consuming
+ * M64 resources on the PCI buses leading from root port, or the upstream
+ * bridge port. The function returns true if the indicated PCI bus needs
+ * reserved PEs because of M64 resources in advance. Otherwise, the
+ * function returns false.
+ */
+static bool pnv_ioda_need_m64_pe(struct pnv_phb *phb,
+				 struct pci_bus *bus)
 {
-	resource_size_t sgsz = phb->ioda.m64_segsize;
+	if (!bus || pci_is_root_bus(bus))
+		return false;
+
+	/* Bus leading from root port. We need to check what types of
+	 * PCI devices are on the bus. If it only connects PCI bridges,
+	 * we don't need to reserve M64 PEs for it. Otherwise, we still
+	 * need to do that.
+	 */
+	if (pci_is_root_bus(bus->self->bus)) {
+		struct pci_dev *pdev;
+
+		list_for_each_entry(pdev, &bus->devices, bus_list) {
+			if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL)
+				return true;
+		}
+
+		return false;
+	}
+
+	/* Bus leading from the upstream bridge port on top level */
+	if (pci_is_root_bus(bus->self->bus->self->bus))
+		return false;
+
+	return true;
+}
+
+static void pnv_ioda_reserve_m64_pe(struct pnv_phb *phb,
+				    struct pci_bus *bus)
+{
+	resource_size_t segsz = phb->ioda.m64_segsize;
 	struct pci_dev *pdev;
 	struct resource *r;
-	int base, step, i;
+	unsigned long pe_no, limit;
+	int i;
 
-	/*
-	 * Root bus always has full M64 range and root port has
-	 * M64 range used in reality. So we're checking root port
-	 * instead of root bus.
+	if (!pnv_ioda_need_m64_pe(phb, bus))
+		return;
+
+	/* The bridge's M64 window might have been extended to the
+	 * PHB's M64 window in order to support PCI hotplug. So the
+	 * bridge's M64 window can't be relied on for picking the
+	 * PE# for the PCI bus leading from it. We have to check the
+	 * M64 resources consumed by the PCI devices that sit on the
+	 * PCI bus.
 	 */
-	list_for_each_entry(pdev, &phb->hose->bus->devices, bus_list) {
-		for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
-			r = &pdev->resource[PCI_BRIDGE_RESOURCES + i];
-			if (!r->parent ||
-			    !pnv_pci_is_mem_pref_64(r->flags))
+	list_for_each_entry(pdev, &bus->devices, bus_list) {
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+#ifdef CONFIG_PCI_IOV
+			if (i >= PCI_IOV_RESOURCES && i <= PCI_IOV_RESOURCE_END)
+				continue;
+#endif
+			r = &pdev->resource[i];
+			if (!r->flags || r->start >= r->end ||
+			    !r->parent || !pnv_pci_is_mem_pref_64(r->flags))
 				continue;
 
-			base = (r->start - phb->ioda.m64_base) / sgsz;
-			for (step = 0; step < resource_size(r) / sgsz; step++)
-				pnv_ioda_reserve_pe(phb, base + step);
+			pe_no = (r->start - phb->ioda.m64_base) / segsz;
+			limit = ALIGN(r->end - phb->ioda.m64_base, segsz) /
+				segsz;
+			for (; pe_no < limit; pe_no++)
+				pnv_ioda_reserve_pe(phb, pe_no);
 		}
 	}
 }
@@ -327,85 +377,63 @@ static int pnv_ioda_pick_m64_pe(struct pnv_phb *phb,
 	struct pci_dev *pdev;
 	struct resource *r;
 	struct pnv_ioda_pe *master_pe, *pe;
-	unsigned long size, *pe_alloc;
-	bool found;
-	int start, i, j;
-
-	/* Root bus shouldn't use M64 */
-	if (pci_is_root_bus(bus))
-		return IODA_INVALID_PE;
-
-	/* We support only one M64 window on each bus */
-	found = false;
-	pci_bus_for_each_resource(bus, r, i) {
-		if (r && r->parent &&
-		    pnv_pci_is_mem_pref_64(r->flags)) {
-			found = true;
-			break;
-		}
-	}
+	unsigned long size, *pe_bitmap;
+	unsigned long pe_no, limit;
+	int i;
 
-	/* No M64 window found ? */
-	if (!found)
+	if (!pnv_ioda_need_m64_pe(phb, bus))
 		return IODA_INVALID_PE;
 
 	/* Allocate bitmap */
 	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
-	pe_alloc = kzalloc(size, GFP_KERNEL);
-	if (!pe_alloc) {
-		pr_warn("%s: Out of memory !\n",
-			__func__);
+	pe_bitmap = kzalloc(size, GFP_KERNEL);
+	if (!pe_bitmap)
 		return IODA_INVALID_PE;
-	}
 
-	/*
-	 * Figure out reserved PE numbers by the PE
-	 * the its child PEs.
-	 */
-	start = (r->start - phb->ioda.m64_base) / segsz;
-	for (i = 0; i < resource_size(r) / segsz; i++)
-		set_bit(start + i, pe_alloc);
-
-	if (all)
-		goto done;
-
-	/*
-	 * If the PE doesn't cover all subordinate buses,
-	 * we need subtract from reserved PEs for children.
+	/* The bridge's M64 window might be intentionally extended
+	 * to the PHB's M64 window to support PCI hotplug. So we
+	 * have to check the M64 resources consumed by the PCI
+	 * devices on the PCI bus.
 	 */
 	list_for_each_entry(pdev, &bus->devices, bus_list) {
-		if (!pdev->subordinate)
-			continue;
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+#ifdef CONFIG_PCI_IOV
+			if (i >= PCI_IOV_RESOURCES &&
+			    i <= PCI_IOV_RESOURCE_END)
+				continue;
+#endif
+			/* Don't scan bridge's window if the PE
+			 * doesn't contain its subordinate bus.
+			 */
+			if (!all && i >= PCI_BRIDGE_RESOURCES &&
+			    i <= PCI_BRIDGE_RESOURCE_END)
+				continue;
 
-		pci_bus_for_each_resource(pdev->subordinate, r, i) {
-			if (!r || !r->parent ||
-			    !pnv_pci_is_mem_pref_64(r->flags))
+			r = &pdev->resource[i];
+			if (!r->flags || r->start >= r->end ||
+			    !r->parent || !pnv_pci_is_mem_pref_64(r->flags))
 				continue;
 
-			start = (r->start - phb->ioda.m64_base) / segsz;
-			for (j = 0; j < resource_size(r) / segsz ; j++)
-				clear_bit(start + j, pe_alloc);
-                }
-        }
+			pe_no = (r->start - phb->ioda.m64_base) / segsz;
+			limit = ALIGN(r->end - phb->ioda.m64_base, segsz) /
+				segsz;
+			for (; pe_no < limit; pe_no++)
+				set_bit(pe_no, pe_bitmap);
+		}
+	}
 
-	/*
-	 * the current bus might not own M64 window and that's all
-	 * contributed by its child buses. For the case, we needn't
-	 * pick M64 dependent PE#.
-	 */
-	if (bitmap_empty(pe_alloc, phb->ioda.total_pe)) {
-		kfree(pe_alloc);
+	/* No M64 window found ? */
+	if (bitmap_empty(pe_bitmap, phb->ioda.total_pe)) {
+		kfree(pe_bitmap);
 		return IODA_INVALID_PE;
 	}
 
-	/*
-	 * Figure out the master PE and put all slave PEs to master
-	 * PE's list to form compound PE.
+	/* Figure out the master PE and put all slave PEs
+	 * to master PE's list to form compound PE.
 	 */
-done:
 	master_pe = NULL;
 	i = -1;
-	while ((i = find_next_bit(pe_alloc, phb->ioda.total_pe, i + 1)) <
+	while ((i = find_next_bit(pe_bitmap, phb->ioda.total_pe, i + 1)) <
 		phb->ioda.total_pe) {
 		pe = &phb->ioda.pe_array[i];
 
@@ -419,6 +447,13 @@ done:
 			list_add_tail(&pe->list, &master_pe->slaves);
 		}
 
+		/* Reserve the M64 segment, which should be available. Also,
+		 * those M64 segments consumed by slave PEs are contributed
+		 * to the master PE.
+		 */
+		BUG_ON(test_and_set_bit(pe->pe_number, phb->ioda.m64_segmap));
+		BUG_ON(test_and_set_bit(pe->pe_number, master_pe->m64_segmap));
+
 		/* P7IOC supports M64DT, which helps mapping M64 segment
 		 * to one particular PE#. However, PHB3 has fixed mapping
 		 * between M64 segment and PE#. In order to have same logic
@@ -440,7 +475,7 @@ done:
 		}
 	}
 
-	kfree(pe_alloc);
+	kfree(pe_bitmap);
 	return master_pe->pe_number;
 }
 
@@ -1233,7 +1268,7 @@ static void pnv_pci_ioda_setup_PEs(void)
 
 		/* M64 layout might affect PE allocation */
 		if (phb->reserve_m64_pe)
-			phb->reserve_m64_pe(phb);
+			phb->reserve_m64_pe(phb, phb->hose->bus);
 
 		pnv_ioda_setup_PEs(hose->bus);
 	}
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index fc6be02..54657f4 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -49,6 +49,13 @@ struct pnv_ioda_pe {
 	/* PE number */
 	unsigned int		pe_number;
 
+	/* IO/M32/M64 segments consumed by the PE. Each PE can
+	 * have one M64 segment at most, but M64 segments consumed
+	 * by slave PEs will be contributed to the master PE. One
+	 * PE can own multiple IO and M32 segments.
+	 */
+	unsigned long		m64_segmap[8];
+
 	/* "Weight" assigned to the PE for the sake of DMA resource
 	 * allocations
 	 */
@@ -113,7 +120,7 @@ struct pnv_phb {
 	u32 (*bdfn_to_pe)(struct pnv_phb *phb, struct pci_bus *bus, u32 devfn);
 	void (*shutdown)(struct pnv_phb *phb);
 	int (*init_m64)(struct pnv_phb *phb);
-	void (*reserve_m64_pe)(struct pnv_phb *phb);
+	void (*reserve_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus);
 	int (*pick_m64_pe)(struct pnv_phb *phb, struct pci_bus *bus, int all);
 	int (*get_pe_state)(struct pnv_phb *phb, int pe_no);
 	void (*freeze_pe)(struct pnv_phb *phb, int pe_no);
@@ -153,6 +160,7 @@ struct pnv_phb {
 			struct mutex		pe_alloc_mutex;
 
 			/* M32 & IO segment maps */
+			unsigned long		m64_segmap[8];
 			unsigned int		*m32_segmap;
 			unsigned int		*io_segmap;
 			struct pnv_ioda_pe	*pe_array;
-- 
2.1.0
