* [PATCH V9 00/11] VF EEH on Power8
@ 2015-07-17  6:02 Wei Yang
  2015-07-17  6:02 ` [PATCH V9 01/11] PCI/IOV: Rename and export virtfn_add/virtfn_remove Wei Yang
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

This patch set enables EEH on SR-IOV VFs. The general idea is to create a
proper eeh_dev (edev) and PE for each VF and handle them properly.

Unlike a Bus PE, a VF PE contains just one VF, so EEH error handling on a
VF PE differs in several ways.

First, removing and re-enumerating a VF relies on its PF. A VF is tightly
coupled to its PF, so it is not proper to enumerate a VF with the usual scan
procedure. That's why virtfn_add/virtfn_remove are exported in this patch
set.

Second, the reset/restore of a VF is done in kernel space. FW is not aware of
the VF, which means the usual reset function done in FW will not work. Two of
the patches imitate the FW reset/restore functions in kernel space.

Third, the VF may be removed during the PF's error_detected function. In this
case, the original error_detected->slot_reset->resume sequence is not proper
for those removed VFs, since they are re-created by the PF in a fresh state. A
flag in eeh_dev is introduced to mark that the eeh_dev is in an error state. By
doing so, we track whether this device needs to be reset or not.

This has been tested both on the host and in a guest on Power8 with the
latest kernel version.

v9:
   * split pcibios_bus_add_device() into a separate patch
   * Bjorn acked the PCI part and agreed that this patch set be merged through
     the ppc tree
   * rebased on mpe/linux.git next branch
v8:
   * fix on checking the return value of pnv_eeh_do_flr()
   * introduced a weak function pcibios_bus_add_device() to create PE for VFs
v7:
   * fix compile error when PCI_IOV is not set
v6:
   * code / commit log refactor by Gavin
v5:
   * remove the compound field, iterate on Master VF PE instead
   * some code refinement for PCI config restore and reset on VF:
     the wait time for assert and deassert
     PCI device address format
     check edev->pcie_cap and edev->aer_cap before accessing them
v4:
   * refine the change logs, comment and code style
   * change pnv_pci_fixup_vf_eeh() to pnv_eeh_vf_final_fixup() and remove the
     CONFIG_PCI_IOV macro
   * reorder patch 5/6 to make the logic more reasonable
   * remove remove_dev_pci_data()
   * remove the EEH_DEV_VF flag, use edev->physfn to identify a VF EEH DEV and
     remove related CONFIG_PCI_IOV macro
   * add the option for VF reset
   * fix the pnv_eeh_cfg_blocked() logic
   * replace pnv_pci_cfg_{read,write} with eeh_ops->{read,write}_config in
     pnv_eeh_vf_restore_config()
   * rename pnv_eeh_vf_restore_config() to pnv_eeh_restore_vf_config()
   * rename pnv_pci_fixup_vf_caps() to pnv_pci_vf_header_fixup() and move it
     to arch/powerpc/platforms/powernv/pci.c
   * add a field compound in pnv_ioda_pe to link compound PEs
   * handle compound PE for VF PEs
v3:
   * add back vf_index in pci_dn to track the VF's index
   * rename ppdev in eeh_dev to physfn for consistency
   * move edev->physfn assignment before dev->dev.archdata.edev is set
   * move pnv_pci_fixup_vf_eeh() and pnv_pci_fixup_vf_caps() to eeh-powernv.c
   * clearer and more detailed commit logs and code comments
   * merge eeh_rmv_virt_device() with eeh_rmv_device()
   * move the cfg_blocked check logic from pnv_eeh_read/write_config() to
     pnv_eeh_cfg_blocked()
   * move the vf reset/restore logic into its own patch, two patches are
     created.
     powerpc/powernv: Support PCI config restore for VFs
     powerpc/powernv: Support EEH reset for VFs
   * simplify the vf reset logic
v2:
   * add prefix pci_iov_ to virtfn_add/virtfn_remove
   * use EEH_DEV_VF as a flag for a VF's eeh_dev
   * use eeh_dev instead of edev in change log
   * remove vf_index in eeh_dev, calculate it from pdn->busno and devfn
   * do eeh_add_device_late() and eeh_sysfs_add_device() both after pci_dev is
     well initialized
   * do FLR to reset a VF PE
   * imitate the restore function in FW for VF
   * remove the reverse order patch, since it is still under discussion

Wei Yang (11):
  PCI/IOV: Rename and export virtfn_add/virtfn_remove
  PCI: Add pcibios_bus_add_device() weak function
  powerpc/pci: Cache VF index in pci_dn
  powerpc/pci: Remove VFs prior to PF
  powerpc/eeh: Cache only BARs, not windows or IOV BARs
  powerpc/powernv: EEH device for VF
  powerpc/eeh: Create PE for VFs
  powerpc/powernv: Support EEH reset for VF PE
  powerpc/powernv: Support PCI config restore for VFs
  powerpc/eeh: Support error recovery for VF PE
  powerpc/powernv: compound PE for VFs

 arch/powerpc/include/asm/eeh.h               |    4 +
 arch/powerpc/include/asm/pci-bridge.h        |    2 +
 arch/powerpc/kernel/eeh.c                    |    8 +
 arch/powerpc/kernel/eeh_cache.c              |    6 +-
 arch/powerpc/kernel/eeh_driver.c             |  100 +++++++++---
 arch/powerpc/kernel/eeh_pe.c                 |   13 +-
 arch/powerpc/kernel/pci-hotplug.c            |    2 +-
 arch/powerpc/kernel/pci_dn.c                 |   16 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c |  220 +++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci-ioda.c    |   46 +++++-
 arch/powerpc/platforms/powernv/pci.c         |   35 +++-
 drivers/pci/bus.c                            |    3 +
 drivers/pci/iov.c                            |   10 +-
 include/linux/pci.h                          |    8 +
 14 files changed, 428 insertions(+), 45 deletions(-)

-- 
1.7.9.5



* [PATCH V9 01/11] PCI/IOV: Rename and export virtfn_add/virtfn_remove
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 02/11] PCI: Add pcibios_bus_add_device() weak function Wei Yang
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

During EEH recovery, hotplug is applied to devices that have no driver
or whose driver doesn't support EEH. However, the hotplug, which was
implemented based on the PCI bus, can't be applied to a VF directly.

The patch renames virtfn_{add,remove}() to pci_iov_virtfn_{add,remove}()
and exports them so that they can be used in PCI hotplug during EEH
recovery.

[gwshan: changelog]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c   |   10 +++++-----
 include/linux/pci.h |    8 ++++++++
 2 files changed, 13 insertions(+), 5 deletions(-)
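
A minimal userspace sketch of how a caller such as EEH hotplug might use
the exported helpers: a VF is removed and re-created through its PF by
index instead of being found by a bus scan. All model_* names and
struct model_pf are hypothetical stand-ins, not kernel APIs; only the
(dev, id, reset) argument shape mirrors pci_iov_virtfn_add() and
pci_iov_virtfn_remove().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_MAX_VFS 8

/* Hypothetical stand-in for a PF and the VFs it currently exposes. */
struct model_pf {
	int  total_vfs;                 /* VFs the PF can provide (<= MODEL_MAX_VFS) */
	bool present[MODEL_MAX_VFS];    /* which VF indexes exist right now */
};

/* Models pci_iov_virtfn_add(dev, id, reset): create VF 'id' under the PF. */
static int model_virtfn_add(struct model_pf *pf, int id, int reset)
{
	(void)reset;
	if (id < 0 || id >= pf->total_vfs)
		return -EINVAL;
	if (pf->present[id])
		return -EEXIST;
	pf->present[id] = true;
	return 0;
}

/* Models pci_iov_virtfn_remove(dev, id, reset): tear down VF 'id'. */
static void model_virtfn_remove(struct model_pf *pf, int id, int reset)
{
	(void)reset;
	if (id >= 0 && id < pf->total_vfs)
		pf->present[id] = false;
}

/* Recovery-style hotplug of one VF: remove it, then re-create it via the PF. */
static int model_replug_vf(struct model_pf *pf, int id)
{
	model_virtfn_remove(pf, id, 0);
	return model_virtfn_add(pf, id, 0);
}
```

The point of the sketch is that the PF plus a VF index is enough to
re-create the VF; no bus rescan is involved.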

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ee0ebff..cc941dd 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -108,7 +108,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
 }
 
-static int virtfn_add(struct pci_dev *dev, int id, int reset)
+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset)
 {
 	int i;
 	int rc = -ENOMEM;
@@ -183,7 +183,7 @@ failed:
 	return rc;
 }
 
-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset)
 {
 	char buf[VIRTFN_ID_LEN];
 	struct pci_dev *virtfn;
@@ -320,7 +320,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	}
 
 	for (i = 0; i < initial; i++) {
-		rc = virtfn_add(dev, i, 0);
+		rc = pci_iov_virtfn_add(dev, i, 0);
 		if (rc)
 			goto failed;
 	}
@@ -332,7 +332,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 failed:
 	for (j = 0; j < i; j++)
-		virtfn_remove(dev, j, 0);
+		pci_iov_virtfn_remove(dev, j, 0);
 
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
 	pci_cfg_access_lock(dev);
@@ -361,7 +361,7 @@ static void sriov_disable(struct pci_dev *dev)
 		return;
 
 	for (i = 0; i < iov->num_VFs; i++)
-		virtfn_remove(dev, i, 0);
+		pci_iov_virtfn_remove(dev, i, 0);
 
 	pcibios_sriov_disable(dev);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8a0321a..3fed437 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1668,6 +1668,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset);
+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset);
 int pci_num_vf(struct pci_dev *dev);
 int pci_vfs_assigned(struct pci_dev *dev);
 int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
@@ -1685,6 +1687,12 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev *dev, int id)
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 { return -ENODEV; }
 static inline void pci_disable_sriov(struct pci_dev *dev) { }
+static inline int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset)
+{
+	return -ENOSYS;
+}
+static inline void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset)
+{ }
 static inline int pci_num_vf(struct pci_dev *dev) { return 0; }
 static inline int pci_vfs_assigned(struct pci_dev *dev)
 { return 0; }
-- 
1.7.9.5



* [PATCH V9 02/11] PCI: Add pcibios_bus_add_device() weak function
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
  2015-07-17  6:02 ` [PATCH V9 01/11] PCI/IOV: Rename and export virtfn_add/virtfn_remove Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 03/11] powerpc/pci: Cache VF index in pci_dn Wei Yang
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

This patch adds a weak function pcibios_bus_add_device() so that
arch-dependent code can do proper setup. For example, powerpc can set up
EEH-related resources.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/bus.c |    3 +++
 1 file changed, 3 insertions(+)
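
A minimal userspace sketch of the weak-hook pattern described above,
assuming a GCC- or Clang-compatible toolchain. The model_* names and
struct model_dev are hypothetical; the kernel spells the attribute
__weak. The generic code supplies a no-op default, and a strong
definition of the same symbol in another object file would replace it
at link time.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pci_dev. */
struct model_dev {
	int arch_setup_done;   /* would be set only by an arch override */
	int added;             /* set by the generic add path */
};

/*
 * Weak default: a no-op. A non-weak definition with the same name in
 * another translation unit (the "arch" code) takes precedence at link
 * time. __attribute__((weak)) is a GCC/Clang extension.
 */
__attribute__((weak)) void model_pcibios_bus_add_device(struct model_dev *dev)
{
	(void)dev;
}

/* Generic path: call the hook first, then do the common work. */
void model_pci_bus_add_device(struct model_dev *dev)
{
	model_pcibios_bus_add_device(dev);
	dev->added = 1;
}
```

With only the weak default linked in, the hook does nothing and the
generic path behaves as before, which is why architectures that do not
override it are unaffected.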

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 6fbd3f2..b7e30a7 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -267,6 +267,7 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx)
 
 void __weak pcibios_resource_survey_bus(struct pci_bus *bus) { }
 
+void __weak pcibios_bus_add_device(struct pci_dev *dev) { }
 /**
  * pci_bus_add_device - start driver for a single device
  * @dev: device to add
@@ -277,6 +278,8 @@ void pci_bus_add_device(struct pci_dev *dev)
 {
 	int retval;
 
+	pcibios_bus_add_device(dev);
+
 	/*
 	 * Can not put in pci_device_add yet because resources
 	 * are not assigned yet for some devices.
-- 
1.7.9.5



* [PATCH V9 03/11] powerpc/pci: Cache VF index in pci_dn
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
  2015-07-17  6:02 ` [PATCH V9 01/11] PCI/IOV: Rename and export virtfn_add/virtfn_remove Wei Yang
  2015-07-17  6:02 ` [PATCH V9 02/11] PCI: Add pcibios_bus_add_device() weak function Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 04/11] powerpc/pci: Remove VFs prior to PF Wei Yang
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

The patch caches the VF index in pci_dn, which can be used to calculate
the VF's bus, device and function number. That information helps to
locate the VF's PCI device instance when doing hotplug during EEH
recovery if necessary.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h |    1 +
 arch/powerpc/kernel/pci_dn.c          |    4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)
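
A small userspace sketch of how a VF index maps to a bus, device and
function number, following the SR-IOV rule that a VF's routing ID is the
PF's routing ID plus First VF Offset plus index times VF Stride. The
struct and the model_* helpers are hypothetical; in the kernel the
equivalent lookups are pci_iov_virtfn_bus() and pci_iov_virtfn_devfn().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PF fields that matter here. */
struct model_pf {
	uint8_t  bus;      /* PF bus number */
	uint8_t  devfn;    /* PF device/function byte */
	uint16_t offset;   /* SR-IOV First VF Offset */
	uint16_t stride;   /* SR-IOV VF Stride */
};

/* Routing-ID distance from function 00.0 on the PF's bus to VF 'vf_index'. */
static unsigned int model_vf_rid(const struct model_pf *pf, int vf_index)
{
	return pf->devfn + pf->offset + pf->stride * (unsigned int)vf_index;
}

/* Bus number of the VF: overflow past devfn 0xff spills onto later buses. */
static unsigned int model_vf_bus(const struct model_pf *pf, int vf_index)
{
	return (pf->bus + (model_vf_rid(pf, vf_index) >> 8)) & 0xff;
}

/* Device/function byte of the VF. */
static unsigned int model_vf_devfn(const struct model_pf *pf, int vf_index)
{
	return model_vf_rid(pf, vf_index) & 0xff;
}

/* Same split as the kernel's PCI_SLOT() and PCI_FUNC() macros. */
static unsigned int model_slot(unsigned int devfn) { return (devfn >> 3) & 0x1f; }
static unsigned int model_func(unsigned int devfn) { return devfn & 0x07; }
```

For example, with the PF at bus 1, devfn 0, offset 128 and stride 2,
VF index 3 lands on bus 1, slot 16, function 6.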

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 712add5..7a72f68 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -210,6 +210,7 @@ struct pci_dn {
 #define IODA_INVALID_PE		(-1)
 #ifdef CONFIG_PPC_POWERNV
 	int	pe_number;
+	int     vf_index;		/* VF index in the PF */
 #ifdef CONFIG_PCI_IOV
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled*/
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index b3b4df9..f771130 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -139,6 +139,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
 #ifdef CONFIG_PCI_IOV
 static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 					   struct pci_dev *pdev,
+					   int vf_index,
 					   int busno, int devfn)
 {
 	struct pci_dn *pdn;
@@ -157,6 +158,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 	pdn->parent = parent;
 	pdn->busno = busno;
 	pdn->devfn = devfn;
+	pdn->vf_index = vf_index;
 #ifdef CONFIG_PPC_POWERNV
 	pdn->pe_number = IODA_INVALID_PE;
 #endif
@@ -196,7 +198,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 		return NULL;
 
 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
-		pdn = add_one_dev_pci_data(parent, NULL,
+		pdn = add_one_dev_pci_data(parent, NULL, i,
 					   pci_iov_virtfn_bus(pdev, i),
 					   pci_iov_virtfn_devfn(pdev, i));
 		if (!pdn) {
-- 
1.7.9.5



* [PATCH V9 04/11] powerpc/pci: Remove VFs prior to PF
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (2 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 03/11] powerpc/pci: Cache VF index in pci_dn Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 05/11] powerpc/eeh: Cache only BARs, not windows or IOV BARs Wei Yang
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

As commit ac205b7bb72f ("PCI: make sriov work with hotplug remove") indicates,
VFs, which might be hooked to the same PCI bus as their PF, should be removed
before the PF. Otherwise, PCI hot unplug on that bus would cause a kernel
crash.

The patch applies the above pattern to PowerPC PCI hotplug path.

[gwshan: changelog]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/pci-hotplug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
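
A toy userspace model of why the walk direction matters. It assumes, for
illustration only, that the PF sits first on the bus, that its VFs follow
it, and that removing the PF also tears down all of its VFs. The
model_* names are hypothetical and the array stands in for the kernel's
bus->devices list.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NDEV 4

/* Hypothetical bus: entry 0 is the PF, entries 1..3 are its VFs. */
struct model_bus {
	bool is_vf[MODEL_NDEV];
	bool gone[MODEL_NDEV];
};

/* Model assumption: removing the PF also removes every VF. */
static void model_remove(struct model_bus *bus, int i)
{
	bus->gone[i] = true;
	if (!bus->is_vf[i])
		for (int j = 0; j < MODEL_NDEV; j++)
			if (bus->is_vf[j])
				bus->gone[j] = true;
}

/*
 * Walk the bus and remove each device. Returns how many times the walk
 * reached an entry that had already been removed behind its back, which
 * in a real linked list would be a stale pointer.
 */
static int model_remove_all(struct model_bus *bus, bool reverse)
{
	int stale = 0;

	for (int n = 0; n < MODEL_NDEV; n++) {
		int i = reverse ? MODEL_NDEV - 1 - n : n;

		if (bus->gone[i]) {
			stale++;
			continue;
		}
		model_remove(bus, i);
	}
	return stale;
}
```

In this model the forward walk hits three already-removed entries, while
the reverse walk removes the VFs first and never touches a stale entry.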

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 7f9ed0c..59c4361 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -55,7 +55,7 @@ void pcibios_remove_pci_devices(struct pci_bus *bus)
 
 	pr_debug("PCI: Removing devices on bus %04x:%02x\n",
 		 pci_domain_nr(bus),  bus->number);
-	list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
+	list_for_each_entry_safe_reverse(dev, tmp, &bus->devices, bus_list) {
 		pr_debug("   Removing %s...\n", pci_name(dev));
 		pci_stop_and_remove_bus_device(dev);
 	}
-- 
1.7.9.5



* [PATCH V9 05/11] powerpc/eeh: Cache only BARs, not windows or IOV BARs
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (3 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 04/11] powerpc/pci: Remove VFs prior to PF Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 06/11] powerpc/powernv: EEH device for VF Wei Yang
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

The EEH address cache, which helps locate the PCI device for a given
(physical) MMIO address, didn't cover PCI bridges. Also, it shouldn't
return the PF for an address in the PF's IOV BARs. Instead, the VFs
should be returned.

Also, by doing so, it removes the type check in
eeh_addr_cache_insert_dev(), since a bridge's windows would not be cached.

The patch restricts the address cache to the first 7 resources (the six
standard BARs plus the expansion ROM) for the above purposes.

[gwshan: changelog]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_cache.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
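
A userspace sketch of which resource indexes the new loop bound covers.
The enum values are assumptions that mirror the kernel's per-device
resource layout with CONFIG_PCI_IOV enabled (six BARs, ROM, six IOV
BARs, four bridge windows); the MODEL_* and model_* names are
hypothetical.

```c
#include <assert.h>

/* Assumed resource index layout; not taken from a kernel header here. */
enum {
	MODEL_STD_RESOURCES       = 0,                          /* BAR0 */
	MODEL_STD_RESOURCE_END    = 5,                          /* BAR5 */
	MODEL_ROM_RESOURCE,                                     /* 6: expansion ROM */
	MODEL_IOV_RESOURCES,                                    /* 7: first IOV BAR */
	MODEL_IOV_RESOURCE_END    = MODEL_IOV_RESOURCES + 5,    /* 12 */
	MODEL_BRIDGE_RESOURCES,                                 /* 13: first window */
	MODEL_BRIDGE_RESOURCE_END = MODEL_BRIDGE_RESOURCES + 3, /* 16 */
	MODEL_DEVICE_COUNT_RESOURCE                             /* 17 */
};

/* New bound from the patch: indexes 0..ROM inclusive are cached. */
static int model_is_cached(int resno)
{
	return resno >= 0 && resno <= MODEL_ROM_RESOURCE;
}

/* Count how many of all per-device resources end up in the cache. */
static int model_cached_count(void)
{
	int count = 0;

	for (int i = 0; i < MODEL_DEVICE_COUNT_RESOURCE; i++)
		if (model_is_cached(i))
			count++;
	return count;
}
```

Under these assumptions, IOV BARs and bridge windows fall outside the
bound, which is what makes the separate bridge class check unnecessary.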

diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index a1e86e1..e6887f0 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -196,7 +196,7 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev *dev)
 	}
 
 	/* Walk resources on this device, poke them into the tree */
-	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
 		resource_size_t start = pci_resource_start(dev,i);
 		resource_size_t end = pci_resource_end(dev,i);
 		unsigned long flags = pci_resource_flags(dev,i);
@@ -222,10 +222,6 @@ void eeh_addr_cache_insert_dev(struct pci_dev *dev)
 {
 	unsigned long flags;
 
-	/* Ignore PCI bridges */
-	if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE)
-		return;
-
 	spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
 	__eeh_addr_cache_insert_dev(dev);
 	spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
-- 
1.7.9.5



* [PATCH V9 06/11] powerpc/powernv: EEH device for VF
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (4 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 05/11] powerpc/eeh: Cache only BARs, not windows or IOV BARs Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 07/11] powerpc/eeh: Create PE for VFs Wei Yang
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

VFs and their corresponding pci_dn instances are created and released
dynamically as their PF's SRIOV capability is enabled and disabled.
The patch creates and releases EEH devices for VFs when creating and
releasing their pci_dn instances, which means EEH devices and pci_dn
instances have the same life cycle. Also, a VF's EEH device is
identified by (struct eeh_dev::physfn).

[gwshan: changelog and removed CONFIG_PCI_IOV]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h |    1 +
 arch/powerpc/kernel/pci_dn.c   |   12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index c5eb86f..6c383ad 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -140,6 +140,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	struct pci_dev *physfn;		/* Associated PF PORT		*/
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
 
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index f771130..f0ddde7 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -180,7 +180,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 {
 #ifdef CONFIG_PCI_IOV
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pci_dn *parent, *pdn;
+	struct eeh_dev *edev;
 	int i;
 
 	/* Only support IOV for now */
@@ -206,6 +208,9 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 				 __func__, i);
 			return NULL;
 		}
+		eeh_dev_init(pdn, hose);
+		edev = pdn_to_eeh_dev(pdn);
+		edev->physfn = pdev;
 	}
 #endif /* CONFIG_PCI_IOV */
 
@@ -254,10 +259,17 @@ void remove_dev_pci_data(struct pci_dev *pdev)
 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
 		list_for_each_entry_safe(pdn, tmp,
 			&parent->child_list, list) {
+			struct eeh_dev *edev;
 			if (pdn->busno != pci_iov_virtfn_bus(pdev, i) ||
 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
 				continue;
 
+			edev = pdn_to_eeh_dev(pdn);
+			if (edev) {
+				pdn->edev = NULL;
+				kfree(edev);
+			}
+
 			if (!list_empty(&pdn->list))
 				list_del(&pdn->list);
 
-- 
1.7.9.5



* [PATCH V9 07/11] powerpc/eeh: Create PE for VFs
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (5 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 06/11] powerpc/powernv: EEH device for VF Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 08/11] powerpc/powernv: Support EEH reset for VF PE Wei Yang
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

The current EEH recovery code assumes that a PE has a primary bus.
Unfortunately, that's not true for VF PEs, which contain one VF or, in
the VF group case, multiple VFs.

The patch creates PEs for VFs in the weak function
pcibios_bus_add_device(). Those PEs for VFs are identified with the newly
introduced flag EEH_PE_VF so that we handle them differently during EEH
recovery.

[gwshan: changelog and code refactoring]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |    1 +
 arch/powerpc/kernel/eeh_pe.c                 |   10 ++++++++--
 arch/powerpc/platforms/powernv/eeh-powernv.c |   16 ++++++++++++++++
 3 files changed, 25 insertions(+), 2 deletions(-)
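
A tiny userspace sketch of the selection rule this patch adds: an EEH
device whose physfn is set is treated as a VF and gets a VF PE, anything
else gets a device PE as before. The model_* types are hypothetical; the
two flag values are copied from the eeh.h hunk in this patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PE_DEVICE (1 << 2)   /* same value as EEH_PE_DEVICE */
#define MODEL_PE_VF     (1 << 4)   /* same value as EEH_PE_VF */

/* Hypothetical stand-ins for struct pci_dev and struct eeh_dev. */
struct model_pdev { int id; };
struct model_edev { struct model_pdev *physfn; };   /* non-NULL only for a VF */

/* Pick the PE type to allocate for this EEH device. */
static int model_pe_type(const struct model_edev *edev)
{
	return edev->physfn ? MODEL_PE_VF : MODEL_PE_DEVICE;
}
```

Keying on physfn means no separate "is a VF" flag is needed in the EEH
device.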

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 6c383ad..ec21f8f 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -72,6 +72,7 @@ struct pci_dn;
 #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
 #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
 #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
 
 #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
 #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 35f0b62..260a701 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -299,7 +299,10 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
 	 * EEH device already having associated PE, but
 	 * the direct parent EEH device doesn't have yet.
 	 */
-	pdn = pdn ? pdn->parent : NULL;
+	if (edev->physfn)
+		pdn = pci_get_pdn(edev->physfn);
+	else
+		pdn = pdn ? pdn->parent : NULL;
 	while (pdn) {
 		/* We're poking out of PCI territory */
 		parent = pdn_to_eeh_dev(pdn);
@@ -382,7 +385,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	}
 
 	/* Create a new EEH PE */
-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
+	if (edev->physfn)
+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
+	else
+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
 	if (!pe) {
 		pr_err("%s: out of memory!\n", __func__);
 		return -ENOMEM;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 5cf5e6e..e9aec1d 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1524,6 +1524,22 @@ static struct eeh_ops pnv_eeh_ops = {
 	.restore_config		= pnv_eeh_restore_config
 };
 
+void pcibios_bus_add_device(struct pci_dev *pdev)
+{
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+
+	if (!pdev->is_virtfn)
+		return;
+
+	/*
+	 * The following operations will fail if VF's sysfs files
+	 * aren't created or its resources aren't finalized.
+	 */
+	eeh_add_device_early(pdn);
+	eeh_add_device_late(pdev);
+	eeh_sysfs_add_device(pdev);
+}
+
 /**
  * eeh_powernv_init - Register platform dependent EEH operations
  *
-- 
1.7.9.5



* [PATCH V9 08/11] powerpc/powernv: Support EEH reset for VF PE
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (6 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 07/11] powerpc/eeh: Create PE for VFs Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 09/11] powerpc/powernv: Support PCI config restore for VFs Wei Yang
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

PEs for VFs don't have a primary bus, so they need their own reset
backend, which is used during EEH recovery. The patch implements the reset
backend for VF PEs by issuing FLR or AF FLR to the VFs contained in the
PE.

[gwshan: changelog and code refactoring]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |  134 +++++++++++++++++++++++++-
 2 files changed, 134 insertions(+), 1 deletion(-)
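
A userspace sketch of the polling loop used before issuing the reset:
the status word is read up to four times, with a doubling delay of 100,
200, 400 and 800 ms after each read that still shows the pending bit.
Sleeping is modelled by adding to a counter, and the reader callback,
the model_* names and the 0x20 mask value are all assumptions for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register reader: returns the current status word. */
typedef uint16_t (*model_read_status_fn)(void *ctx);

/*
 * Poll until (status & mask) clears. Instead of blocking, the delay that
 * would have been slept is accumulated in *slept_ms. Returns true if the
 * bit cleared, false if it was still set after four reads.
 */
static bool model_wait_for_pending(model_read_status_fn read_status, void *ctx,
				   uint16_t mask, unsigned int *slept_ms)
{
	for (int i = 0; i < 4; i++) {
		if (!(read_status(ctx) & mask))
			return true;
		*slept_ms += (1u << i) * 100;
	}
	return false;
}

/* Fake device: reports "pending" for the first *ctx reads, then clear. */
static uint16_t model_status_after(void *ctx)
{
	int *reads_left = ctx;

	if (*reads_left > 0) {
		(*reads_left)--;
		return 0x20;
	}
	return 0;
}
```

In the worst case the loop waits 1500 ms in total before giving up; the
patch only warns in that case and still issues the reset.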

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index ec21f8f..331c856 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -136,6 +136,7 @@ struct eeh_dev {
 	int pcix_cap;			/* Saved PCIx capability	*/
 	int pcie_cap;			/* Saved PCIe capability	*/
 	int aer_cap;			/* Saved AER capability		*/
+	int af_cap;			/* Saved AF capability		*/
 	struct eeh_pe *pe;		/* Associated PE		*/
 	struct list_head list;		/* Form link list in the PE	*/
 	struct pci_controller *phb;	/* Associated PHB		*/
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index e9aec1d..8d88be1 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -404,6 +404,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
 	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
 	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
+	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
@@ -893,6 +894,127 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return 0;
 }
 
+static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, int pos,
+				     u16 mask, bool af_flr_rst)
+{
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+	int status, i;
+
+	/* Wait for Transaction Pending bit to be cleared */
+	for (i = 0; i < 4; i++) {
+		eeh_ops->read_config(pdn, pos, 2, &status);
+		if (!(status & mask))
+			return;
+
+		msleep((1 << i) * 100);
+	}
+
+	pr_warn("%s: Pending transaction while issuing %s FLR to "
+		"%04x:%02x:%02x.%01x\n",
+		__func__, af_flr_rst ? "AF" : "",
+		edev->phb->global_number, pdn->busno,
+		PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+}
+
+static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
+{
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+	u32 reg;
+
+	if (!edev->pcie_cap)
+		return -ENOTTY;
+
+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg);
+	if (!(reg & PCI_EXP_DEVCAP_FLR))
+		return -ENOTTY;
+
+	switch (option) {
+	case EEH_RESET_HOT:
+	case EEH_RESET_FUNDAMENTAL:
+		pnv_eeh_wait_for_pending(pdn, edev->pcie_cap + PCI_EXP_DEVSTA,
+					 PCI_EXP_DEVSTA_TRPND, false);
+		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				     4, &reg);
+		reg |= PCI_EXP_DEVCTL_BCR_FLR;
+		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				      4, reg);
+		msleep(EEH_PE_RST_HOLD_TIME);
+		break;
+	case EEH_RESET_DEACTIVATE:
+		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				     4, &reg);
+		reg &= ~PCI_EXP_DEVCTL_BCR_FLR;
+		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				      4, reg);
+		msleep(EEH_PE_RST_SETTLE_TIME);
+		break;
+	}
+
+	return 0;
+}
+
+static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
+{
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+	u32 cap;
+
+	if (!edev->af_cap)
+		return -ENOTTY;
+
+	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
+	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
+		return -ENOTTY;
+
+	switch (option) {
+	case EEH_RESET_HOT:
+	case EEH_RESET_FUNDAMENTAL:
+		/*
+		 * Wait for Transaction Pending bit to clear. A word-aligned
+		 * test is used, so we use the control offset rather than status
+		 * and shift the test bit to match.
+		 */
+		pnv_eeh_wait_for_pending(pdn, edev->af_cap + PCI_AF_CTRL,
+					 PCI_AF_STATUS_TP << 8, true);
+		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL,
+				      1, PCI_AF_CTRL_FLR);
+		msleep(EEH_PE_RST_HOLD_TIME);
+		break;
+	case EEH_RESET_DEACTIVATE:
+		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, 0);
+		msleep(EEH_PE_RST_SETTLE_TIME);
+		break;
+	}
+
+	return 0;
+}
+
+static int pnv_eeh_reset_vf(struct pci_dn *pdn, int option)
+{
+	int ret;
+
+	ret = pnv_eeh_do_flr(pdn, option);
+	if (ret != -ENOTTY)
+		return ret;
+
+	return pnv_eeh_do_af_flr(pdn, option);
+}
+
+static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
+{
+	struct eeh_dev *edev, *tmp;
+	struct pci_dn *pdn;
+	int ret;
+
+	eeh_pe_for_each_dev(pe, edev, tmp) {
+		pdn = eeh_dev_to_pdn(edev);
+		ret = pnv_eeh_reset_vf(pdn, option);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	struct pci_controller *hose;
@@ -968,7 +1090,9 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 		}
 
 		bus = eeh_pe_bus_get(pe);
-		if (pci_is_root_bus(bus) ||
+		if (pe->type & EEH_PE_VF)
+			ret = pnv_eeh_vf_pe_reset(pe, option);
+		else if (pci_is_root_bus(bus) ||
 			pci_is_root_bus(bus->parent))
 			ret = pnv_eeh_root_reset(hose, option);
 		else
@@ -1108,6 +1232,14 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
 	if (!edev || !edev->pe)
 		return false;
 
+	/*
+	 * We will issue FLR or AF FLR to all VFs, which are contained
+	 * in VF PE. It relies on the EEH PCI config accessors. So we
+	 * can't block them during the window.
+	 */
+	if ((edev->physfn) && (edev->pe->state & EEH_PE_RESET))
+		return false;
+
 	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
 		return true;
 
-- 
1.7.9.5



* [PATCH V9 09/11] powerpc/powernv: Support PCI config restore for VFs
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (7 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 08/11] powerpc/powernv: Support EEH reset for VF PE Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 10/11] powerpc/eeh: Support error recovery for VF PE Wei Yang
  2015-07-17  6:02 ` [PATCH V9 11/11] powerpc/powernv: compound PE for VFs Wei Yang
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

After a PE reset, the OPAL API opal_pci_reinit() is called on all devices
contained in the PE to reinitialize them. However, VFs can't be seen
by the skiboot firmware. We have to implement functions, similar to
those in skiboot, that reinitialize the VFs after a VF PE has been
reset.
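
The MPS restore step in this patch can be sketched on its own as below. This
is only an illustration, not the kernel code: the helper names are made up
here, and the sketch assumes the saved MPS is a power of two between 128 and
4096 bytes. The patch itself does not check this, so a VF whose pdn->mps was
never recorded would need separate handling.

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/* Mask of the Max_Payload_Size field (bits 7:5) in PCIe Device Control. */
#define SKETCH_DEVCTL_PAYLOAD	0x00e0

/*
 * Encode an MPS value given in bytes into the Device Control field:
 * 128 bytes -> 0, 256 -> 1, ..., 4096 -> 5, shifted to bits 7:5.
 */
static unsigned int sketch_mps_to_devctl(int mps_bytes)
{
	/* Assumption of this sketch: a valid power-of-two payload size. */
	assert(mps_bytes >= 128 && mps_bytes <= 4096);
	assert((mps_bytes & (mps_bytes - 1)) == 0);

	/* ffs(128) == 8, so 128 bytes maps to field value 0. */
	return (unsigned int)(ffs(mps_bytes) - 8) << 5;
}

/* Replace only the payload field of a previously read DEVCTL value. */
static unsigned int sketch_devctl_restore_mps(unsigned int devctl,
					      int mps_bytes)
{
	devctl &= ~SKETCH_DEVCTL_PAYLOAD;
	devctl |= sketch_mps_to_devctl(mps_bytes);
	return devctl;
}
```

For example, a saved MPS of 256 bytes encodes to 0x20, and all other bits of
the register value passed in are left untouched.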

[gwshan: changelog and code refactoring]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h        |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |   70 +++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.c         |   18 +++++++
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 7a72f68..c927d5b 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -220,6 +220,7 @@ struct pci_dn {
 #define IODA_INVALID_M64        (-1)
 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
 #endif /* CONFIG_PCI_IOV */
+	int	mps;
 #endif
 	struct list_head child_list;
 	struct list_head list;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 8d88be1..b09c0d1 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1616,6 +1616,67 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 	return ret;
 }
 
+static int pnv_eeh_restore_vf_config(struct pci_dn *pdn)
+{
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+	u32 devctl, cmd, cap2, aer_capctl;
+	int old_mps;
+
+	/* Restore MPS */
+	if (edev->pcie_cap) {
+		old_mps = (ffs(pdn->mps) - 8) << 5;
+		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				     2, &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
+		devctl |= old_mps;
+		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				      2, devctl);
+	}
+
+	/* Disable Completion Timeout */
+	if (edev->pcie_cap) {
+		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP2,
+				     4, &cap2);
+		if (cap2 & 0x10) {
+			eeh_ops->read_config(pdn,
+					edev->pcie_cap + PCI_EXP_DEVCTL2,
+					4, &cap2);
+			cap2 |= 0x10;
+			eeh_ops->write_config(pdn,
+					edev->pcie_cap + PCI_EXP_DEVCTL2,
+					4, cap2);
+		}
+	}
+
+	/* Enable SERR and parity checking */
+	eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
+	eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
+
+	/* Enable reporting of various errors */
+	if (edev->pcie_cap) {
+		eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				2, &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_CERE;
+		devctl |= (PCI_EXP_DEVCTL_NFERE |
+			   PCI_EXP_DEVCTL_FERE |
+			   PCI_EXP_DEVCTL_URRE);
+		eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL,
+				2, devctl);
+	}
+
+	/* Enable ECRC generation and check */
+	if (edev->pcie_cap && edev->aer_cap) {
+		eeh_ops->read_config(pdn, edev->aer_cap + PCI_ERR_CAP,
+				4, &aer_capctl);
+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
+		eeh_ops->write_config(pdn, edev->aer_cap + PCI_ERR_CAP,
+				4, aer_capctl);
+	}
+
+	return 0;
+}
+
 static int pnv_eeh_restore_config(struct pci_dn *pdn)
 {
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -1626,7 +1687,14 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
 		return -EEXIST;
 
 	phb = edev->phb->private_data;
-	ret = opal_pci_reinit(phb->opal_id,
+	/*
+	 * We have to restore the PCI config space after reset since the
+	 * firmware can't see SRIOV VFs.
+	 */
+	if (edev->physfn)
+		ret = pnv_eeh_restore_vf_config(pdn);
+	else
+		ret = opal_pci_reinit(phb->opal_id,
 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
 	if (ret) {
 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 765d8ed..0e4f42e 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -788,6 +788,24 @@ static void pnv_p7ioc_rc_quirk(struct pci_dev *dev)
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_IBM, 0x3b9, pnv_p7ioc_rc_quirk);
 
+#ifdef CONFIG_PCI_IOV
+static void pnv_pci_fixup_vf_mps(struct pci_dev *pdev)
+{
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	int parent_mps;
+
+	if (!pdev->is_virtfn)
+		return;
+
+	/* Synchronize MPS for VF and PF */
+	parent_mps = pcie_get_mps(pdev->physfn);
+	if ((128 << pdev->pcie_mpss) >= parent_mps)
+		pcie_set_mps(pdev, parent_mps);
+	pdn->mps = pcie_get_mps(pdev);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_mps);
+#endif /* CONFIG_PCI_IOV */
+
 void __init pnv_pci_init(void)
 {
 	struct device_node *np;
-- 
1.7.9.5


* [PATCH V9 10/11] powerpc/eeh: Support error recovery for VF PE
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (8 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 09/11] powerpc/powernv: Support PCI config restore for VFs Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-17  6:02 ` [PATCH V9 11/11] powerpc/powernv: compound PE for VFs Wei Yang
  10 siblings, 0 replies; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

Unlike a PCI bus dependent PE, a PE for VFs doesn't have a primary
bus on which PCI hotplug is implemented. This patch supports error
recovery for VF PEs, in particular PCI hotplug. Hotplug on a VF PE
is implemented per VF, no longer per PCI bus.
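
Besides the hotplug change, the patch adds an "in_error" flag to struct
eeh_dev. Its intended life cycle can be sketched as a small stand-alone
model. The struct and function names below are hypothetical; only the flag
transitions follow the diff (set in eeh_report_error(), checked in
eeh_report_reset() and eeh_report_resume(), cleared on resume and on device
removal).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct eeh_dev; only the flag matters here. */
struct sketch_edev {
	bool in_error;		/* set once error_detected was reported */
	int  reset_calls;	/* how often slot_reset would have run */
	int  resume_calls;	/* how often resume would have run */
};

/* Models eeh_report_error(): mark the device as being in error state. */
static void sketch_report_error(struct sketch_edev *edev)
{
	assert(edev != NULL);
	edev->in_error = true;
}

/* Models eeh_report_reset(): skip devices that never saw the error. */
static void sketch_report_reset(struct sketch_edev *edev)
{
	assert(edev != NULL);
	if (!edev->in_error)
		return;
	edev->reset_calls++;
}

/* Models eeh_report_resume(): call back only if flagged, then clear. */
static void sketch_report_resume(struct sketch_edev *edev)
{
	assert(edev != NULL);
	if (edev->in_error)
		edev->resume_calls++;
	edev->in_error = false;
}

/* Models eeh_remove_device(): a removed device starts from a clean flag. */
static void sketch_remove_device(struct sketch_edev *edev)
{
	assert(edev != NULL);
	edev->in_error = false;
}
```

In this model, a VF that is removed after the error report goes through
neither the reset nor the resume callback, which matches the case where the
PF re-creates the VF in a fresh state.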

[gwshan: changelog and code refactoring]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h   |    1 +
 arch/powerpc/kernel/eeh.c        |    8 +++
 arch/powerpc/kernel/eeh_driver.c |  100 ++++++++++++++++++++++++++++++--------
 arch/powerpc/kernel/eeh_pe.c     |    3 +-
 4 files changed, 90 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 331c856..ea1f13c4 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -142,6 +142,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	int    in_error;		/* Error flag for eeh_dev	*/
 	struct pci_dev *physfn;		/* Associated PF PORT		*/
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index af9b597..28e4d73 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1227,6 +1227,14 @@ void eeh_remove_device(struct pci_dev *dev)
 	 * from the parent PE during the BAR resotre.
 	 */
 	edev->pdev = NULL;
+
+	/*
+	 * The flag "in_error" tracks whether a VF's EEH device is in
+	 * error state. It's set in eeh_report_error(). If it's not
+	 * set, eeh_report_{reset,resume}() skip the driver callbacks
+	 * for the VF EEH device.
+	 */
+	edev->in_error = 0;
 	dev->dev.archdata.edev = NULL;
 	if (!(edev->pe->state & EEH_PE_KEEP))
 		eeh_rmv_from_parent_pe(edev);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 89eb4bc..99868e2 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -211,6 +211,7 @@ static void *eeh_report_error(void *data, void *userdata)
 	if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
 	if (*res == PCI_ERS_RESULT_NONE) *res = rc;
 
+	edev->in_error = 1;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -282,7 +283,8 @@ static void *eeh_report_reset(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->slot_reset ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		eeh_pcid_put(dev);
 		return NULL;
 	}
@@ -339,14 +341,16 @@ static void *eeh_report_resume(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->resume ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		edev->mode &= ~EEH_DEV_NO_HANDLER;
-		eeh_pcid_put(dev);
-		return NULL;
+		goto out;
 	}
 
 	driver->err_handler->resume(dev);
 
+out:
+	edev->in_error = 0;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -386,12 +390,38 @@ static void *eeh_report_failure(void *data, void *userdata)
 	return NULL;
 }
 
+static void *eeh_add_virt_device(void *data, void *userdata)
+{
+	struct pci_driver *driver;
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	if (!(edev->physfn)) {
+		pr_warn("%s: EEH dev %04x:%02x:%02x.%01x not for VF\n",
+			__func__, edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+		return NULL;
+	}
+
+	driver = eeh_pcid_get(dev);
+	if (driver) {
+		eeh_pcid_put(dev);
+		if (driver->err_handler)
+			return NULL;
+	}
+
+	pci_iov_virtfn_add(edev->physfn, pdn->vf_index, 0);
+	return NULL;
+}
+
 static void *eeh_rmv_device(void *data, void *userdata)
 {
 	struct pci_driver *driver;
 	struct eeh_dev *edev = (struct eeh_dev *)data;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	int *removed = (int *)userdata;
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 
 	/*
 	 * Actually, we should remove the PCI bridges as well.
@@ -416,7 +446,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
 	driver = eeh_pcid_get(dev);
 	if (driver) {
 		eeh_pcid_put(dev);
-		if (driver->err_handler)
+		if (removed && driver->err_handler)
 			return NULL;
 	}
 
@@ -425,11 +455,23 @@ static void *eeh_rmv_device(void *data, void *userdata)
 		 pci_name(dev));
 	edev->bus = dev->bus;
 	edev->mode |= EEH_DEV_DISCONNECTED;
-	(*removed)++;
+	if (removed)
+		(*removed)++;
 
-	pci_lock_rescan_remove();
-	pci_stop_and_remove_bus_device(dev);
-	pci_unlock_rescan_remove();
+	if (edev->physfn) {
+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
+		edev->pdev = NULL;
+
+		/*
+		 * We have to mark the VF's PE number as invalid, which
+		 * is required to plug the VF again successfully.
+		 */
+		pdn->pe_number = IODA_INVALID_PE;
+	} else {
+		pci_lock_rescan_remove();
+		pci_stop_and_remove_bus_device(dev);
+		pci_unlock_rescan_remove();
+	}
 
 	return NULL;
 }
@@ -548,6 +590,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
 	struct timeval tstamp;
 	int cnt, rc, removed = 0;
+	struct eeh_dev *edev;
 
 	/* pcibios will clear the counter; save the value */
 	cnt = pe->freeze_count;
@@ -561,12 +604,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 */
 	eeh_pe_state_mark(pe, EEH_PE_KEEP);
 	if (bus) {
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(bus);
-		pci_unlock_rescan_remove();
-	} else if (frozen_bus) {
+		if (pe->type & EEH_PE_VF)
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+		else {
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(bus);
+			pci_unlock_rescan_remove();
+		}
+	} else if (frozen_bus)
 		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
-	}
 
 	/*
 	 * Reset the pci controller. (Asserts RST#; resets config space).
@@ -607,14 +653,22 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 		 * PE. We should disconnect it so the binding can be
 		 * rebuilt when adding PCI devices.
 		 */
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(bus);
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+			pcibios_add_pci_devices(bus);
 	} else if (frozen_bus && removed) {
 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
 		ssleep(5);
 
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(frozen_bus);
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+			pcibios_add_pci_devices(frozen_bus);
 	}
 	eeh_pe_state_clear(pe, EEH_PE_KEEP);
 
@@ -792,11 +846,15 @@ perm_error:
 	 * the their PCI config any more.
 	 */
 	if (frozen_bus) {
-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
-
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(frozen_bus);
-		pci_unlock_rescan_remove();
+		if (pe->type & EEH_PE_VF) {
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+		} else {
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(frozen_bus);
+			pci_unlock_rescan_remove();
+		}
 	}
 }
 
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 260a701..5cde950 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -914,7 +914,8 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
 	if (pe->type & EEH_PE_PHB) {
 		bus = pe->phb->bus;
 	} else if (pe->type & EEH_PE_BUS ||
-		   pe->type & EEH_PE_DEVICE) {
+		   pe->type & EEH_PE_DEVICE ||
+		   pe->type & EEH_PE_VF) {
 		if (pe->bus) {
 			bus = pe->bus;
 			goto out;
-- 
1.7.9.5


* [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
                   ` (9 preceding siblings ...)
  2015-07-17  6:02 ` [PATCH V9 10/11] powerpc/eeh: Support error recovery for VF PE Wei Yang
@ 2015-07-17  6:02 ` Wei Yang
  2015-07-29  3:17   ` Wei Yang
  10 siblings, 1 reply; 22+ messages in thread
From: Wei Yang @ 2015-07-17  6:02 UTC (permalink / raw)
  To: gwshan, bhelgaas, mpe; +Cc: linuxppc-dev, linux-pci, Wei Yang

When the VF BAR size is larger than 64MB, we group VFs in terms of M64 BARs,
which means the VFs in a group should form a compound PE.

In that case, this patch links those VF PEs into a compound PE.
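
The grouping arithmetic used in the diff can be sketched as follows. The
helper names are made up for this illustration, and the value 4 for
M64_PER_IOV is an assumption, since the constant's value is not shown in this
thread. As in the diff, the first VF of each group provides the master PE and
the remaining VFs of the group become slaves.

```c
#include <assert.h>

/* Assumed value of M64_PER_IOV; the thread does not state it. */
#define SKETCH_M64_PER_IOV	4

/* Round up to the next power of two (n >= 1), like roundup_pow_of_two(). */
static unsigned int sketch_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	assert(n >= 1);
	while (p < n)
		p <<= 1;
	return p;
}

/* Number of VFs that share one M64 BAR, i.e. one compound PE. */
static unsigned int sketch_vf_per_group(unsigned int num_vfs)
{
	return sketch_roundup_pow_of_two(num_vfs) / SKETCH_M64_PER_IOV;
}

/* Index of the VF whose PE is the master of the group holding vf_index. */
static unsigned int sketch_master_vf(unsigned int num_vfs,
				     unsigned int vf_index)
{
	unsigned int per_group = sketch_vf_per_group(num_vfs);

	assert(vf_index < num_vfs);
	/* Grouping only applies when there are more VFs than M64 BARs. */
	assert(per_group >= 1);

	return (vf_index / per_group) * per_group;
}
```

With 16 VFs this gives four groups of four, so VF 5 belongs to the compound
PE whose master is VF 4.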

[gwshan: code refactoring for a bit]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   46 +++++++++++++++++++++++++----
 arch/powerpc/platforms/powernv/pci.c      |   17 +++++++++--
 2 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5738d31..d1530cb 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1359,9 +1359,20 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 	}
 
 	list_for_each_entry_safe(pe, pe_n, &phb->ioda.pe_list, list) {
+		struct pnv_ioda_pe *s, *sn;
 		if (pe->parent_dev != pdev)
 			continue;
 
+		if ((pe->flags & PNV_IODA_PE_MASTER) &&
+		    (pe->flags & PNV_IODA_PE_VF)) {
+			list_for_each_entry_safe(s, sn, &pe->slaves, list) {
+				pnv_pci_ioda2_release_dma_pe(pdev, s);
+				list_del(&s->list);
+				pnv_ioda_deconfigure_pe(phb, s);
+				pnv_ioda_free_pe(phb, s->pe_number);
+			}
+		}
+
 		pnv_pci_ioda2_release_dma_pe(pdev, pe);
 
 		/* Remove from list */
@@ -1414,7 +1425,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 	struct pci_bus        *bus;
 	struct pci_controller *hose;
 	struct pnv_phb        *phb;
-	struct pnv_ioda_pe    *pe;
+	struct pnv_ioda_pe    *pe, *master_pe;
 	int                    pe_num;
 	u16                    vf_index;
 	struct pci_dn         *pdn;
@@ -1456,10 +1467,13 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 			continue;
 		}
 
-		/* Put PE to the list */
-		mutex_lock(&phb->ioda.pe_list_mutex);
-		list_add_tail(&pe->list, &phb->ioda.pe_list);
-		mutex_unlock(&phb->ioda.pe_list_mutex);
+		/* Put PE to the list, or postpone it for compound PEs */
+		if ((pdn->m64_per_iov != M64_PER_IOV) ||
+		    (num_vfs <= M64_PER_IOV)) {
+			mutex_lock(&phb->ioda.pe_list_mutex);
+			list_add_tail(&pe->list, &phb->ioda.pe_list);
+			mutex_unlock(&phb->ioda.pe_list_mutex);
+		}
 
 		pnv_pci_ioda2_setup_dma_pe(phb, pe);
 	}
@@ -1472,10 +1486,32 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		vf_per_group = roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
 
 		for (vf_group = 0; vf_group < M64_PER_IOV; vf_group++) {
+			master_pe = NULL;
+
 			for (vf_index = vf_group * vf_per_group;
 			     vf_index < (vf_group + 1) * vf_per_group &&
 			     vf_index < num_vfs;
 			     vf_index++) {
+
+				/*
+				 * Figure out the master PE and put all slave
+				 * PEs to master PE's list.
+				 */
+				pe = &phb->ioda.pe_array[pdn->offset + vf_index];
+				if (!master_pe) {
+					pe->flags |= PNV_IODA_PE_MASTER;
+					INIT_LIST_HEAD(&pe->slaves);
+					master_pe = pe;
+					mutex_lock(&phb->ioda.pe_list_mutex);
+					list_add_tail(&pe->list, &phb->ioda.pe_list);
+					mutex_unlock(&phb->ioda.pe_list_mutex);
+				} else {
+					pe->flags |= PNV_IODA_PE_SLAVE;
+					pe->master = master_pe;
+					list_add_tail(&pe->list,
+						&master_pe->slaves);
+				}
+
 				for (vf_index1 = vf_group * vf_per_group;
 				     vf_index1 < (vf_group + 1) * vf_per_group &&
 				     vf_index1 < num_vfs;
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 0e4f42e..f3aead0 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -739,7 +739,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pnv_phb *phb = hose->private_data;
 #ifdef CONFIG_PCI_IOV
-	struct pnv_ioda_pe *pe;
+	struct pnv_ioda_pe *pe, *slave;
 	struct pci_dn *pdn;
 
 	/* Fix the VF pdn PE number */
@@ -751,10 +751,23 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 			    (pdev->devfn & 0xff))) {
 				pdn->pe_number = pe->pe_number;
 				pe->pdev = pdev;
-				break;
+				goto found;
+			}
+
+			if ((pe->flags & PNV_IODA_PE_MASTER) &&
+			    (pe->flags & PNV_IODA_PE_VF)) {
+				list_for_each_entry(slave, &pe->slaves, list) {
+					if (slave->rid == ((pdev->bus->number << 8)
+					   | (pdev->devfn & 0xff))) {
+						pdn->pe_number = slave->pe_number;
+						slave->pdev = pdev;
+						goto found;
+					}
+				}
 			}
 		}
 	}
+found:
 #endif /* CONFIG_PCI_IOV */
 
 	if (phb && phb->dma_dev_setup)
-- 
1.7.9.5


* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-07-17  6:02 ` [PATCH V9 11/11] powerpc/powernv: compound PE for VFs Wei Yang
@ 2015-07-29  3:17   ` Wei Yang
  2015-09-09  2:48     ` Gavin Shan
  0 siblings, 1 reply; 22+ messages in thread
From: Wei Yang @ 2015-07-29  3:17 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, mpe, linuxppc-dev, linux-pci

Hi, Michael

Hope you haven't taken this yet. We may change this patch a little.

-- 
Richard Yang
Help you, Help me


* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-07-29  3:17   ` Wei Yang
@ 2015-09-09  2:48     ` Gavin Shan
  2015-09-09  3:36       ` Richard Yang
  0 siblings, 1 reply; 22+ messages in thread
From: Gavin Shan @ 2015-09-09  2:48 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, mpe, linuxppc-dev, linux-pci, aik

On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>Hi, Michael
>
>Hope you didn't take this yet. We may change this patch a little.
>

[Cc Alexey, who might be concerned about the SRIOV status]

Richard, do you have a plan to get it upstream? It seems to have been
hanging here for a long time.


* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  2:48     ` Gavin Shan
@ 2015-09-09  3:36       ` Richard Yang
  2015-09-09  3:52         ` Gavin Shan
                           ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Richard Yang @ 2015-09-09  3:36 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Wei Yang, bhelgaas, mpe, linuxppc-dev, linux-pci, aik

On Wed, Sep 09, 2015 at 12:48:21PM +1000, Gavin Shan wrote:
>On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>>Hi, Michael
>>
>>Hope you didn't take this yet. We may change this patch a little.
>>
>
>[Cc Alexey who might concern the SRIOV status]
>
>Richard, do you have plan to get it upstream? It seems it's hanged
>over here for long time.
>

The VF EEH work has been on hold since we redesigned SRIOV. After the
redesign, we don't have VF groups.

My plan is to push the VF EEH patch set after the SRIOV redesign is accepted.

>>On Fri, Jul 17, 2015 at 02:02:41PM +0800, Wei Yang wrote:
>>>When VF BAR size is larger than 64MB, we group VFs in terms of M64 BAR,
>>>which means those VFs in a group should form a compound PE.
>>>
>>>This patch links those VF PEs into compound PE in this case.
>>>
>>>[gwshan: code refactoring for a bit]
>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>>---
>>> arch/powerpc/platforms/powernv/pci-ioda.c |   46 +++++++++++++++++++++++++----
>>> arch/powerpc/platforms/powernv/pci.c      |   17 +++++++++--
>>> 2 files changed, 56 insertions(+), 7 deletions(-)
>>>
>>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>index 5738d31..d1530cb 100644
>>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>@@ -1359,9 +1359,20 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>> 	}
>>>
>>> 	list_for_each_entry_safe(pe, pe_n, &phb->ioda.pe_list, list) {
>>>+		struct pnv_ioda_pe *s, *sn;
>>> 		if (pe->parent_dev != pdev)
>>> 			continue;
>>>
>>>+		if ((pe->flags & PNV_IODA_PE_MASTER) &&
>>>+		    (pe->flags & PNV_IODA_PE_VF)) {
>>>+			list_for_each_entry_safe(s, sn, &pe->slaves, list) {
>>>+				pnv_pci_ioda2_release_dma_pe(pdev, s);
>>>+				list_del(&s->list);
>>>+				pnv_ioda_deconfigure_pe(phb, s);
>>>+				pnv_ioda_free_pe(phb, s->pe_number);
>>>+			}
>>>+		}
>>>+
>>> 		pnv_pci_ioda2_release_dma_pe(pdev, pe);
>>>
>>> 		/* Remove from list */
>>>@@ -1414,7 +1425,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>> 	struct pci_bus        *bus;
>>> 	struct pci_controller *hose;
>>> 	struct pnv_phb        *phb;
>>>-	struct pnv_ioda_pe    *pe;
>>>+	struct pnv_ioda_pe    *pe, *master_pe;
>>> 	int                    pe_num;
>>> 	u16                    vf_index;
>>> 	struct pci_dn         *pdn;
>>>@@ -1456,10 +1467,13 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>> 			continue;
>>> 		}
>>>
>>>-		/* Put PE to the list */
>>>-		mutex_lock(&phb->ioda.pe_list_mutex);
>>>-		list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>-		mutex_unlock(&phb->ioda.pe_list_mutex);
>>>+		/* Put PE to the list, or postpone it for compound PEs */
>>>+		if ((pdn->m64_per_iov != M64_PER_IOV) ||
>>>+		    (num_vfs <= M64_PER_IOV)) {
>>>+			mutex_lock(&phb->ioda.pe_list_mutex);
>>>+			list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>+			mutex_unlock(&phb->ioda.pe_list_mutex);
>>>+		}
>>>
>>> 		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>>> 	}
>>>@@ -1472,10 +1486,32 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>> 		vf_per_group = roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
>>>
>>> 		for (vf_group = 0; vf_group < M64_PER_IOV; vf_group++) {
>>>+			master_pe = NULL;
>>>+
>>> 			for (vf_index = vf_group * vf_per_group;
>>> 			     vf_index < (vf_group + 1) * vf_per_group &&
>>> 			     vf_index < num_vfs;
>>> 			     vf_index++) {
>>>+
>>>+				/*
>>>+				 * Figure out the master PE and put all slave
>>>+				 * PEs to master PE's list.
>>>+				 */
>>>+				pe = &phb->ioda.pe_array[pdn->offset + vf_index];
>>>+				if (!master_pe) {
>>>+					pe->flags |= PNV_IODA_PE_MASTER;
>>>+					INIT_LIST_HEAD(&pe->slaves);
>>>+					master_pe = pe;
>>>+					mutex_lock(&phb->ioda.pe_list_mutex);
>>>+					list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>+					mutex_unlock(&phb->ioda.pe_list_mutex);
>>>+				} else {
>>>+					pe->flags |= PNV_IODA_PE_SLAVE;
>>>+					pe->master = master_pe;
>>>+					list_add_tail(&pe->list,
>>>+						&master_pe->slaves);
>>>+				}
>>>+
>>> 				for (vf_index1 = vf_group * vf_per_group;
>>> 				     vf_index1 < (vf_group + 1) * vf_per_group &&
>>> 				     vf_index1 < num_vfs;
>>>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
>>>index 0e4f42e..f3aead0 100644
>>>--- a/arch/powerpc/platforms/powernv/pci.c
>>>+++ b/arch/powerpc/platforms/powernv/pci.c
>>>@@ -739,7 +739,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>>> 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>>> 	struct pnv_phb *phb = hose->private_data;
>>> #ifdef CONFIG_PCI_IOV
>>>-	struct pnv_ioda_pe *pe;
>>>+	struct pnv_ioda_pe *pe, *slave;
>>> 	struct pci_dn *pdn;
>>>
>>> 	/* Fix the VF pdn PE number */
>>>@@ -751,10 +751,23 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>>> 			    (pdev->devfn & 0xff))) {
>>> 				pdn->pe_number = pe->pe_number;
>>> 				pe->pdev = pdev;
>>>-				break;
>>>+				goto found;
>>>+			}
>>>+
>>>+			if ((pe->flags & PNV_IODA_PE_MASTER) &&
>>>+			    (pe->flags & PNV_IODA_PE_VF)) {
>>>+				list_for_each_entry(slave, &pe->slaves, list) {
>>>+					if (slave->rid == ((pdev->bus->number << 8)
>>>+					   | (pdev->devfn & 0xff))) {
>>>+						pdn->pe_number = slave->pe_number;
>>>+						slave->pdev = pdev;
>>>+						goto found;
>>>+					}
>>>+				}
>>> 			}
>>> 		}
>>> 	}
>>>+found:
>>> #endif /* CONFIG_PCI_IOV */
>>>
>>> 	if (phb && phb->dma_dev_setup)
>>>-- 
>>>1.7.9.5
>>
>>-- 
>>Richard Yang
>>Help you, Help me

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  3:36       ` Richard Yang
@ 2015-09-09  3:52         ` Gavin Shan
  2015-09-09  6:00           ` Richard Yang
  2015-09-09  4:59         ` Benjamin Herrenschmidt
  2015-09-09  5:22         ` Alexey Kardashevskiy
  2 siblings, 1 reply; 22+ messages in thread
From: Gavin Shan @ 2015-09-09  3:52 UTC (permalink / raw)
  To: Richard Yang; +Cc: Gavin Shan, bhelgaas, mpe, linuxppc-dev, linux-pci, aik

On Wed, Sep 09, 2015 at 11:36:16AM +0800, Richard Yang wrote:
>On Wed, Sep 09, 2015 at 12:48:21PM +1000, Gavin Shan wrote:
>>On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>>>Hi, Michael
>>>
>>>Hope you didn't take this yet. We may change this patch a little.
>>>
>>
>>[Cc Alexey who might concern the SRIOV status]
>>
>>Richard, do you have plan to get it upstream? It seems it's hanged
>>over here for long time.
>>
>
>The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
>don't have VF groups.
>

How can the SRIOV redesign patchset affect the EEH part so much? The EEH
VF patchset already supports a VF PE which contains only one VF.

>My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.
>

That SRIOV redesign patchset obviously missed the 4.3 merge window. I think the
code has been reviewed by Alexey and me. If Alexey isn't going to have more
comments about it, you can refresh the series (EEH support for VF) based on
it and send the updated series. I don't think there are any dependencies.

>>>On Fri, Jul 17, 2015 at 02:02:41PM +0800, Wei Yang wrote:
>>>>When VF BAR size is larger than 64MB, we group VFs in terms of M64 BAR,
>>>>which means those VFs in a group should form a compound PE.
>>>>
>>>>This patch links those VF PEs into compound PE in this case.
>>>>
>>>>[gwshan: code refactoring for a bit]
>>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>>Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>>>---
>>>> arch/powerpc/platforms/powernv/pci-ioda.c |   46 +++++++++++++++++++++++++----
>>>> arch/powerpc/platforms/powernv/pci.c      |   17 +++++++++--
>>>> 2 files changed, 56 insertions(+), 7 deletions(-)
>>>>
>>>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>>index 5738d31..d1530cb 100644
>>>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>>@@ -1359,9 +1359,20 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>> 	}
>>>>
>>>> 	list_for_each_entry_safe(pe, pe_n, &phb->ioda.pe_list, list) {
>>>>+		struct pnv_ioda_pe *s, *sn;
>>>> 		if (pe->parent_dev != pdev)
>>>> 			continue;
>>>>
>>>>+		if ((pe->flags & PNV_IODA_PE_MASTER) &&
>>>>+		    (pe->flags & PNV_IODA_PE_VF)) {
>>>>+			list_for_each_entry_safe(s, sn, &pe->slaves, list) {
>>>>+				pnv_pci_ioda2_release_dma_pe(pdev, s);
>>>>+				list_del(&s->list);
>>>>+				pnv_ioda_deconfigure_pe(phb, s);
>>>>+				pnv_ioda_free_pe(phb, s->pe_number);
>>>>+			}
>>>>+		}
>>>>+
>>>> 		pnv_pci_ioda2_release_dma_pe(pdev, pe);
>>>>
>>>> 		/* Remove from list */
>>>>@@ -1414,7 +1425,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>> 	struct pci_bus        *bus;
>>>> 	struct pci_controller *hose;
>>>> 	struct pnv_phb        *phb;
>>>>-	struct pnv_ioda_pe    *pe;
>>>>+	struct pnv_ioda_pe    *pe, *master_pe;
>>>> 	int                    pe_num;
>>>> 	u16                    vf_index;
>>>> 	struct pci_dn         *pdn;
>>>>@@ -1456,10 +1467,13 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>> 			continue;
>>>> 		}
>>>>
>>>>-		/* Put PE to the list */
>>>>-		mutex_lock(&phb->ioda.pe_list_mutex);
>>>>-		list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>>-		mutex_unlock(&phb->ioda.pe_list_mutex);
>>>>+		/* Put PE to the list, or postpone it for compound PEs */
>>>>+		if ((pdn->m64_per_iov != M64_PER_IOV) ||
>>>>+		    (num_vfs <= M64_PER_IOV)) {
>>>>+			mutex_lock(&phb->ioda.pe_list_mutex);
>>>>+			list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>>+			mutex_unlock(&phb->ioda.pe_list_mutex);
>>>>+		}
>>>>
>>>> 		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>>>> 	}
>>>>@@ -1472,10 +1486,32 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>> 		vf_per_group = roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
>>>>
>>>> 		for (vf_group = 0; vf_group < M64_PER_IOV; vf_group++) {
>>>>+			master_pe = NULL;
>>>>+
>>>> 			for (vf_index = vf_group * vf_per_group;
>>>> 			     vf_index < (vf_group + 1) * vf_per_group &&
>>>> 			     vf_index < num_vfs;
>>>> 			     vf_index++) {
>>>>+
>>>>+				/*
>>>>+				 * Figure out the master PE and put all slave
>>>>+				 * PEs to master PE's list.
>>>>+				 */
>>>>+				pe = &phb->ioda.pe_array[pdn->offset + vf_index];
>>>>+				if (!master_pe) {
>>>>+					pe->flags |= PNV_IODA_PE_MASTER;
>>>>+					INIT_LIST_HEAD(&pe->slaves);
>>>>+					master_pe = pe;
>>>>+					mutex_lock(&phb->ioda.pe_list_mutex);
>>>>+					list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>>+					mutex_unlock(&phb->ioda.pe_list_mutex);
>>>>+				} else {
>>>>+					pe->flags |= PNV_IODA_PE_SLAVE;
>>>>+					pe->master = master_pe;
>>>>+					list_add_tail(&pe->list,
>>>>+						&master_pe->slaves);
>>>>+				}
>>>>+
>>>> 				for (vf_index1 = vf_group * vf_per_group;
>>>> 				     vf_index1 < (vf_group + 1) * vf_per_group &&
>>>> 				     vf_index1 < num_vfs;
>>>>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
>>>>index 0e4f42e..f3aead0 100644
>>>>--- a/arch/powerpc/platforms/powernv/pci.c
>>>>+++ b/arch/powerpc/platforms/powernv/pci.c
>>>>@@ -739,7 +739,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>>>> 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>>>> 	struct pnv_phb *phb = hose->private_data;
>>>> #ifdef CONFIG_PCI_IOV
>>>>-	struct pnv_ioda_pe *pe;
>>>>+	struct pnv_ioda_pe *pe, *slave;
>>>> 	struct pci_dn *pdn;
>>>>
>>>> 	/* Fix the VF pdn PE number */
>>>>@@ -751,10 +751,23 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>>>> 			    (pdev->devfn & 0xff))) {
>>>> 				pdn->pe_number = pe->pe_number;
>>>> 				pe->pdev = pdev;
>>>>-				break;
>>>>+				goto found;
>>>>+			}
>>>>+
>>>>+			if ((pe->flags & PNV_IODA_PE_MASTER) &&
>>>>+			    (pe->flags & PNV_IODA_PE_VF)) {
>>>>+				list_for_each_entry(slave, &pe->slaves, list) {
>>>>+					if (slave->rid == ((pdev->bus->number << 8)
>>>>+					   | (pdev->devfn & 0xff))) {
>>>>+						pdn->pe_number = slave->pe_number;
>>>>+						slave->pdev = pdev;
>>>>+						goto found;
>>>>+					}
>>>>+				}
>>>> 			}
>>>> 		}
>>>> 	}
>>>>+found:
>>>> #endif /* CONFIG_PCI_IOV */
>>>>
>>>> 	if (phb && phb->dma_dev_setup)
>>>>-- 
>>>>1.7.9.5
>>>
>>>-- 
>>>Richard Yang
>>>Help you, Help me
>
>-- 
>Richard Yang
>Help you, Help me


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  3:36       ` Richard Yang
  2015-09-09  3:52         ` Gavin Shan
@ 2015-09-09  4:59         ` Benjamin Herrenschmidt
  2015-09-09  5:53           ` Richard Yang
  2015-09-09  5:22         ` Alexey Kardashevskiy
  2 siblings, 1 reply; 22+ messages in thread
From: Benjamin Herrenschmidt @ 2015-09-09  4:59 UTC (permalink / raw)
  To: Richard Yang, Gavin Shan; +Cc: aik, linux-pci, bhelgaas, linuxppc-dev

On Wed, 2015-09-09 at 11:36 +0800, Richard Yang wrote:
> The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
> don't have VF groups.
> 
> My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.

What do you mean that we don't have VF groups?

If we don't have IOMMU groups per VF that means we can't assign them to
KVM partitions -> they are completely useless.

Ben.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  3:36       ` Richard Yang
  2015-09-09  3:52         ` Gavin Shan
  2015-09-09  4:59         ` Benjamin Herrenschmidt
@ 2015-09-09  5:22         ` Alexey Kardashevskiy
  2015-09-09  6:01           ` Richard Yang
  2 siblings, 1 reply; 22+ messages in thread
From: Alexey Kardashevskiy @ 2015-09-09  5:22 UTC (permalink / raw)
  To: Richard Yang, Gavin Shan; +Cc: bhelgaas, mpe, linuxppc-dev, linux-pci

On 09/09/2015 01:36 PM, Richard Yang wrote:
> On Wed, Sep 09, 2015 at 12:48:21PM +1000, Gavin Shan wrote:
>> On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>>> Hi, Michael
>>>
>>> Hope you didn't take this yet. We may change this patch a little.
>>>
>>
>> [Cc Alexey who might concern the SRIOV status]
>>
>> Richard, do you have plan to get it upstream? It seems it's hanged
>> over here for long time.
>>
>
> The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
> don't have VF groups.
>
> My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.


Can you please rebase on v4.2 (or v4.2 + sriov rework) and repost VF EEH 
patchset just to me? Or share the tree somewhere where I can pull it from? 
Thanks.

As for now, I cannot tell what difference your SRIOV patchset actually makes.

-- 
Alexey

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  4:59         ` Benjamin Herrenschmidt
@ 2015-09-09  5:53           ` Richard Yang
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yang @ 2015-09-09  5:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Richard Yang, Gavin Shan, aik, linux-pci, bhelgaas, linuxppc-dev

On Wed, Sep 09, 2015 at 02:59:11PM +1000, Benjamin Herrenschmidt wrote:
>On Wed, 2015-09-09 at 11:36 +0800, Richard Yang wrote:
>> The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
>> don't have VF groups.
>> 
>> My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.
>
>What do you mean that we don't have VF groups?
>

Before the SRIOV redesign, several VFs could share one M64 segment. This
introduced the compound PE for VFs. The "VF group" in my previous mail means
that compound PE: a master VF PE with several slave VF PEs.
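
To make the compound PE layout concrete, here is a minimal standalone sketch
of the grouping. It is not the kernel code. It only follows the loop bounds of
the quoted patch, where vf_per_group is roundup_pow_of_two(num_vfs) divided by
M64_PER_IOV and the first PE in each group becomes the master. The struct
vf_pe type, the flag values, the singly linked slave list, MAX_VFS, and the
value 4 for M64_PER_IOV are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define M64_PER_IOV 4   /* assumed number of groups, named after the patch's constant */
#define MAX_VFS     64  /* hypothetical upper bound for this sketch */

enum { PE_MASTER = 1, PE_SLAVE = 2 };  /* hypothetical flag values */

struct vf_pe {
	int flags;
	struct vf_pe *master;      /* NULL for a master PE */
	struct vf_pe *next_slave;  /* next slave in the same compound PE */
};

/* Smallest power of two that is >= n, for n >= 1. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Link pes[0..num_vfs) into compound PEs: the first PE of each group is
 * the master, the rest are chained to it as slaves.
 * Returns the number of master PEs created.
 */
static int link_compound_pes(struct vf_pe *pes, unsigned int num_vfs)
{
	unsigned int vf_per_group, group, i;
	int masters = 0;

	/* Grouping only applies when there are more VFs than groups. */
	assert(num_vfs > M64_PER_IOV && num_vfs <= MAX_VFS);
	vf_per_group = roundup_pow2(num_vfs) / M64_PER_IOV;

	for (group = 0; group < M64_PER_IOV; group++) {
		struct vf_pe *master = NULL, *tail = NULL;

		for (i = group * vf_per_group;
		     i < (group + 1) * vf_per_group && i < num_vfs; i++) {
			struct vf_pe *pe = &pes[i];

			pe->next_slave = NULL;
			if (!master) {
				pe->flags = PE_MASTER;
				pe->master = NULL;
				master = tail = pe;
				masters++;
			} else {
				pe->flags = PE_SLAVE;
				pe->master = master;
				tail->next_slave = pe;
				tail = pe;
			}
		}
	}
	return masters;
}
```

With 6 VFs the groups hold 2 VFs each, so the last group stays empty and only
three masters exist. With 16 VFs every group has one master and three slaves.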

>If we don't have IOMMU groups per VF that means we can't assign them to
>KVM partitions -> they are completely useless.

Yes, I didn't mean the IOMMU group.

>
>Ben.

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  3:52         ` Gavin Shan
@ 2015-09-09  6:00           ` Richard Yang
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yang @ 2015-09-09  6:00 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Richard Yang, bhelgaas, mpe, linuxppc-dev, linux-pci, aik

On Wed, Sep 09, 2015 at 01:52:32PM +1000, Gavin Shan wrote:
>On Wed, Sep 09, 2015 at 11:36:16AM +0800, Richard Yang wrote:
>>On Wed, Sep 09, 2015 at 12:48:21PM +1000, Gavin Shan wrote:
>>>On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>>>>Hi, Michael
>>>>
>>>>Hope you didn't take this yet. We may change this patch a little.
>>>>
>>>
>>>[Cc Alexey who might concern the SRIOV status]
>>>
>>>Richard, do you have plan to get it upstream? It seems it's hanged
>>>over here for long time.
>>>
>>
>>The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
>>don't have VF groups.
>>
>
>How can this SRIOV redesign patchset affect EEH part greatly? The EEH
>VF patchset already support VF PE which contains only one VF.
>

Yes, the impact is not great. The main difference after the SRIOV redesign is
that the last patch, "powerpc/powernv: compound PE for VFs", will be removed.

>>My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.
>>
>
>That SRIOV redesign patchset missed 4.3 merge window obviously. I think the
>code has been reviewed by Alexey and me. If Alexey isn't going to have more
>comments about it, you can refresh the series (EEH support for VF) based on
>it and send the updated series. I don't think there is any dependencies.
>

The difference is simple, but we can't apply the patch series to upstream
without the last patch in this thread. The current upstream code has the
compound VF PE, while after the SRIOV redesign it doesn't.
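
As a rough illustration of why the compound VF PE matters here: in the quoted
patch only master PEs are put on the PHB list, so a VF lookup by RID has to
search each master's slaves as well. The sketch below is a standalone
approximation, not the kernel code. The struct vf_pe layout, pci_rid() and
find_pe_by_rid() are hypothetical names. Only the RID formula
((bus number << 8) | devfn) is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

struct vf_pe {
	uint16_t rid;              /* requester ID: (bus << 8) | devfn */
	int pe_number;
	struct vf_pe *next;        /* next master PE on the PHB list */
	struct vf_pe *slaves;      /* first slave of a master, else NULL */
	struct vf_pe *next_slave;  /* next slave under the same master */
};

/* Build a RID the same way the patch compares it. */
static uint16_t pci_rid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/*
 * Walk the master PEs on the list. For each one, check the master itself
 * and then its slaves. Returns NULL when no PE owns the RID.
 */
static struct vf_pe *find_pe_by_rid(struct vf_pe *phb_list, uint16_t rid)
{
	struct vf_pe *pe, *s;

	for (pe = phb_list; pe; pe = pe->next) {
		if (pe->rid == rid)
			return pe;
		for (s = pe->slaves; s; s = s->next_slave)
			if (s->rid == rid)
				return s;
	}
	return NULL;
}
```

Without compound PEs every VF PE sits directly on the list, so the inner loop
over slaves is not needed.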

>>>>On Fri, Jul 17, 2015 at 02:02:41PM +0800, Wei Yang wrote:
>>>>>When VF BAR size is larger than 64MB, we group VFs in terms of M64 BAR,
>>>>>which means those VFs in a group should form a compound PE.
>>>>>
>>>>>This patch links those VF PEs into compound PE in this case.
>>>>>
>>>>>[gwshan: code refactoring for a bit]
>>>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>>>Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>>>>---
>>>>> arch/powerpc/platforms/powernv/pci-ioda.c |   46 +++++++++++++++++++++++++----
>>>>> arch/powerpc/platforms/powernv/pci.c      |   17 +++++++++--
>>>>> 2 files changed, 56 insertions(+), 7 deletions(-)
>>>>>
>>>>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>>>index 5738d31..d1530cb 100644
>>>>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>>>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>>>>@@ -1359,9 +1359,20 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>>> 	}
>>>>>
>>>>> 	list_for_each_entry_safe(pe, pe_n, &phb->ioda.pe_list, list) {
>>>>>+		struct pnv_ioda_pe *s, *sn;
>>>>> 		if (pe->parent_dev != pdev)
>>>>> 			continue;
>>>>>
>>>>>+		if ((pe->flags & PNV_IODA_PE_MASTER) &&
>>>>>+		    (pe->flags & PNV_IODA_PE_VF)) {
>>>>>+			list_for_each_entry_safe(s, sn, &pe->slaves, list) {
>>>>>+				pnv_pci_ioda2_release_dma_pe(pdev, s);
>>>>>+				list_del(&s->list);
>>>>>+				pnv_ioda_deconfigure_pe(phb, s);
>>>>>+				pnv_ioda_free_pe(phb, s->pe_number);
>>>>>+			}
>>>>>+		}
>>>>>+
>>>>> 		pnv_pci_ioda2_release_dma_pe(pdev, pe);
>>>>>
>>>>> 		/* Remove from list */
>>>>>@@ -1414,7 +1425,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>>> 	struct pci_bus        *bus;
>>>>> 	struct pci_controller *hose;
>>>>> 	struct pnv_phb        *phb;
>>>>>-	struct pnv_ioda_pe    *pe;
>>>>>+	struct pnv_ioda_pe    *pe, *master_pe;
>>>>> 	int                    pe_num;
>>>>> 	u16                    vf_index;
>>>>> 	struct pci_dn         *pdn;
>>>>>@@ -1456,10 +1467,13 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>>> 			continue;
>>>>> 		}
>>>>>
>>>>>-		/* Put PE to the list */
>>>>>-		mutex_lock(&phb->ioda.pe_list_mutex);
>>>>>-		list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>>>-		mutex_unlock(&phb->ioda.pe_list_mutex);
>>>>>+		/* Put PE to the list, or postpone it for compound PEs */
>>>>>+		if ((pdn->m64_per_iov != M64_PER_IOV) ||
>>>>>+		    (num_vfs <= M64_PER_IOV)) {
>>>>>+			mutex_lock(&phb->ioda.pe_list_mutex);
>>>>>+			list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>>>+			mutex_unlock(&phb->ioda.pe_list_mutex);
>>>>>+		}
>>>>>
>>>>> 		pnv_pci_ioda2_setup_dma_pe(phb, pe);
>>>>> 	}
>>>>>@@ -1472,10 +1486,32 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>>>>> 		vf_per_group = roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
>>>>>
>>>>> 		for (vf_group = 0; vf_group < M64_PER_IOV; vf_group++) {
>>>>>+			master_pe = NULL;
>>>>>+
>>>>> 			for (vf_index = vf_group * vf_per_group;
>>>>> 			     vf_index < (vf_group + 1) * vf_per_group &&
>>>>> 			     vf_index < num_vfs;
>>>>> 			     vf_index++) {
>>>>>+
>>>>>+				/*
>>>>>+				 * Figure out the master PE and put all slave
>>>>>+				 * PEs to master PE's list.
>>>>>+				 */
>>>>>+				pe = &phb->ioda.pe_array[pdn->offset + vf_index];
>>>>>+				if (!master_pe) {
>>>>>+					pe->flags |= PNV_IODA_PE_MASTER;
>>>>>+					INIT_LIST_HEAD(&pe->slaves);
>>>>>+					master_pe = pe;
>>>>>+					mutex_lock(&phb->ioda.pe_list_mutex);
>>>>>+					list_add_tail(&pe->list, &phb->ioda.pe_list);
>>>>>+					mutex_unlock(&phb->ioda.pe_list_mutex);
>>>>>+				} else {
>>>>>+					pe->flags |= PNV_IODA_PE_SLAVE;
>>>>>+					pe->master = master_pe;
>>>>>+					list_add_tail(&pe->list,
>>>>>+						&master_pe->slaves);
>>>>>+				}
>>>>>+
>>>>> 				for (vf_index1 = vf_group * vf_per_group;
>>>>> 				     vf_index1 < (vf_group + 1) * vf_per_group &&
>>>>> 				     vf_index1 < num_vfs;
>>>>>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
>>>>>index 0e4f42e..f3aead0 100644
>>>>>--- a/arch/powerpc/platforms/powernv/pci.c
>>>>>+++ b/arch/powerpc/platforms/powernv/pci.c
>>>>>@@ -739,7 +739,7 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>>>>> 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>>>>> 	struct pnv_phb *phb = hose->private_data;
>>>>> #ifdef CONFIG_PCI_IOV
>>>>>-	struct pnv_ioda_pe *pe;
>>>>>+	struct pnv_ioda_pe *pe, *slave;
>>>>> 	struct pci_dn *pdn;
>>>>>
>>>>> 	/* Fix the VF pdn PE number */
>>>>>@@ -751,10 +751,23 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
>>>>> 			    (pdev->devfn & 0xff))) {
>>>>> 				pdn->pe_number = pe->pe_number;
>>>>> 				pe->pdev = pdev;
>>>>>-				break;
>>>>>+				goto found;
>>>>>+			}
>>>>>+
>>>>>+			if ((pe->flags & PNV_IODA_PE_MASTER) &&
>>>>>+			    (pe->flags & PNV_IODA_PE_VF)) {
>>>>>+				list_for_each_entry(slave, &pe->slaves, list) {
>>>>>+					if (slave->rid == ((pdev->bus->number << 8)
>>>>>+					   | (pdev->devfn & 0xff))) {
>>>>>+						pdn->pe_number = slave->pe_number;
>>>>>+						slave->pdev = pdev;
>>>>>+						goto found;
>>>>>+					}
>>>>>+				}
>>>>> 			}
>>>>> 		}
>>>>> 	}
>>>>>+found:
>>>>> #endif /* CONFIG_PCI_IOV */
>>>>>
>>>>> 	if (phb && phb->dma_dev_setup)
>>>>>-- 
>>>>>1.7.9.5
>>>>
>>>>-- 
>>>>Richard Yang
>>>>Help you, Help me

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  5:22         ` Alexey Kardashevskiy
@ 2015-09-09  6:01           ` Richard Yang
  2015-09-17  0:28             ` Gavin Shan
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Yang @ 2015-09-09  6:01 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Richard Yang, Gavin Shan, bhelgaas, mpe, linuxppc-dev, linux-pci

On Wed, Sep 09, 2015 at 03:22:08PM +1000, Alexey Kardashevskiy wrote:
>On 09/09/2015 01:36 PM, Richard Yang wrote:
>>On Wed, Sep 09, 2015 at 12:48:21PM +1000, Gavin Shan wrote:
>>>On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>>>>Hi, Michael
>>>>
>>>>Hope you didn't take this yet. We may change this patch a little.
>>>>
>>>
>>>[Cc Alexey who might concern the SRIOV status]
>>>
>>>Richard, do you have plan to get it upstream? It seems it's hanged
>>>over here for long time.
>>>
>>
>>The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
>>don't have VF groups.
>>
>>My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.
>
>
>Can you please rebase on v4.2 (or v4.2 + sriov rework) and repost VF EEH
>patchset just to me? Or share the tree somewhere where I can pull it from?
>Thanks.
>

Yep, this is what I am planning to do.

>As for now, I cannot tell what difference your SRIOV patchset actually makes.
>
>-- 
>Alexey

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH V9 11/11] powerpc/powernv: compound PE for VFs
  2015-09-09  6:01           ` Richard Yang
@ 2015-09-17  0:28             ` Gavin Shan
  0 siblings, 0 replies; 22+ messages in thread
From: Gavin Shan @ 2015-09-17  0:28 UTC (permalink / raw)
  To: Richard Yang
  Cc: Alexey Kardashevskiy, Gavin Shan, bhelgaas, mpe, linuxppc-dev, linux-pci

On Wed, Sep 09, 2015 at 02:01:29PM +0800, Richard Yang wrote:
>On Wed, Sep 09, 2015 at 03:22:08PM +1000, Alexey Kardashevskiy wrote:
>>On 09/09/2015 01:36 PM, Richard Yang wrote:
>>>On Wed, Sep 09, 2015 at 12:48:21PM +1000, Gavin Shan wrote:
>>>>On Wed, Jul 29, 2015 at 11:17:18AM +0800, Wei Yang wrote:
>>>>>Hi, Michael
>>>>>
>>>>>Hope you didn't take this yet. We may change this patch a little.
>>>>>
>>>>
>>>>[Cc Alexey who might concern the SRIOV status]
>>>>
>>>>Richard, do you have plan to get it upstream? It seems it's hanged
>>>>over here for long time.
>>>>
>>>
>>>The VF EEH is hung since we re-designed the SRIOV. After the re-design, we
>>>don't have VF groups.
>>>
>>>My plan is to push the VF EEH patch set after the SRIOV Redesign is accepted.
>>
>>
>>Can you please rebase on v4.2 (or v4.2 + sriov rework) and repost VF EEH
>>patchset just to me? Or share the tree somewhere where I can pull it from?
>>Thanks.
>>
>
>Yep, this is what I am planning to do.
>

Can you rebase your patchset on v4.3-rc1 + sriov rework and then repost it to
the linuxppc-dev mailing list?


>>As for now, I cannot tell what difference your SRIOV patchset actually makes.
>>
>>-- 
>>Alexey
>
>-- 
>Richard Yang
>Help you, Help me


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-09-17  0:29 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-17  6:02 [PATCH V9 00/11] VF EEH on Power8 Wei Yang
2015-07-17  6:02 ` [PATCH V9 01/11] PCI/IOV: Rename and export virtfn_add/virtfn_remove Wei Yang
2015-07-17  6:02 ` [PATCH V9 02/11] PCI: Add pcibios_bus_add_device() weak function Wei Yang
2015-07-17  6:02 ` [PATCH V9 03/11] powerpc/pci: Cache VF index in pci_dn Wei Yang
2015-07-17  6:02 ` [PATCH V9 04/11] powerpc/pci: Remove VFs prior to PF Wei Yang
2015-07-17  6:02 ` [PATCH V9 05/11] powerpc/eeh: Cache only BARs, not windows or IOV BARs Wei Yang
2015-07-17  6:02 ` [PATCH V9 06/11] powerpc/powernv: EEH device for VF Wei Yang
2015-07-17  6:02 ` [PATCH V9 07/11] powerpc/eeh: Create PE for VFs Wei Yang
2015-07-17  6:02 ` [PATCH V9 08/11] powerpc/powernv: Support EEH reset for VF PE Wei Yang
2015-07-17  6:02 ` [PATCH V9 09/11] powerpc/powernv: Support PCI config restore for VFs Wei Yang
2015-07-17  6:02 ` [PATCH V9 10/11] powerpc/eeh: Support error recovery for VF PE Wei Yang
2015-07-17  6:02 ` [PATCH V9 11/11] powerpc/powernv: compound PE for VFs Wei Yang
2015-07-29  3:17   ` Wei Yang
2015-09-09  2:48     ` Gavin Shan
2015-09-09  3:36       ` Richard Yang
2015-09-09  3:52         ` Gavin Shan
2015-09-09  6:00           ` Richard Yang
2015-09-09  4:59         ` Benjamin Herrenschmidt
2015-09-09  5:53           ` Richard Yang
2015-09-09  5:22         ` Alexey Kardashevskiy
2015-09-09  6:01           ` Richard Yang
2015-09-17  0:28             ` Gavin Shan
