linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 0/9] VF EEH on Power8
@ 2015-05-04  7:07 Wei Yang
  2015-05-04  7:07 ` [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove Wei Yang
                   ` (8 more replies)
  0 siblings, 9 replies; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

This patchset enables EEH on SRIOV VFs. The general idea is to create proper
VF edev and VF PE and handle them properly.

Different from the Bus PE, VF PE just contain one VF. This introduces the
difference of EEH error handling on a VF PE. Generally, it has several
differences.

First, the VF's removal and re-enumerate rely on its PF. VF has a tight
relationship between its PF. This is not proper to enumerate a VF by usual
scan procedure. That's why virtfn_add/virtfn_remove are exported in this patch
set.

Second, the reset/restore of a VF is done in kernel space. FW is not aware of
the VF, this means the usual reset function done in FW will not work. One of
the patch will imitate the reset/restore function in kernel space.

Third, the VF may be removed during the PF's error_detected function. In this
case, the original error_detected->slot_reset->resume sequence is not proper
to those removed VFs, since they are re-created by PF in a fresh state. A flag
in eeh_dev is introduce to mark the eeh_dev is in error state. By doing so, we
track whether this device needs to be reset or not.

This has been tested both on host and in guest on Power8 with latest kernel
version.

v3:
   * add back vf_index in pci_dn to track the VF's index
   * rename ppdev in eeh_dev to physfn for consistency
   * move edev->physfn assignment before dev->dev.archdata.edev is set
   * move pnv_pci_fixup_vf_eeh() and pnv_pci_fixup_vf_caps() to eeh-powernv.c
   * more clear and detail in commit log and comment in code
   * merge eeh_rmv_virt_device() with eeh_rmv_device()
   * move the cfg_blocked check logic from pnv_eeh_read/write_config() to
     pnv_eeh_cfg_blocked()
   * move the vf reset/restore logic into its own patch, two patches are
     created.
     powerpc/powernv: Support PCI config restore for VFs
     powerpc/powernv: Support EEH reset for VFs
   * simplify the vf reset logic
v2:
   * add prefix pci_iov_ to virtfn_add/virtfn_remove
   * use EEH_DEV_VF as a flag for a VF's eeh_dev
   * use eeh_dev instead of edev in change log
   * remove vf_index in eeh_dev, calculate it from pdn->busno and devfn
   * do eeh_add_device_late() and eeh_sysfs_add_device() both after pci_dev is
     well initialized
   * do FLR to reset a VF PE
   * imitate the restore function in FW for VF
   * remove the reverse order patch, since it is still under discussion

Wei Yang (9):
  pci/iov: rename and export virtfn_add/virtfn_remove
  powerpc/pci_dn: cache vf_index in pci_dn
  powerpc/pci: remove PCI devices in reverse order
  powerpc/eeh: cache address range just for normal device
  powerpc/eeh: create EEH_PE_VF for VF PE
  powerpc/powernv: create/release eeh_dev for VF
  powerpc/powernv: Support EEH reset for VFs
  powerpc/powernv: Support PCI config restore for VFs
  powerpc/eeh: handle VF PE properly

 arch/powerpc/include/asm/eeh.h               |   10 ++
 arch/powerpc/include/asm/pci-bridge.h        |    2 +
 arch/powerpc/kernel/eeh.c                    |    5 +
 arch/powerpc/kernel/eeh_cache.c              |    2 +-
 arch/powerpc/kernel/eeh_dev.c                |   20 +++
 arch/powerpc/kernel/eeh_driver.c             |  108 +++++++++++---
 arch/powerpc/kernel/eeh_pe.c                 |   15 +-
 arch/powerpc/kernel/pci-hotplug.c            |    2 +-
 arch/powerpc/kernel/pci_dn.c                 |   12 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c |  200 +++++++++++++++++++++++++-
 drivers/pci/iov.c                            |   10 +-
 include/linux/pci.h                          |    2 +
 12 files changed, 352 insertions(+), 36 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-11  2:13   ` Gavin Shan
  2015-05-04  7:07 ` [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn Wei Yang
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

During the EEH procedure, when a device's driver is not EEH aware or no
driver is binded with a device, EEH core would do hotplug on this devices.
While it isn't feasible for a VF with usual hotplug procedure. During
removal of a VF, virt_bus should be removed if necessary. During the
re-creation, the pci_scan_slot() doesn't work on a VF.

This patch exports two functions to handle the hotplug case for VF
properly. They will be invoked when the EEH core do the hotplug case for
VFs.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 drivers/pci/iov.c   |   10 +++++-----
 include/linux/pci.h |    2 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ee0ebff..cc941dd 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -108,7 +108,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
 }
 
-static int virtfn_add(struct pci_dev *dev, int id, int reset)
+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset)
 {
 	int i;
 	int rc = -ENOMEM;
@@ -183,7 +183,7 @@ failed:
 	return rc;
 }
 
-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset)
 {
 	char buf[VIRTFN_ID_LEN];
 	struct pci_dev *virtfn;
@@ -320,7 +320,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	}
 
 	for (i = 0; i < initial; i++) {
-		rc = virtfn_add(dev, i, 0);
+		rc = pci_iov_virtfn_add(dev, i, 0);
 		if (rc)
 			goto failed;
 	}
@@ -332,7 +332,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 failed:
 	for (j = 0; j < i; j++)
-		virtfn_remove(dev, j, 0);
+		pci_iov_virtfn_remove(dev, j, 0);
 
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
 	pci_cfg_access_lock(dev);
@@ -361,7 +361,7 @@ static void sriov_disable(struct pci_dev *dev)
 		return;
 
 	for (i = 0; i < iov->num_VFs; i++)
-		virtfn_remove(dev, i, 0);
+		pci_iov_virtfn_remove(dev, i, 0);
 
 	pcibios_sriov_disable(dev);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 353db8d..94bacfa 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1679,6 +1679,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset);
+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset);
 int pci_num_vf(struct pci_dev *dev);
 int pci_vfs_assigned(struct pci_dev *dev);
 int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
  2015-05-04  7:07 ` [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-11  2:21   ` Gavin Shan
  2015-05-04  7:07 ` [PATCH V3 3/9] powerpc/pci: remove PCI devices in reverse order Wei Yang
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

This patch caches the index of a VF in its PF in pci_dn.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h |    1 +
 arch/powerpc/kernel/pci_dn.c          |    5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 1811c44..9582aa2 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -199,6 +199,7 @@ struct pci_dn {
 #ifdef CONFIG_PCI_IOV
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled*/
+	int     vf_index;		/* Index to PF for VF dev */
 	int     offset;			/* PE# for the first VF PE */
 #define M64_PER_IOV 4
 	int     m64_per_iov;
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index b3b4df9..bf0fb873 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -138,7 +138,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
 
 #ifdef CONFIG_PCI_IOV
 static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
-					   struct pci_dev *pdev,
+					   struct pci_dev *pdev, int vf_index,
 					   int busno, int devfn)
 {
 	struct pci_dn *pdn;
@@ -157,6 +157,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 	pdn->parent = parent;
 	pdn->busno = busno;
 	pdn->devfn = devfn;
+	pdn->vf_index = vf_index;
 #ifdef CONFIG_PPC_POWERNV
 	pdn->pe_number = IODA_INVALID_PE;
 #endif
@@ -196,7 +197,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 		return NULL;
 
 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
-		pdn = add_one_dev_pci_data(parent, NULL,
+		pdn = add_one_dev_pci_data(parent, NULL, i,
 					   pci_iov_virtfn_bus(pdev, i),
 					   pci_iov_virtfn_devfn(pdev, i));
 		if (!pdn) {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 3/9] powerpc/pci: remove PCI devices in reverse order
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
  2015-05-04  7:07 ` [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove Wei Yang
  2015-05-04  7:07 ` [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-04  7:07 ` [PATCH V3 4/9] powerpc/eeh: cache address range just for normal device Wei Yang
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

As commit ac205b7b ("PCI: make sriov work with hotplug remove") indicates,
when removing PCI devices on a bus which has VFs, we need to remove them
in the reverse order.

This patch applies this pattern to the hotplug removal code for the powerpc
arch.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/pci-hotplug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 7ed85a6..98f84ed 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -50,7 +50,7 @@ void pcibios_remove_pci_devices(struct pci_bus *bus)
 
 	pr_debug("PCI: Removing devices on bus %04x:%02x\n",
 		 pci_domain_nr(bus),  bus->number);
-	list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
+	list_for_each_entry_safe_reverse(dev, tmp, &bus->devices, bus_list) {
 		pr_debug("   Removing %s...\n", pci_name(dev));
 		pci_stop_and_remove_bus_device(dev);
 	}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 4/9] powerpc/eeh: cache address range just for normal device
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
                   ` (2 preceding siblings ...)
  2015-05-04  7:07 ` [PATCH V3 3/9] powerpc/pci: remove PCI devices in reverse order Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-04  7:07 ` [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE Wei Yang
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

The address cache is used to find the related eeh_dev for a given MMIO
address.  From the definition of pci_dev.resource[], it keeps MMIO address
in following order: 6 normal BAR, ROM BAR, 6 IOV BAR, 4 Bridge window.

In the address cache, first it doesn't cache bridge device, second the IOV
BAR range should map to their own VFs separately. This means it just need
to cache the first 7 BARs for a normal device.

This patch restricts the address cache to save the first 7 BARs for a pci
device.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_cache.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index a1e86e1..f0ce2a3 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -196,7 +196,7 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev *dev)
 	}
 
 	/* Walk resources on this device, poke them into the tree */
-	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
 		resource_size_t start = pci_resource_start(dev,i);
 		resource_size_t end = pci_resource_end(dev,i);
 		unsigned long flags = pci_resource_flags(dev,i);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
                   ` (3 preceding siblings ...)
  2015-05-04  7:07 ` [PATCH V3 4/9] powerpc/eeh: cache address range just for normal device Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-11  2:37   ` Gavin Shan
  2015-05-04  7:07 ` [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF Wei Yang
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

On powernv platform, VF PE is a special PE which is different from the Bus
PE.  On the EEH side, it needs a corresponding concept to handle the VF PE
properly. For example, we need to create VF PE when VF's pci_dev is
initialized in kernel. And add a flag to mark it is a VF PF.

This patch introduces the EEH_PE_VF type for VF PE and creates it for a VF.
At the mean time, it creates the sysfs and address cache for VF PE.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |    1 +
 arch/powerpc/kernel/eeh_pe.c                 |   12 ++++++++++--
 arch/powerpc/platforms/powernv/eeh-powernv.c |   12 ++++++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index a52db28..56e8cd9 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -70,6 +70,7 @@ struct pci_dn;
 #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
 #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
 #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
 
 #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
 #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 35f0b62..edfe63a 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -299,7 +299,12 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
 	 * EEH device already having associated PE, but
 	 * the direct parent EEH device doesn't have yet.
 	 */
-	pdn = pdn ? pdn->parent : NULL;
+#ifdef CONFIG_PCI_IOV
+	if (edev->mode & EEH_DEV_VF)
+		pdn = pci_get_pdn(edev->physfn);
+	else
+#endif
+		pdn = pdn ? pdn->parent : NULL;
 	while (pdn) {
 		/* We're poking out of PCI territory */
 		parent = pdn_to_eeh_dev(pdn);
@@ -382,7 +387,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	}
 
 	/* Create a new EEH PE */
-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
+	if (edev->mode & EEH_DEV_VF)
+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
+	else
+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
 	if (!pe) {
 		pr_err("%s: out of memory!\n", __func__);
 		return -ENOMEM;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 622f08c..5447481 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1540,3 +1540,15 @@ static int __init eeh_powernv_init(void)
 	return ret;
 }
 machine_early_initcall(powernv, eeh_powernv_init);
+
+#ifdef CONFIG_PCI_IOV
+static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
+{
+	/* sysfs files should only be added after devices are added */
+	if (pdev->is_virtfn) {
+		eeh_add_device_late(pdev);
+		eeh_sysfs_add_device(pdev);
+	}
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
+#endif /* CONFIG_PCI_IOV */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
                   ` (4 preceding siblings ...)
  2015-05-04  7:07 ` [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-11  2:48   ` Gavin Shan
  2015-05-04  7:07 ` [PATCH V3 7/9] powerpc/powernv: Support EEH reset for VFs Wei Yang
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

EEH on powerpc platform needs eeh_dev structure to track the pci device
status. Since VFs are created/released dynamically, VF's eeh_dev is also
dynamically created/released in system.

This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
and marks it with EEH_DEV_VF type.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h |    7 +++++++
 arch/powerpc/kernel/eeh.c      |    4 ++++
 arch/powerpc/kernel/eeh_dev.c  |   20 ++++++++++++++++++++
 arch/powerpc/kernel/pci_dn.c   |    7 +++++++
 4 files changed, 38 insertions(+)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 56e8cd9..2067de4 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -124,6 +124,7 @@ static inline bool eeh_pe_passed(struct eeh_pe *pe)
 #define EEH_DEV_NO_HANDLER	(1 << 8)	/* No error handler	*/
 #define EEH_DEV_SYSFS		(1 << 9)	/* Sysfs created	*/
 #define EEH_DEV_REMOVED		(1 << 10)	/* Removed permanently	*/
+#define EEH_DEV_VF		(1 << 11)	/* VF port		*/
 
 struct eeh_dev {
 	int mode;			/* EEH mode			*/
@@ -139,6 +140,9 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+#ifdef CONFIG_PCI_IOV
+	struct pci_dev *physfn;		/* Associated PF PORT		*/
+#endif /* CONFIG_PCI_IOV */
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
 
@@ -273,6 +277,7 @@ const char *eeh_pe_loc_get(struct eeh_pe *pe);
 struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
 
 void *eeh_dev_init(struct pci_dn *pdn, void *data);
+void eeh_dev_remove(struct pci_dn *pdn);
 void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
 int eeh_init(void);
 int __init eeh_ops_register(struct eeh_ops *ops);
@@ -328,6 +333,8 @@ static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
 	return NULL;
 }
 
+void eeh_dev_remove(struct pci_dn *pdn) { }
+
 static inline void eeh_dev_phb_init_dynamic(struct pci_controller *phb) { }
 
 static inline int eeh_check_failure(const volatile void __iomem *token)
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 6c7ce1b..221e280 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1135,6 +1135,10 @@ void eeh_add_device_late(struct pci_dev *dev)
 	}
 
 	edev->pdev = dev;
+#ifdef CONFIG_PCI_IOV
+	if (dev->is_virtfn)
+		edev->physfn = dev->physfn;
+#endif
 	dev->dev.archdata.edev = edev;
 
 	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index aabba94..ab88e61 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -72,6 +72,26 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
 }
 
 /**
+ * eeh_dev_remove - Release EEH device according to OF node
+ * @pdn: PCI device node
+ */
+void eeh_dev_remove(struct pci_dn *pdn)
+{
+	struct eeh_dev *edev;
+
+	if (!pdn)
+		return;
+
+	edev = pdn_to_eeh_dev(pdn);
+	if(!edev)
+		return;
+
+	eeh_rmv_from_parent_pe(edev);
+	kfree(edev);
+	pdn->edev = NULL;
+}
+
+/**
  * eeh_dev_phb_init_dynamic - Create EEH devices for devices included in PHB
  * @phb: PHB
  *
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index bf0fb873..a35f865 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -179,7 +179,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 {
 #ifdef CONFIG_PCI_IOV
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pci_dn *parent, *pdn;
+	struct eeh_dev *edev;
 	int i;
 
 	/* Only support IOV for now */
@@ -205,6 +207,9 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 				 __func__, i);
 			return NULL;
 		}
+		eeh_dev_init(pdn, hose);
+		edev = pdn_to_eeh_dev(pdn);
+		edev->mode |= EEH_DEV_VF;
 	}
 #endif /* CONFIG_PCI_IOV */
 
@@ -257,6 +262,8 @@ void remove_dev_pci_data(struct pci_dev *pdev)
 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
 				continue;
 
+			eeh_dev_remove(pdn);
+
 			if (!list_empty(&pdn->list))
 				list_del(&pdn->list);
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 7/9] powerpc/powernv: Support EEH reset for VFs
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
                   ` (5 preceding siblings ...)
  2015-05-04  7:07 ` [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-11  3:03   ` Gavin Shan
  2015-05-04  7:07 ` [PATCH V3 8/9] powerpc/powernv: Support PCI config restore " Wei Yang
  2015-05-04  7:07 ` [PATCH V3 9/9] powerpc/eeh: handle VF PE properly Wei Yang
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

Before VF PE introduced, there isn't a method to reset an individual pci
function. And since FW is not aware of the VF, the VF's reset should be
done in kernel.

This patch introduce a pnv_eeh_vf_pe_reset() to do the flr or af_flr to a
VF.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |  111 +++++++++++++++++++++++++-
 2 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 2067de4..78c8bec 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -135,6 +135,7 @@ struct eeh_dev {
 	int pcix_cap;			/* Saved PCIx capability	*/
 	int pcie_cap;			/* Saved PCIe capability	*/
 	int aer_cap;			/* Saved AER capability		*/
+	int af_cap;			/* Saved AF capability		*/
 	struct eeh_pe *pe;		/* Associated PE		*/
 	struct list_head list;		/* Form link list in the PE	*/
 	struct pci_controller *phb;	/* Associated PHB		*/
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 5447481..1ad322f 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -402,6 +402,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
 	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
 	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
+	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
@@ -891,6 +892,105 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return 0;
 }
 
+static int pnv_pci_wait_for_pending(struct pci_dn *pdn, int pos, u16 mask)
+{
+	int i;
+
+	/* Wait for Transaction Pending bit clean */
+	for (i = 0; i < 4; i++) {
+		u32 status;
+		if (i)
+			msleep((1 << (i - 1)) * 100);
+
+		eeh_ops->read_config(pdn, pos, 2, &status);
+		if (!(status & mask))
+			return 1;
+	}
+
+	return 0;
+}
+
+static int pnv_eeh_do_flr(struct pci_dn *pdn)
+{
+	u32 cap;
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+
+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &cap);
+	if (!(cap & PCI_EXP_DEVCAP_FLR))
+		return -ENOTTY;
+
+	if (!pnv_pci_wait_for_pending(pdn, edev->pcie_cap + PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_TRPND))
+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending transaction; performing function level reset anyway\n",
+			edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+
+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, &cap);
+	cap |= PCI_EXP_DEVCTL_BCR_FLR;
+	eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, cap);
+	msleep(100);
+	return 0;
+}
+
+static int pnv_eeh_do_af_flr(struct pci_dn *pdn)
+{
+	u32 cap;
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+
+	if (!edev->af_cap)
+		return -ENOTTY;
+
+	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
+	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
+		return -ENOTTY;
+
+	/*
+	 * Wait for Transaction Pending bit to clear.  A word-aligned test
+	 * is used, so we use the conrol offset rather than status and shift
+	 * the test bit to match.
+	 */
+	if (!pnv_pci_wait_for_pending(pdn, edev->af_cap + PCI_AF_CTRL,
+				 PCI_AF_STATUS_TP << 8))
+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending transaction; performing AF function level reset anyway\n",
+			edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+
+	eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, PCI_AF_CTRL_FLR);
+	msleep(100);
+	return 0;
+}
+
+static int pnv_eeh_reset_vf(struct pci_dn *pdn)
+{
+	int rc;
+
+	might_sleep();
+
+	rc = pnv_eeh_do_flr(pdn);
+	if (rc != -ENOTTY)
+		goto done;
+
+	rc = pnv_eeh_do_af_flr(pdn);
+	if (rc != -ENOTTY)
+		goto done;
+
+done:
+	return rc;
+}
+
+static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
+{
+	struct eeh_dev *edev, *tmp;
+	struct pci_dn *pdn;
+	int ret = 0;
+
+	eeh_pe_for_each_dev(pe, edev, tmp) {
+		pdn = eeh_dev_to_pdn(edev);
+		ret |= pnv_eeh_reset_vf(pdn);
+	}
+
+	return ret;
+}
+
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	struct pci_controller *hose;
@@ -966,7 +1066,9 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 		}
 
 		bus = eeh_pe_bus_get(pe);
-		if (pci_is_root_bus(bus) ||
+		if (pe->type & EEH_PE_VF)
+			ret = pnv_eeh_vf_pe_reset(pe, option);
+		else if (pci_is_root_bus(bus) ||
 			pci_is_root_bus(bus->parent))
 			ret = pnv_eeh_root_reset(hose, option);
 		else
@@ -1106,6 +1208,13 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
 	if (!edev || !edev->pe)
 		return false;
 
+	/*
+	 * For VF's reset operation, we need to rely on the kernel to
+	 * do those pci read/write, since FW isn't aware of VFs.
+	 */
+	if ((edev->mode & EEH_DEV_VF) && (edev->pe->state & EEH_PE_RESET))
+		return true;
+
 	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
 		return true;
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 8/9] powerpc/powernv: Support PCI config restore for VFs
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
                   ` (6 preceding siblings ...)
  2015-05-04  7:07 ` [PATCH V3 7/9] powerpc/powernv: Support EEH reset for VFs Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-11  4:22   ` Gavin Shan
  2015-05-04  7:07 ` [PATCH V3 9/9] powerpc/eeh: handle VF PE properly Wei Yang
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

Since FW is not aware of VFs, the restore action for VF should be done in
kernel.

This patch introduces pnv_eeh_vf_restore_config() for VF.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h        |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |   77 +++++++++++++++++++++++++-
 2 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 9582aa2..de55ef6 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -205,6 +205,7 @@ struct pci_dn {
 	int     m64_per_iov;
 #define IODA_INVALID_M64        (-1)
 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+	int	mps;
 #endif /* CONFIG_PCI_IOV */
 #endif
 	struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 1ad322f..6ba6d87 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1589,6 +1589,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 	return ret;
 }
 
+#ifdef CONFIG_PCI_IOV
+static int pnv_eeh_vf_restore_config(struct pci_dn *pdn)
+{
+	int pcie_cap, aer_cap, old_mps;
+	u32 devctl, cmd, cap2, aer_capctl;
+
+	/* Restore MPS */
+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
+	if (pcie_cap) {
+		old_mps = (ffs(pdn->mps) - 8) << 5;
+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
+		devctl |= old_mps;
+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
+	}
+
+	/* Disable Completion Timeout */
+	if (pcie_cap) {
+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
+		if (cap2 & 0x10) {
+			pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
+			cap2 |= 0x10;
+			pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
+		}
+	}
+
+	/* Enable SERR and parity checking */
+	pnv_pci_cfg_read(pdn, PCI_COMMAND, 2, &cmd);
+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
+	pnv_pci_cfg_write(pdn, PCI_COMMAND, 2, cmd);
+
+	/* Enable report various errors */
+	if (pcie_cap) {
+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_CERE;
+		devctl |= (PCI_EXP_DEVCTL_NFERE |
+			   PCI_EXP_DEVCTL_FERE |
+			   PCI_EXP_DEVCTL_URRE);
+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
+	}
+
+	/* Enable ECRC generation and check */
+	if (pcie_cap) {
+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
+		pnv_pci_cfg_read(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
+		pnv_pci_cfg_write(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
+	}
+
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static int pnv_eeh_restore_config(struct pci_dn *pdn)
 {
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -1599,7 +1652,13 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
 		return -EEXIST;
 
 	phb = edev->phb->private_data;
-	ret = opal_pci_reinit(phb->opal_id,
+#ifdef CONFIG_PCI_IOV
+	/* FW is not VF aware, we rely on OS to restore it */
+	if (edev->mode & EEH_DEV_VF)
+		ret = pnv_eeh_vf_restore_config(pdn);
+	else
+#endif
+		ret = opal_pci_reinit(phb->opal_id,
 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
 	if (ret) {
 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
@@ -1660,4 +1719,20 @@ static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
 	}
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
+
+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
+{
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	int parent_mps;
+
+	if (!pdev->is_virtfn)
+		return;
+
+	/* Synchronize MPS for VF and PF */
+	parent_mps = pcie_get_mps(pdev->physfn);
+	if ((128 << pdev->pcie_mpss) >= parent_mps)
+		pcie_set_mps(pdev, parent_mps);
+	pdn->mps = pcie_get_mps(pdev);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);
 #endif /* CONFIG_PCI_IOV */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V3 9/9] powerpc/eeh: handle VF PE properly
  2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
                   ` (7 preceding siblings ...)
  2015-05-04  7:07 ` [PATCH V3 8/9] powerpc/powernv: Support PCI config restore " Wei Yang
@ 2015-05-04  7:07 ` Wei Yang
  2015-05-13  1:16   ` Gavin Shan
  8 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-04  7:07 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

Compared with Bus PE, VF PE just has one single pci function. This
introduces the difference of error handling on a VF PE.

For example in the hotplug case, EEH needs to remove and re-create the VF
properly. In the case when PF's error_detected() disable SRIOV, this patch
introduces a flag to mark the eeh_dev of a VF to avoid the slot_reset() and
resume(). Since the FW is not ware of the VF, this patch handles the VF
restore/reset in kernel directly.

This patch is to handle the VF PE properly in these cases.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h   |    1 +
 arch/powerpc/kernel/eeh.c        |    1 +
 arch/powerpc/kernel/eeh_driver.c |  108 ++++++++++++++++++++++++++++++--------
 arch/powerpc/kernel/eeh_pe.c     |    3 +-
 4 files changed, 90 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 78c8bec..43e8a24 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -141,6 +141,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	int    in_error;		/* Error flag for eeh_dev	*/
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *physfn;		/* Associated PF PORT		*/
 #endif /* CONFIG_PCI_IOV */
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 221e280..077c3d1 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1226,6 +1226,7 @@ void eeh_remove_device(struct pci_dev *dev)
 	 * from the parent PE during the BAR resotre.
 	 */
 	edev->pdev = NULL;
+	edev->in_error = 0;
 	dev->dev.archdata.edev = NULL;
 	if (!(edev->pe->state & EEH_PE_KEEP))
 		eeh_rmv_from_parent_pe(edev);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 89eb4bc..6e42cad 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -211,6 +211,7 @@ static void *eeh_report_error(void *data, void *userdata)
 	if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
 	if (*res == PCI_ERS_RESULT_NONE) *res = rc;
 
+	edev->in_error = 1;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -282,7 +283,8 @@ static void *eeh_report_reset(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->slot_reset ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		eeh_pcid_put(dev);
 		return NULL;
 	}
@@ -339,14 +341,16 @@ static void *eeh_report_resume(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->resume ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		edev->mode &= ~EEH_DEV_NO_HANDLER;
-		eeh_pcid_put(dev);
-		return NULL;
+		goto out;
 	}
 
 	driver->err_handler->resume(dev);
 
+out:
+	edev->in_error = 0;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -386,12 +390,42 @@ static void *eeh_report_failure(void *data, void *userdata)
 	return NULL;
 }
 
+#ifdef CONFIG_PCI_IOV
+static void *eeh_add_virt_device(void *data, void *userdata)
+{
+	struct pci_driver *driver;
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	if (!(edev->mode & EEH_DEV_VF)) {
+		pr_warn("EEH: eeh_dev(%04x:%02x:%02x:%01x) is not a VF\n",
+			edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+		return NULL;
+	}
+
+	driver = eeh_pcid_get(dev);
+	if (driver) {
+		eeh_pcid_put(dev);
+		if (driver->err_handler)
+			return NULL;
+	}
+
+	pci_iov_virtfn_add(edev->physfn, pdn->vf_index, 0);
+	return NULL;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static void *eeh_rmv_device(void *data, void *userdata)
 {
 	struct pci_driver *driver;
 	struct eeh_dev *edev = (struct eeh_dev *)data;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	int *removed = (int *)userdata;
+#ifdef CONFIG_PCI_IOV
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+#endif
 
 	/*
 	 * Actually, we should remove the PCI bridges as well.
@@ -416,7 +450,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
 	driver = eeh_pcid_get(dev);
 	if (driver) {
 		eeh_pcid_put(dev);
-		if (driver->err_handler)
+		if (removed && driver->err_handler)
 			return NULL;
 	}
 
@@ -425,11 +459,21 @@ static void *eeh_rmv_device(void *data, void *userdata)
 		 pci_name(dev));
 	edev->bus = dev->bus;
 	edev->mode |= EEH_DEV_DISCONNECTED;
-	(*removed)++;
-
-	pci_lock_rescan_remove();
-	pci_stop_and_remove_bus_device(dev);
-	pci_unlock_rescan_remove();
+	if (removed)
+		(*removed)++;
+
+#ifdef CONFIG_PCI_IOV
+	if (edev->mode & EEH_DEV_VF) {
+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
+		edev->pdev = NULL;
+		pdn->pe_number = IODA_INVALID_PE;
+	} else
+#endif
+	{
+		pci_lock_rescan_remove();
+		pci_stop_and_remove_bus_device(dev);
+		pci_unlock_rescan_remove();
+	}
 
 	return NULL;
 }
@@ -548,6 +592,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
 	struct timeval tstamp;
 	int cnt, rc, removed = 0;
+	struct eeh_dev *edev;
 
 	/* pcibios will clear the counter; save the value */
 	cnt = pe->freeze_count;
@@ -561,12 +606,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 */
 	eeh_pe_state_mark(pe, EEH_PE_KEEP);
 	if (bus) {
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(bus);
-		pci_unlock_rescan_remove();
-	} else if (frozen_bus) {
+		if (pe->type & EEH_PE_VF)
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+		else {
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(bus);
+			pci_unlock_rescan_remove();
+		}
+	} else if (frozen_bus)
 		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
-	}
 
 	/*
 	 * Reset the pci controller. (Asserts RST#; resets config space).
@@ -607,14 +655,26 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 		 * PE. We should disconnect it so the binding can be
 		 * rebuilt when adding PCI devices.
 		 */
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(bus);
+#ifdef CONFIG_PCI_IOV
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+#endif
+			pcibios_add_pci_devices(bus);
 	} else if (frozen_bus && removed) {
 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
 		ssleep(5);
 
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(frozen_bus);
+#ifdef CONFIG_PCI_IOV
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+#endif
+			pcibios_add_pci_devices(frozen_bus);
 	}
 	eeh_pe_state_clear(pe, EEH_PE_KEEP);
 
@@ -792,11 +852,15 @@ perm_error:
 	 * the their PCI config any more.
 	 */
 	if (frozen_bus) {
-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
-
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(frozen_bus);
-		pci_unlock_rescan_remove();
+		if (pe->type & EEH_PE_VF) {
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+		} else {
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(frozen_bus);
+			pci_unlock_rescan_remove();
+		}
 	}
 }
 
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index edfe63a..3b8b32f 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -916,7 +916,8 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
 	if (pe->type & EEH_PE_PHB) {
 		bus = pe->phb->bus;
 	} else if (pe->type & EEH_PE_BUS ||
-		   pe->type & EEH_PE_DEVICE) {
+		   pe->type & EEH_PE_DEVICE ||
+		   pe->type & EEH_PE_VF) {
 		if (pe->bus) {
 			bus = pe->bus;
 			goto out;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove
  2015-05-04  7:07 ` [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove Wei Yang
@ 2015-05-11  2:13   ` Gavin Shan
  0 siblings, 0 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-11  2:13 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:30PM +0800, Wei Yang wrote:
>During the EEH procedure, when a device's driver is not EEH aware or no
            ^^^^^^^^^^^^^
            EEH recovery,

>driver is binded with a device, EEH core would do hotplug on this devices.
           ^^^^^^                                             ^^^^^^^^^^^^

>While it isn't feasible for a VF with usual hotplug procedure. During
>removal of a VF, virt_bus should be removed if necessary. During the
>re-creation, the pci_scan_slot() doesn't work on a VF.
>

Is "virt_bus" related to those functions you try to export? Even so,
you can simply have "virt_bus" in the commit log, please have "
virtual bus" or the formal things like that.

>This patch exports two functions to handle the hotplug case for VF
>properly. They will be invoked when the EEH core do the hotplug case for
                                                 ^^^^

>VFs.

Couple of typos as above.

Thanks,
Gavin

>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> drivers/pci/iov.c   |   10 +++++-----
> include/linux/pci.h |    2 ++
> 2 files changed, 7 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index ee0ebff..cc941dd 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -108,7 +108,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
> }
>
>-static int virtfn_add(struct pci_dev *dev, int id, int reset)
>+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset)
> {
> 	int i;
> 	int rc = -ENOMEM;
>@@ -183,7 +183,7 @@ failed:
> 	return rc;
> }
>
>-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
>+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset)
> {
> 	char buf[VIRTFN_ID_LEN];
> 	struct pci_dev *virtfn;
>@@ -320,7 +320,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	}
>
> 	for (i = 0; i < initial; i++) {
>-		rc = virtfn_add(dev, i, 0);
>+		rc = pci_iov_virtfn_add(dev, i, 0);
> 		if (rc)
> 			goto failed;
> 	}
>@@ -332,7 +332,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>
> failed:
> 	for (j = 0; j < i; j++)
>-		virtfn_remove(dev, j, 0);
>+		pci_iov_virtfn_remove(dev, j, 0);
>
> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> 	pci_cfg_access_lock(dev);
>@@ -361,7 +361,7 @@ static void sriov_disable(struct pci_dev *dev)
> 		return;
>
> 	for (i = 0; i < iov->num_VFs; i++)
>-		virtfn_remove(dev, i, 0);
>+		pci_iov_virtfn_remove(dev, i, 0);
>
> 	pcibios_sriov_disable(dev);
>
>diff --git a/include/linux/pci.h b/include/linux/pci.h
>index 353db8d..94bacfa 100644
>--- a/include/linux/pci.h
>+++ b/include/linux/pci.h
>@@ -1679,6 +1679,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
>
> int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
> void pci_disable_sriov(struct pci_dev *dev);
>+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset);
>+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset);
> int pci_num_vf(struct pci_dev *dev);
> int pci_vfs_assigned(struct pci_dev *dev);
> int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
>-- 
>1.7.9.5
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-04  7:07 ` [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn Wei Yang
@ 2015-05-11  2:21   ` Gavin Shan
  2015-05-11  5:54     ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-11  2:21 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:31PM +0800, Wei Yang wrote:
>This patch caches the index of a VF in its PF in pci_dn.
>

At least you can mention the purpose of vf_index to make the commit log
complete. The following message looks better?

The patch caches the VF index in pci_dn, which can be used to calculate
VF's bus, device and function number. Those information helps to locate
the VF's PCI device instance when doing hotplug during EEH recovery if
necessary.

>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/pci-bridge.h |    1 +
> arch/powerpc/kernel/pci_dn.c          |    5 +++--
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>index 1811c44..9582aa2 100644
>--- a/arch/powerpc/include/asm/pci-bridge.h
>+++ b/arch/powerpc/include/asm/pci-bridge.h
>@@ -199,6 +199,7 @@ struct pci_dn {
> #ifdef CONFIG_PCI_IOV
> 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
> 	u16     num_vfs;		/* number of VFs enabled*/
>+	int     vf_index;		/* Index to PF for VF dev */
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^

					/* VF index in the PF */

And I believe it can be "unsigned int", or u16. We should have
non-negative vf_index, no?

> 	int     offset;			/* PE# for the first VF PE */
> #define M64_PER_IOV 4
> 	int     m64_per_iov;
>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>index b3b4df9..bf0fb873 100644
>--- a/arch/powerpc/kernel/pci_dn.c
>+++ b/arch/powerpc/kernel/pci_dn.c
>@@ -138,7 +138,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
>
> #ifdef CONFIG_PCI_IOV
> static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>-					   struct pci_dev *pdev,
>+					   struct pci_dev *pdev, int vf_index,

					   struct pci_dev *pdev,
					   int vf_index;

> 					   int busno, int devfn)
> {
> 	struct pci_dn *pdn;
>@@ -157,6 +157,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
> 	pdn->parent = parent;
> 	pdn->busno = busno;
> 	pdn->devfn = devfn;
>+	pdn->vf_index = vf_index;
> #ifdef CONFIG_PPC_POWERNV
> 	pdn->pe_number = IODA_INVALID_PE;
> #endif
>@@ -196,7 +197,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
> 		return NULL;
>
> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>-		pdn = add_one_dev_pci_data(parent, NULL,
>+		pdn = add_one_dev_pci_data(parent, NULL, i,
> 					   pci_iov_virtfn_bus(pdev, i),
> 					   pci_iov_virtfn_devfn(pdev, i));
> 		if (!pdn) {

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-04  7:07 ` [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE Wei Yang
@ 2015-05-11  2:37   ` Gavin Shan
  2015-05-11  6:25     ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-11  2:37 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:34PM +0800, Wei Yang wrote:

Please reorder PATCH[6] with this one because the EEH device is expected
to be created before EEH PE.

>On powernv platform, VF PE is a special PE which is different from the Bus
>PE.  On the EEH side, it needs a corresponding concept to handle the VF PE
>properly. For example, we need to create VF PE when VF's pci_dev is
>initialized in kernel. And add a flag to mark it is a VF PF.
                                                       ^^^^^
>

>From above commit log, my understanding is that you're adding a flag to
identify VF PE, which is handled differently from bus PE. You missed the
details on the difference between them and the speical treament to VF PE.
Could you help add those information in the commit log to make it looks
complete?

>This patch introduces the EEH_PE_VF type for VF PE and creates it for a VF.
>At the mean time, it creates the sysfs and address cache for VF PE.
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

                   it creates the sysfs and address cache for VF PE at PCI
device final fixup time.

>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/eeh.h               |    1 +
> arch/powerpc/kernel/eeh_pe.c                 |   12 ++++++++++--
> arch/powerpc/platforms/powernv/eeh-powernv.c |   12 ++++++++++++
> 3 files changed, 23 insertions(+), 2 deletions(-)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index a52db28..56e8cd9 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -70,6 +70,7 @@ struct pci_dn;
> #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
> #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
> #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
>+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
>
> #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
> #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>index 35f0b62..edfe63a 100644
>--- a/arch/powerpc/kernel/eeh_pe.c
>+++ b/arch/powerpc/kernel/eeh_pe.c
>@@ -299,7 +299,12 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
> 	 * EEH device already having associated PE, but
> 	 * the direct parent EEH device doesn't have yet.
> 	 */
>-	pdn = pdn ? pdn->parent : NULL;
>+#ifdef CONFIG_PCI_IOV
>+	if (edev->mode & EEH_DEV_VF)
>+		pdn = pci_get_pdn(edev->physfn);
>+	else
>+#endif
>+		pdn = pdn ? pdn->parent : NULL;

[A]

> 	while (pdn) {
> 		/* We're poking out of PCI territory */
> 		parent = pdn_to_eeh_dev(pdn);
>@@ -382,7 +387,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
> 	}
>
> 	/* Create a new EEH PE */
>-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>+	if (edev->mode & EEH_DEV_VF)
>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
>+	else
>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);

You don't have CONFIG_PCI_IOV here to protect the code, but you had
that at [A]. In order to keep the code look consistent, you either
add it or remove it for all places. I prefer to remove it, which
we don't need CONFIG_PCI_IOV.
 
> 	if (!pe) {
> 		pr_err("%s: out of memory!\n", __func__);
> 		return -ENOMEM;
>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>index 622f08c..5447481 100644
>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>@@ -1540,3 +1540,15 @@ static int __init eeh_powernv_init(void)
> 	return ret;
> }
> machine_early_initcall(powernv, eeh_powernv_init);
>+
>+#ifdef CONFIG_PCI_IOV
>+static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
>+{

Please rename it to pnv_eeh_vf_final_fixup(). Names of all functions
in this file expect prefix "pnv_eeh_". With "_final_", it's clearly
to tell it's called on PCI device final fixup time.

>+	/* sysfs files should only be added after devices are added */

It's nice to explain why here: sysfs for the PCI device isn't populated
and the MMIO resource isn't finalized for the PCI device yet.

>+	if (pdev->is_virtfn) {
>+		eeh_add_device_late(pdev);
>+		eeh_sysfs_add_device(pdev);
>+	}
>+}

The nested ifdef can be avoided as:

	if (!pdev->is_virtfn)
		return;

	eeh_add_device_late(pdev);
	eeh_sysfs_add_device(pdev);

>+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
>+#endif /* CONFIG_PCI_IOV */

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF
  2015-05-04  7:07 ` [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF Wei Yang
@ 2015-05-11  2:48   ` Gavin Shan
  2015-05-12  8:06     ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-11  2:48 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:35PM +0800, Wei Yang wrote:

Please order this patch and PATCH[5] because EEH device is expected to
be created before EEH PE.

>EEH on powerpc platform needs eeh_dev structure to track the pci device
                                                              ^^^
							      PCI

>status. Since VFs are created/released dynamically, VF's eeh_dev is also
>dynamically created/released in system.
>
>This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
>and marks it with EEH_DEV_VF type.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/eeh.h |    7 +++++++
> arch/powerpc/kernel/eeh.c      |    4 ++++
> arch/powerpc/kernel/eeh_dev.c  |   20 ++++++++++++++++++++
> arch/powerpc/kernel/pci_dn.c   |    7 +++++++
> 4 files changed, 38 insertions(+)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index 56e8cd9..2067de4 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -124,6 +124,7 @@ static inline bool eeh_pe_passed(struct eeh_pe *pe)
> #define EEH_DEV_NO_HANDLER	(1 << 8)	/* No error handler	*/
> #define EEH_DEV_SYSFS		(1 << 9)	/* Sysfs created	*/
> #define EEH_DEV_REMOVED		(1 << 10)	/* Removed permanently	*/
>+#define EEH_DEV_VF		(1 << 11)	/* VF port		*/
>

Why you need this flag? I guess "edev->physfn" can be used to distinguish
it's a normal or VF eeh_dev.

> struct eeh_dev {
> 	int mode;			/* EEH mode			*/
>@@ -139,6 +140,9 @@ struct eeh_dev {
> 	struct pci_controller *phb;	/* Associated PHB		*/
> 	struct pci_dn *pdn;		/* Associated PCI device node	*/
> 	struct pci_dev *pdev;		/* Associated PCI device	*/
>+#ifdef CONFIG_PCI_IOV
>+	struct pci_dev *physfn;		/* Associated PF PORT		*/
>+#endif /* CONFIG_PCI_IOV */
> 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
> };
>
>@@ -273,6 +277,7 @@ const char *eeh_pe_loc_get(struct eeh_pe *pe);
> struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
>
> void *eeh_dev_init(struct pci_dn *pdn, void *data);
>+void eeh_dev_remove(struct pci_dn *pdn);
> void eeh_dev_phb_init_dynamic(struct pci_controller *phb);
> int eeh_init(void);
> int __init eeh_ops_register(struct eeh_ops *ops);
>@@ -328,6 +333,8 @@ static inline void *eeh_dev_init(struct pci_dn *pdn, void *data)
> 	return NULL;
> }
>
>+void eeh_dev_remove(struct pci_dn *pdn) { }
>+
> static inline void eeh_dev_phb_init_dynamic(struct pci_controller *phb) { }
>
> static inline int eeh_check_failure(const volatile void __iomem *token)
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 6c7ce1b..221e280 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1135,6 +1135,10 @@ void eeh_add_device_late(struct pci_dev *dev)
> 	}
>
> 	edev->pdev = dev;
>+#ifdef CONFIG_PCI_IOV
>+	if (dev->is_virtfn)
>+		edev->physfn = dev->physfn;
>+#endif
> 	dev->dev.archdata.edev = edev;
>
> 	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
>index aabba94..ab88e61 100644
>--- a/arch/powerpc/kernel/eeh_dev.c
>+++ b/arch/powerpc/kernel/eeh_dev.c
>@@ -72,6 +72,26 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data)
> }
>
> /**
>+ * eeh_dev_remove - Release EEH device according to OF node
>+ * @pdn: PCI device node
>+ */
>+void eeh_dev_remove(struct pci_dn *pdn)
>+{
>+	struct eeh_dev *edev;
>+
>+	if (!pdn)
>+		return;
>+
>+	edev = pdn_to_eeh_dev(pdn);
>+	if(!edev)
>+		return;
>+

You don't have to have the check "if (!pdn)" here because edev should
be NULL if pdn is NULL, which is covered by pdn_to_eeh_dev().

	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);

	if (!edev)
		return;

>+	eeh_rmv_from_parent_pe(edev);
>+	kfree(edev);
>+	pdn->edev = NULL;
>+}
>+

[A]

The function itself isn't correct enough. When VF's pci_dev is released,
eeh_remove_device(), detaching eeh_dev from parent PE and destroying EEH
address cache, should be called. The function isn't complete and you don't
have to detach eeh_dev from parent PE again.

>+/**
>  * eeh_dev_phb_init_dynamic - Create EEH devices for devices included in PHB
>  * @phb: PHB
>  *
>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>index bf0fb873..a35f865 100644
>--- a/arch/powerpc/kernel/pci_dn.c
>+++ b/arch/powerpc/kernel/pci_dn.c
>@@ -179,7 +179,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
> struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
> {
> #ifdef CONFIG_PCI_IOV
>+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> 	struct pci_dn *parent, *pdn;
>+	struct eeh_dev *edev;
> 	int i;
>
> 	/* Only support IOV for now */
>@@ -205,6 +207,9 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
> 				 __func__, i);
> 			return NULL;
> 		}
>+		eeh_dev_init(pdn, hose);
>+		edev = pdn_to_eeh_dev(pdn);
>+		edev->mode |= EEH_DEV_VF;
> 	}
> #endif /* CONFIG_PCI_IOV */
>
>@@ -257,6 +262,8 @@ void remove_dev_pci_data(struct pci_dev *pdev)
> 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
> 				continue;
>
>+			eeh_dev_remove(pdn);
>+

Remove the function in [A] and you just need:

	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);

	if (edev) {
		pdn->edev = NULL;
		kfree(edev);
	}

> 			if (!list_empty(&pdn->list))
> 				list_del(&pdn->list);
>
>-- 
>1.7.9.5
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 7/9] powerpc/powernv: Support EEH reset for VFs
  2015-05-04  7:07 ` [PATCH V3 7/9] powerpc/powernv: Support EEH reset for VFs Wei Yang
@ 2015-05-11  3:03   ` Gavin Shan
  0 siblings, 0 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-11  3:03 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:36PM +0800, Wei Yang wrote:
>Before VF PE introduced, there isn't a method to reset an individual pci
             ^                                                        ^^^
             is                                                       PCI

>function. And since FW is not aware of the VF, the VF's reset should be
                     ^^
                    skiboot firmware

>done in kernel.
>
>This patch introduce a pnv_eeh_vf_pe_reset() to do the flr or af_flr to a
            ^^^^^^^^^  ^                                ^^^    ^^^^^^
            introduces function                         FLR    AF FLR
>VF.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/eeh.h               |    1 +
> arch/powerpc/platforms/powernv/eeh-powernv.c |  111 +++++++++++++++++++++++++-
> 2 files changed, 111 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index 2067de4..78c8bec 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -135,6 +135,7 @@ struct eeh_dev {
> 	int pcix_cap;			/* Saved PCIx capability	*/
> 	int pcie_cap;			/* Saved PCIe capability	*/
> 	int aer_cap;			/* Saved AER capability		*/
>+	int af_cap;			/* Saved AF capability		*/
> 	struct eeh_pe *pe;		/* Associated PE		*/
> 	struct list_head list;		/* Form link list in the PE	*/
> 	struct pci_controller *phb;	/* Associated PHB		*/
>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>index 5447481..1ad322f 100644
>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>@@ -402,6 +402,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
> 	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
> 	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
> 	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>+	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
> 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
> 		edev->mode |= EEH_DEV_BRIDGE;
> 		if (edev->pcie_cap) {
>@@ -891,6 +892,105 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
> 	return 0;
> }
>
>+static int pnv_pci_wait_for_pending(struct pci_dn *pdn, int pos, u16 mask)
>+{
>+	int i;
>+
>+	/* Wait for Transaction Pending bit clean */
>+	for (i = 0; i < 4; i++) {
>+		u32 status;
>+		if (i)
>+			msleep((1 << (i - 1)) * 100);
>+
>+		eeh_ops->read_config(pdn, pos, 2, &status);
>+		if (!(status & mask))
>+			return 1;
>+	}
>+
>+	return 0;
>+}
>+
>+static int pnv_eeh_do_flr(struct pci_dn *pdn)
>+{
>+	u32 cap;
>+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>+
>+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &cap);
>+	if (!(cap & PCI_EXP_DEVCAP_FLR))
>+		return -ENOTTY;
>+
>+	if (!pnv_pci_wait_for_pending(pdn, edev->pcie_cap + PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_TRPND))
>+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending transaction; performing function level reset anyway\n",
>+			edev->phb->global_number, pdn->busno,
>+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>+

It's breaking the limitation that each line has 80 columns to maximal degree.

>+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, &cap);
>+	cap |= PCI_EXP_DEVCTL_BCR_FLR;
>+	eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, cap);
>+	msleep(100);
>+	return 0;
>+}
>+
>+static int pnv_eeh_do_af_flr(struct pci_dn *pdn)
>+{
>+	u32 cap;
>+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>+
>+	if (!edev->af_cap)
>+		return -ENOTTY;
>+
>+	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
>+	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
>+		return -ENOTTY;

Why not cache af_cap is it really support reset during probe time? With
that, you don't have the check.

>+
>+	/*
>+	 * Wait for Transaction Pending bit to clear.  A word-aligned test
>+	 * is used, so we use the conrol offset rather than status and shift
>+	 * the test bit to match.
>+	 */
>+	if (!pnv_pci_wait_for_pending(pdn, edev->af_cap + PCI_AF_CTRL,
>+				 PCI_AF_STATUS_TP << 8))
>+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending transaction; performing AF function level reset anyway\n",
>+			edev->phb->global_number, pdn->busno,
>+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>+

It's breaking the limitation that each line has 80 columns.

>+	eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, PCI_AF_CTRL_FLR);
>+	msleep(100);
>+	return 0;
>+}
>+

Can you merge the above two functions to one (pnv_eeh_do_vf_reset())? 

>+static int pnv_eeh_reset_vf(struct pci_dn *pdn)
>+{
>+	int rc;
>+
>+	might_sleep();
>+
>+	rc = pnv_eeh_do_flr(pdn);
>+	if (rc != -ENOTTY)
>+		goto done;
>+
>+	rc = pnv_eeh_do_af_flr(pdn);
>+	if (rc != -ENOTTY)
>+		goto done;
>+
>+done:
>+	return rc;
>+}
>+
>+static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
>+{
>+	struct eeh_dev *edev, *tmp;
>+	struct pci_dn *pdn;
>+	int ret = 0;
>+

Why you ignore "option" here? In that case, the PE will be resetted
for twice at asserting/deasserting time.

>+	eeh_pe_for_each_dev(pe, edev, tmp) {
>+		pdn = eeh_dev_to_pdn(edev);
>+		ret |= pnv_eeh_reset_vf(pdn);

When hitting error from reset on one VF, you don't need proceed, no?

>+	}
>+
>+	return ret;
>+}
>+
> void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
> {
> 	struct pci_controller *hose;
>@@ -966,7 +1066,9 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
> 		}
>
> 		bus = eeh_pe_bus_get(pe);
>-		if (pci_is_root_bus(bus) ||
>+		if (pe->type & EEH_PE_VF)
>+			ret = pnv_eeh_vf_pe_reset(pe, option);
>+		else if (pci_is_root_bus(bus) ||
> 			pci_is_root_bus(bus->parent))
> 			ret = pnv_eeh_root_reset(hose, option);
> 		else
>@@ -1106,6 +1208,13 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
> 	if (!edev || !edev->pe)
> 		return false;
>
>+	/*
>+	 * For VF's reset operation, we need to rely on the kernel to
>+	 * do those pci read/write, since FW isn't aware of VFs.
              ^^^^^^^^^^^^^^^^^^^^

              PCI config operations since firmware isn't aware of VFs.

>+	 */
>+	if ((edev->mode & EEH_DEV_VF) && (edev->pe->state & EEH_PE_RESET))
>+		return true;
>+

Hrm, did you test the code? The PCI config is blocked when issuing reset
to VF PE. "false" should be returned here, isn't it?

> 	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
> 		return true;
>

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 8/9] powerpc/powernv: Support PCI config restore for VFs
  2015-05-04  7:07 ` [PATCH V3 8/9] powerpc/powernv: Support PCI config restore " Wei Yang
@ 2015-05-11  4:22   ` Gavin Shan
  2015-05-12  1:31     ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-11  4:22 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:37PM +0800, Wei Yang wrote:
>Since FW is not aware of VFs, the restore action for VF should be done in
       ^^
       skiboot firmware
>kernel.
>
>This patch introduces pnv_eeh_vf_restore_config() for VF.
>

Would it be better?

The patch introduces function pnv_eeh_vf_restore_config() to restore PCI
config space for VFs after reset.

Also, the function name would be better with pnv_eeh_restore_vf_config()?

>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/pci-bridge.h        |    1 +
> arch/powerpc/platforms/powernv/eeh-powernv.c |   77 +++++++++++++++++++++++++-
> 2 files changed, 77 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>index 9582aa2..de55ef6 100644
>--- a/arch/powerpc/include/asm/pci-bridge.h
>+++ b/arch/powerpc/include/asm/pci-bridge.h
>@@ -205,6 +205,7 @@ struct pci_dn {
> 	int     m64_per_iov;
> #define IODA_INVALID_M64        (-1)
> 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>+	int	mps;
> #endif /* CONFIG_PCI_IOV */
> #endif
> 	struct list_head child_list;
>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>index 1ad322f..6ba6d87 100644
>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>@@ -1589,6 +1589,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
> 	return ret;
> }
>
>+#ifdef CONFIG_PCI_IOV
>+static int pnv_eeh_vf_restore_config(struct pci_dn *pdn)
>+{
>+	int pcie_cap, aer_cap, old_mps;
>+	u32 devctl, cmd, cap2, aer_capctl;
>+
>+	/* Restore MPS */
>+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
>+	if (pcie_cap) {
>+		old_mps = (ffs(pdn->mps) - 8) << 5;
>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>+		devctl |= old_mps;
>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>+	}
>+

hrm, You can't use pnv_pci_cfg_{read,write} here. Instead, you should use
eeh_ops->{read,write}_config. By design, the PCI config accessors have been
classified to 2 classes: one is used for pci_config_{read,write}_* and another
one is eeh_ops->{read,write}. From EEH perspective, the former isn't controlled
strictly, but the later one is under control completely. "Not controlled" here
means the kernel can't determine when the PCI config is accessed, e.g. PCI
config accesses from user land.
 
>+	/* Disable Completion Timeout */
>+	if (pcie_cap) {
>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>+		if (cap2 & 0x10) {
>+			pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>+			cap2 |= 0x10;
>+			pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>+		}
>+	}
>+
>+	/* Enable SERR and parity checking */
>+	pnv_pci_cfg_read(pdn, PCI_COMMAND, 2, &cmd);
>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>+	pnv_pci_cfg_write(pdn, PCI_COMMAND, 2, cmd);
>+
>+	/* Enable report various errors */
>+	if (pcie_cap) {
>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>+			   PCI_EXP_DEVCTL_FERE |
>+			   PCI_EXP_DEVCTL_URRE);
>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>+	}
>+
>+	/* Enable ECRC generation and check */
>+	if (pcie_cap) {
>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>+		pnv_pci_cfg_read(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>+		pnv_pci_cfg_write(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>+	}
>+
>+	return 0;
>+}
>+#endif /* CONFIG_PCI_IOV */
>+

The code is copied over from skiboot firmware. I still dislike the fact that
we have to maintain two sets of similar functions in skiboot/kernel. I still
believe the way I suggested can help: the firmware exports the error routing
rules and kernel has support it based on the rules. With it, the skiboot is
the source of the information to avoid mismatching between kernel/firmware.

> static int pnv_eeh_restore_config(struct pci_dn *pdn)
> {
> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>@@ -1599,7 +1652,13 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
> 		return -EEXIST;
>
> 	phb = edev->phb->private_data;
>-	ret = opal_pci_reinit(phb->opal_id,
>+#ifdef CONFIG_PCI_IOV
>+	/* FW is not VF aware, we rely on OS to restore it */
>+	if (edev->mode & EEH_DEV_VF)
>+		ret = pnv_eeh_vf_restore_config(pdn);
>+	else
>+#endif

You don't even have to have CONFIG_PCI_IOV since it won't save much
.text space.

>+		ret = opal_pci_reinit(phb->opal_id,
> 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
> 	if (ret) {
> 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
>@@ -1660,4 +1719,20 @@ static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
> 	}
> }
> DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
>+
>+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
>+{

As I said before, this function shouldn't be part of this file because:

- When CONFIG_EEH=n, this file won't be complied/included.
- This function isn't part of EEH naturally.

Also, pnv_pci_vf_header_fixup() would be better name in case you need
apply more fixups for VFs in the function.

>+	struct pci_dn *pdn = pci_get_pdn(pdev);
>+	int parent_mps;
>+
>+	if (!pdev->is_virtfn)
>+		return;
>+
>+	/* Synchronize MPS for VF and PF */
>+	parent_mps = pcie_get_mps(pdev->physfn);
>+	if ((128 << pdev->pcie_mpss) >= parent_mps)
>+		pcie_set_mps(pdev, parent_mps);

Hrm, Again, do we have possibility: (128 << pdev->pcie_mpss) < parent_mps ?
And why we bother if MPS of PF/VF are equal?

>+	pdn->mps = pcie_get_mps(pdev);
>+}
>+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);
> #endif /* CONFIG_PCI_IOV */
>-- 
>1.7.9.5
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-11  2:21   ` Gavin Shan
@ 2015-05-11  5:54     ` Wei Yang
  2015-05-12  6:15       ` Gavin Shan
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-11  5:54 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Mon, May 11, 2015 at 12:21:04PM +1000, Gavin Shan wrote:
>On Mon, May 04, 2015 at 03:07:31PM +0800, Wei Yang wrote:
>>This patch caches the index of a VF in its PF in pci_dn.
>>
>
>At least you can mention the purpose of vf_index to make the commit log
>complete. The following message looks better?
>
>The patch caches the VF index in pci_dn, which can be used to calculate
>VF's bus, device and function number. Those information helps to locate
>the VF's PCI device instance when doing hotplug during EEH recovery if
>necessary.
>

Thanks, looks better. I added it in the log.

>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>---
>> arch/powerpc/include/asm/pci-bridge.h |    1 +
>> arch/powerpc/kernel/pci_dn.c          |    5 +++--
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>
>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>index 1811c44..9582aa2 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -199,6 +199,7 @@ struct pci_dn {
>> #ifdef CONFIG_PCI_IOV
>> 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
>> 	u16     num_vfs;		/* number of VFs enabled*/
>>+	int     vf_index;		/* Index to PF for VF dev */
>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>					/* VF index in the PF */

Ok, changed in the code.

>
>And I believe it can be "unsigned int", or u16. We should have
>non-negative vf_index, no?

Take a look in the virtfn_add(), the index in drivers/pci/iov.c is int. So I
copy that.

>
>> 	int     offset;			/* PE# for the first VF PE */
>> #define M64_PER_IOV 4
>> 	int     m64_per_iov;
>>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>>index b3b4df9..bf0fb873 100644
>>--- a/arch/powerpc/kernel/pci_dn.c
>>+++ b/arch/powerpc/kernel/pci_dn.c
>>@@ -138,7 +138,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
>>
>> #ifdef CONFIG_PCI_IOV
>> static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>>-					   struct pci_dev *pdev,
>>+					   struct pci_dev *pdev, int vf_index,
>
>					   struct pci_dev *pdev,
>					   int vf_index;

Some reason for this comment?

That does not exceed 80 characters.

>
>> 					   int busno, int devfn)
>> {
>> 	struct pci_dn *pdn;
>>@@ -157,6 +157,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>> 	pdn->parent = parent;
>> 	pdn->busno = busno;
>> 	pdn->devfn = devfn;
>>+	pdn->vf_index = vf_index;
>> #ifdef CONFIG_PPC_POWERNV
>> 	pdn->pe_number = IODA_INVALID_PE;
>> #endif
>>@@ -196,7 +197,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
>> 		return NULL;
>>
>> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>>-		pdn = add_one_dev_pci_data(parent, NULL,
>>+		pdn = add_one_dev_pci_data(parent, NULL, i,
>> 					   pci_iov_virtfn_bus(pdev, i),
>> 					   pci_iov_virtfn_devfn(pdev, i));
>> 		if (!pdn) {
>
>Thanks,
>Gavin

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-11  2:37   ` Gavin Shan
@ 2015-05-11  6:25     ` Wei Yang
  2015-05-12  6:28       ` Gavin Shan
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-11  6:25 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Mon, May 11, 2015 at 12:37:07PM +1000, Gavin Shan wrote:
>On Mon, May 04, 2015 at 03:07:34PM +0800, Wei Yang wrote:
>
>Please reorder PATCH[6] with this one because the EEH device is expected
>to be created before EEH PE.

That's a good idea.

>
>>On powernv platform, VF PE is a special PE which is different from the Bus
>>PE.  On the EEH side, it needs a corresponding concept to handle the VF PE
>>properly. For example, we need to create VF PE when VF's pci_dev is
>>initialized in kernel. And add a flag to mark it is a VF PF.
>                                                       ^^^^^
>>
>
>>From above commit log, my understanding is that you're adding a flag to
>identify VF PE, which is handled differently from bus PE. You missed the
>details on the difference between them and the speical treament to VF PE.
>Could you help add those information in the commit log to make it looks
>complete?
>

This patch just introduce the VF PE. For those differences, we have another
patch "handle VF PE properly" to cover. In the log of that patch, I listed
those differences. Do you think this is fine?

>>This patch introduces the EEH_PE_VF type for VF PE and creates it for a VF.
>>At the mean time, it creates the sysfs and address cache for VF PE.
>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>                   it creates the sysfs and address cache for VF PE at PCI
>device final fixup time.
>
>>
>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>---
>> arch/powerpc/include/asm/eeh.h               |    1 +
>> arch/powerpc/kernel/eeh_pe.c                 |   12 ++++++++++--
>> arch/powerpc/platforms/powernv/eeh-powernv.c |   12 ++++++++++++
>> 3 files changed, 23 insertions(+), 2 deletions(-)
>>
>>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>index a52db28..56e8cd9 100644
>>--- a/arch/powerpc/include/asm/eeh.h
>>+++ b/arch/powerpc/include/asm/eeh.h
>>@@ -70,6 +70,7 @@ struct pci_dn;
>> #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
>> #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
>> #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
>>+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
>>
>> #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
>> #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
>>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>>index 35f0b62..edfe63a 100644
>>--- a/arch/powerpc/kernel/eeh_pe.c
>>+++ b/arch/powerpc/kernel/eeh_pe.c
>>@@ -299,7 +299,12 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
>> 	 * EEH device already having associated PE, but
>> 	 * the direct parent EEH device doesn't have yet.
>> 	 */
>>-	pdn = pdn ? pdn->parent : NULL;
>>+#ifdef CONFIG_PCI_IOV
>>+	if (edev->mode & EEH_DEV_VF)
>>+		pdn = pci_get_pdn(edev->physfn);
>>+	else
>>+#endif
>>+		pdn = pdn ? pdn->parent : NULL;
>
>[A]
>
>> 	while (pdn) {
>> 		/* We're poking out of PCI territory */
>> 		parent = pdn_to_eeh_dev(pdn);
>>@@ -382,7 +387,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>> 	}
>>
>> 	/* Create a new EEH PE */
>>-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>>+	if (edev->mode & EEH_DEV_VF)
>>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
>>+	else
>>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>
>You don't have CONFIG_PCI_IOV here to protect the code, but you had
>that at [A]. In order to keep the code look consistent, you either
>add it or remove it for all places. I prefer to remove it, which
>we don't need CONFIG_PCI_IOV.
>

Ok, that's fine to remove it.

BTW, if remove the CONFIG_PCI_IOV, we need to remove it around the physfn in
eeh_dev definition. That's fine?

>> 	if (!pe) {
>> 		pr_err("%s: out of memory!\n", __func__);
>> 		return -ENOMEM;
>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>index 622f08c..5447481 100644
>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>@@ -1540,3 +1540,15 @@ static int __init eeh_powernv_init(void)
>> 	return ret;
>> }
>> machine_early_initcall(powernv, eeh_powernv_init);
>>+
>>+#ifdef CONFIG_PCI_IOV
>>+static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
>>+{
>
>Please rename it to pnv_eeh_vf_final_fixup(). Names of all functions
>in this file expect prefix "pnv_eeh_". With "_final_", it's clearly
>to tell it's called on PCI device final fixup time.
>

ok

>>+	/* sysfs files should only be added after devices are added */
>
>It's nice to explain why here: sysfs for the PCI device isn't populated
>and the MMIO resource isn't finalized for the PCI device yet.
>

Don't get your point.

sysfs of the PCI device is populated at this point.

>>+	if (pdev->is_virtfn) {
>>+		eeh_add_device_late(pdev);
>>+		eeh_sysfs_add_device(pdev);
>>+	}
>>+}
>
>The nested ifdef can be avoided as:
>
>	if (!pdev->is_virtfn)
>		return;
>
>	eeh_add_device_late(pdev);
>	eeh_sysfs_add_device(pdev);
>

Ok.

>>+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
>>+#endif /* CONFIG_PCI_IOV */
>
>Thanks,
>Gavin

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 8/9] powerpc/powernv: Support PCI config restore for VFs
  2015-05-11  4:22   ` Gavin Shan
@ 2015-05-12  1:31     ` Wei Yang
  2015-05-12  6:34       ` Gavin Shan
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-12  1:31 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Mon, May 11, 2015 at 02:22:38PM +1000, Gavin Shan wrote:
>On Mon, May 04, 2015 at 03:07:37PM +0800, Wei Yang wrote:
>>Since FW is not aware of VFs, the restore action for VF should be done in
>       ^^
>       skiboot firmware
>>kernel.
>>
>>This patch introduces pnv_eeh_vf_restore_config() for VF.
>>
>
>Would it be better?
>
>The patch introduces function pnv_eeh_vf_restore_config() to restore PCI
>config space for VFs after reset.
>

Ok.

>Also, the function name would be better with pnv_eeh_restore_vf_config()?

Ok.

>
>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>---
>> arch/powerpc/include/asm/pci-bridge.h        |    1 +
>> arch/powerpc/platforms/powernv/eeh-powernv.c |   77 +++++++++++++++++++++++++-
>> 2 files changed, 77 insertions(+), 1 deletion(-)
>>
>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>index 9582aa2..de55ef6 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -205,6 +205,7 @@ struct pci_dn {
>> 	int     m64_per_iov;
>> #define IODA_INVALID_M64        (-1)
>> 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>>+	int	mps;
>> #endif /* CONFIG_PCI_IOV */
>> #endif
>> 	struct list_head child_list;
>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>index 1ad322f..6ba6d87 100644
>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>@@ -1589,6 +1589,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
>> 	return ret;
>> }
>>
>>+#ifdef CONFIG_PCI_IOV
>>+static int pnv_eeh_vf_restore_config(struct pci_dn *pdn)
>>+{
>>+	int pcie_cap, aer_cap, old_mps;
>>+	u32 devctl, cmd, cap2, aer_capctl;
>>+
>>+	/* Restore MPS */
>>+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
>>+	if (pcie_cap) {
>>+		old_mps = (ffs(pdn->mps) - 8) << 5;
>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>>+		devctl |= old_mps;
>>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>+	}
>>+
>
>hrm, You can't use pnv_pci_cfg_{read,write} here. Instead, you should use
>eeh_ops->{read,write}_config. By design, the PCI config accessors have been
>classified to 2 classes: one is used for pci_config_{read,write}_* and another
>one is eeh_ops->{read,write}. From EEH perspective, the former isn't controlled
>strictly, but the later one is under control completely. "Not controlled" here
>means the kernel can't determine when the PCI config is accessed, e.g. PCI
>config accesses from user land.
>

Reasonable, will change it.

>>+	/* Disable Completion Timeout */
>>+	if (pcie_cap) {
>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>>+		if (cap2 & 0x10) {
>>+			pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>>+			cap2 |= 0x10;
>>+			pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>>+		}
>>+	}
>>+
>>+	/* Enable SERR and parity checking */
>>+	pnv_pci_cfg_read(pdn, PCI_COMMAND, 2, &cmd);
>>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>>+	pnv_pci_cfg_write(pdn, PCI_COMMAND, 2, cmd);
>>+
>>+	/* Enable report various errors */
>>+	if (pcie_cap) {
>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>>+			   PCI_EXP_DEVCTL_FERE |
>>+			   PCI_EXP_DEVCTL_URRE);
>>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>+	}
>>+
>>+	/* Enable ECRC generation and check */
>>+	if (pcie_cap) {
>>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>>+		pnv_pci_cfg_read(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>>+		pnv_pci_cfg_write(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>>+	}
>>+
>>+	return 0;
>>+}
>>+#endif /* CONFIG_PCI_IOV */
>>+
>
>The code is copied over from skiboot firmware. I still dislike the fact that
>we have to maintain two sets of similar functions in skiboot/kernel. I still
>believe the way I suggested can help: the firmware exports the error routing
>rules and kernel has support it based on the rules. With it, the skiboot is
>the source of the information to avoid mismatching between kernel/firmware.

Yes, it looks we have duplicate code in kernel and skiboot.

As you suggest, if we export some bit map from device node, we still have the
real logic in kernel, until we remove that part in skiboot.

By removing that part in skiboot, we may have some compatibility problem. For
example, an old kernel may not run on the new version of skiboot.

>
>> static int pnv_eeh_restore_config(struct pci_dn *pdn)
>> {
>> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>@@ -1599,7 +1652,13 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
>> 		return -EEXIST;
>>
>> 	phb = edev->phb->private_data;
>>-	ret = opal_pci_reinit(phb->opal_id,
>>+#ifdef CONFIG_PCI_IOV
>>+	/* FW is not VF aware, we rely on OS to restore it */
>>+	if (edev->mode & EEH_DEV_VF)
>>+		ret = pnv_eeh_vf_restore_config(pdn);
>>+	else
>>+#endif
>
>You don't even have to have CONFIG_PCI_IOV since it won't save much
>.text space.
>

ok

>>+		ret = opal_pci_reinit(phb->opal_id,
>> 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
>> 	if (ret) {
>> 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
>>@@ -1660,4 +1719,20 @@ static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
>> 	}
>> }
>> DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
>>+
>>+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
>>+{
>
>As I said before, this function shouldn't be part of this file because:
>
>- When CONFIG_EEH=n, this file won't be complied/included.
>- This function isn't part of EEH naturally.
>

Moved to arch/powerpc/platform/powernv/pci.c

>Also, pnv_pci_vf_header_fixup() would be better name in case you need
>apply more fixups for VFs in the function.
>

Ok.

>>+	struct pci_dn *pdn = pci_get_pdn(pdev);
>>+	int parent_mps;
>>+
>>+	if (!pdev->is_virtfn)
>>+		return;
>>+
>>+	/* Synchronize MPS for VF and PF */
>>+	parent_mps = pcie_get_mps(pdev->physfn);
>>+	if ((128 << pdev->pcie_mpss) >= parent_mps)
>>+		pcie_set_mps(pdev, parent_mps);
>
>Hrm, Again, do we have possibility: (128 << pdev->pcie_mpss) < parent_mps ?
>And why we bother if MPS of PF/VF are equal?
>

pcie_mpss is the MPS supported, not the MPS itself.

This line means if the pci_dev support the parents mps, apply it.
Otherwise, just cache the mps.

>>+	pdn->mps = pcie_get_mps(pdev);
>>+}
>>+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);
>> #endif /* CONFIG_PCI_IOV */
>>-- 
>>1.7.9.5
>>
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-11  5:54     ` Wei Yang
@ 2015-05-12  6:15       ` Gavin Shan
  2015-05-12  7:29         ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-12  6:15 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Mon, May 11, 2015 at 01:54:12PM +0800, Wei Yang wrote:
>On Mon, May 11, 2015 at 12:21:04PM +1000, Gavin Shan wrote:
>>On Mon, May 04, 2015 at 03:07:31PM +0800, Wei Yang wrote:
>>>This patch caches the index of a VF in its PF in pci_dn.
>>>
>>
>>At least you can mention the purpose of vf_index to make the commit log
>>complete. The following message looks better?
>>
>>The patch caches the VF index in pci_dn, which can be used to calculate
>>VF's bus, device and function number. Those information helps to locate
>>the VF's PCI device instance when doing hotplug during EEH recovery if
>>necessary.
>>
>
>Thanks, looks better. I added it in the log.
>
>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>---
>>> arch/powerpc/include/asm/pci-bridge.h |    1 +
>>> arch/powerpc/kernel/pci_dn.c          |    5 +++--
>>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>>
>>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>>index 1811c44..9582aa2 100644
>>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>>@@ -199,6 +199,7 @@ struct pci_dn {
>>> #ifdef CONFIG_PCI_IOV
>>> 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
>>> 	u16     num_vfs;		/* number of VFs enabled*/
>>>+	int     vf_index;		/* Index to PF for VF dev */
>>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>>					/* VF index in the PF */
>
>Ok, changed in the code.
>
>>
>>And I believe it can be "unsigned int", or u16. We should have
>>non-negative vf_index, no?
>
>Take a look in the virtfn_add(), the index in drivers/pci/iov.c is int. So I
>copy that.
>
>>
>>> 	int     offset;			/* PE# for the first VF PE */
>>> #define M64_PER_IOV 4
>>> 	int     m64_per_iov;
>>>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>>>index b3b4df9..bf0fb873 100644
>>>--- a/arch/powerpc/kernel/pci_dn.c
>>>+++ b/arch/powerpc/kernel/pci_dn.c
>>>@@ -138,7 +138,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
>>>
>>> #ifdef CONFIG_PCI_IOV
>>> static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>>>-					   struct pci_dev *pdev,
>>>+					   struct pci_dev *pdev, int vf_index,
>>
>>					   struct pci_dev *pdev,
>>					   int vf_index;
>
>Some reason for this comment?
>
>That does not exceed 80 characters.
>

No, it doesn't exceed 80 characters as you said. You take one of the following
formats, not the one you're using:

add_one_dev_pci_data(foo1, foo2,          add_one_dev_pci_data(foo1,
                     foo3, foo4,                               foo2,
                     foo5, foo6);                                :
                                                               foo6);

>>> 					   int busno, int devfn)
>>> {
>>> 	struct pci_dn *pdn;
>>>@@ -157,6 +157,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>>> 	pdn->parent = parent;
>>> 	pdn->busno = busno;
>>> 	pdn->devfn = devfn;
>>>+	pdn->vf_index = vf_index;
>>> #ifdef CONFIG_PPC_POWERNV
>>> 	pdn->pe_number = IODA_INVALID_PE;
>>> #endif
>>>@@ -196,7 +197,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
>>> 		return NULL;
>>>
>>> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>>>-		pdn = add_one_dev_pci_data(parent, NULL,
>>>+		pdn = add_one_dev_pci_data(parent, NULL, i,
>>> 					   pci_iov_virtfn_bus(pdev, i),
>>> 					   pci_iov_virtfn_devfn(pdev, i));
>>> 		if (!pdn) {

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-11  6:25     ` Wei Yang
@ 2015-05-12  6:28       ` Gavin Shan
  2015-05-12  7:52         ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-12  6:28 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Mon, May 11, 2015 at 02:25:49PM +0800, Wei Yang wrote:
>On Mon, May 11, 2015 at 12:37:07PM +1000, Gavin Shan wrote:
>>On Mon, May 04, 2015 at 03:07:34PM +0800, Wei Yang wrote:
>>
>>Please reorder PATCH[6] with this one because the EEH device is expected
>>to be created before EEH PE.
>
>That's a good idea.
>
>>
>>>On powernv platform, VF PE is a special PE which is different from the Bus
>>>PE.  On the EEH side, it needs a corresponding concept to handle the VF PE
>>>properly. For example, we need to create VF PE when VF's pci_dev is
>>>initialized in kernel. And add a flag to mark it is a VF PF.
>>                                                       ^^^^^
>>>
>>
>>>From above commit log, my understanding is that you're adding a flag to
>>identify VF PE, which is handled differently from bus PE. You missed the
>>details on the difference between them and the speical treament to VF PE.
>>Could you help add those information in the commit log to make it looks
>>complete?
>>
>
>This patch just introduce the VF PE. For those differences, we have another
>patch "handle VF PE properly" to cover. In the log of that patch, I listed
>those differences. Do you think this is fine?
>

It's fine to me.

>>>This patch introduces the EEH_PE_VF type for VF PE and creates it for a VF.
>>>At the mean time, it creates the sysfs and address cache for VF PE.
>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>>                   it creates the sysfs and address cache for VF PE at PCI
>>device final fixup time.
>>
>>>
>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>---
>>> arch/powerpc/include/asm/eeh.h               |    1 +
>>> arch/powerpc/kernel/eeh_pe.c                 |   12 ++++++++++--
>>> arch/powerpc/platforms/powernv/eeh-powernv.c |   12 ++++++++++++
>>> 3 files changed, 23 insertions(+), 2 deletions(-)
>>>
>>>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>>index a52db28..56e8cd9 100644
>>>--- a/arch/powerpc/include/asm/eeh.h
>>>+++ b/arch/powerpc/include/asm/eeh.h
>>>@@ -70,6 +70,7 @@ struct pci_dn;
>>> #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
>>> #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
>>> #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
>>>+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
>>>
>>> #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
>>> #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
>>>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>>>index 35f0b62..edfe63a 100644
>>>--- a/arch/powerpc/kernel/eeh_pe.c
>>>+++ b/arch/powerpc/kernel/eeh_pe.c
>>>@@ -299,7 +299,12 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
>>> 	 * EEH device already having associated PE, but
>>> 	 * the direct parent EEH device doesn't have yet.
>>> 	 */
>>>-	pdn = pdn ? pdn->parent : NULL;
>>>+#ifdef CONFIG_PCI_IOV
>>>+	if (edev->mode & EEH_DEV_VF)
>>>+		pdn = pci_get_pdn(edev->physfn);
>>>+	else
>>>+#endif
>>>+		pdn = pdn ? pdn->parent : NULL;
>>
>>[A]
>>
>>> 	while (pdn) {
>>> 		/* We're poking out of PCI territory */
>>> 		parent = pdn_to_eeh_dev(pdn);
>>>@@ -382,7 +387,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>>> 	}
>>>
>>> 	/* Create a new EEH PE */
>>>-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>>>+	if (edev->mode & EEH_DEV_VF)
>>>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
>>>+	else
>>>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>>
>>You don't have CONFIG_PCI_IOV here to protect the code, but you had
>>that at [A]. In order to keep the code look consistent, you either
>>add it or remove it for all places. I prefer to remove it, which
>>we don't need CONFIG_PCI_IOV.
>>
>
>Ok, that's fine to remove it.
>
>BTW, if remove the CONFIG_PCI_IOV, we need to remove it around the physfn in
>eeh_dev definition. That's fine?
>

It's fine to me.

>>> 	if (!pe) {
>>> 		pr_err("%s: out of memory!\n", __func__);
>>> 		return -ENOMEM;
>>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>index 622f08c..5447481 100644
>>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>@@ -1540,3 +1540,15 @@ static int __init eeh_powernv_init(void)
>>> 	return ret;
>>> }
>>> machine_early_initcall(powernv, eeh_powernv_init);
>>>+
>>>+#ifdef CONFIG_PCI_IOV
>>>+static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
>>>+{
>>
>>Please rename it to pnv_eeh_vf_final_fixup(). Names of all functions
>>in this file expect prefix "pnv_eeh_". With "_final_", it's clearly
>>to tell it's called on PCI device final fixup time.
>>
>
>ok
>
>>>+	/* sysfs files should only be added after devices are added */
>>
>>It's nice to explain why here: sysfs for the PCI device isn't populated
>>and the MMIO resource isn't finalized for the PCI device yet.
>>
>
>Don't get your point.
>
>sysfs of the PCI device is populated at this point.
>

You have two operations here: (A) add the PCI device to EEH address cache;
(B) add EEH related sysfs entries. (A) requires that the resources (MMIO
on PHB3) of the VF is finalized. (B) requires VF's sysfs entries have been
populated. So you need two conditions here to make sure (A) and (B) work
correctly: sysfs files are created and MMIO resources are populated. I was
saying your comments is a bit confusing. Could you have something like this:

       /*
        * The following operations will fail if VF's sysfs files aren't
        * created or its resources aren't finalized.
        */ 

Also, the PCI device's sysfs files aren't created at this point (final
fixup time). Note that, PCI final fixup is invoked separately from PCI
enumeration by fs_initcall_sync().

>>>+	if (pdev->is_virtfn) {
>>>+		eeh_add_device_late(pdev);
>>>+		eeh_sysfs_add_device(pdev);
>>>+	}
>>>+}
>>
>>The nested ifdef can be avoided as:
>>
>>	if (!pdev->is_virtfn)
>>		return;
>>
>>	eeh_add_device_late(pdev);
>>	eeh_sysfs_add_device(pdev);
>>
>
>Ok.
>
>>>+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
>>>+#endif /* CONFIG_PCI_IOV */

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 8/9] powerpc/powernv: Support PCI config restore for VFs
  2015-05-12  1:31     ` Wei Yang
@ 2015-05-12  6:34       ` Gavin Shan
  2015-05-12  8:16         ` Wei Yang
  0 siblings, 1 reply; 33+ messages in thread
From: Gavin Shan @ 2015-05-12  6:34 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Tue, May 12, 2015 at 09:31:34AM +0800, Wei Yang wrote:
>On Mon, May 11, 2015 at 02:22:38PM +1000, Gavin Shan wrote:
>>On Mon, May 04, 2015 at 03:07:37PM +0800, Wei Yang wrote:
>>>Since FW is not aware of VFs, the restore action for VF should be done in
>>       ^^
>>       skiboot firmware
>>>kernel.
>>>
>>>This patch introduces pnv_eeh_vf_restore_config() for VF.
>>>
>>
>>Would it be better?
>>
>>The patch introduces function pnv_eeh_vf_restore_config() to restore PCI
>>config space for VFs after reset.
>>
>
>Ok.
>
>>Also, the function name would be better with pnv_eeh_restore_vf_config()?
>
>Ok.
>
>>
>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>---
>>> arch/powerpc/include/asm/pci-bridge.h        |    1 +
>>> arch/powerpc/platforms/powernv/eeh-powernv.c |   77 +++++++++++++++++++++++++-
>>> 2 files changed, 77 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>>index 9582aa2..de55ef6 100644
>>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>>@@ -205,6 +205,7 @@ struct pci_dn {
>>> 	int     m64_per_iov;
>>> #define IODA_INVALID_M64        (-1)
>>> 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>>>+	int	mps;
>>> #endif /* CONFIG_PCI_IOV */
>>> #endif
>>> 	struct list_head child_list;
>>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>index 1ad322f..6ba6d87 100644
>>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>@@ -1589,6 +1589,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
>>> 	return ret;
>>> }
>>>
>>>+#ifdef CONFIG_PCI_IOV
>>>+static int pnv_eeh_vf_restore_config(struct pci_dn *pdn)
>>>+{
>>>+	int pcie_cap, aer_cap, old_mps;
>>>+	u32 devctl, cmd, cap2, aer_capctl;
>>>+
>>>+	/* Restore MPS */
>>>+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
>>>+	if (pcie_cap) {
>>>+		old_mps = (ffs(pdn->mps) - 8) << 5;
>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>>+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>>>+		devctl |= old_mps;
>>>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>>+	}
>>>+
>>
>>hrm, You can't use pnv_pci_cfg_{read,write} here. Instead, you should use
>>eeh_ops->{read,write}_config. By design, the PCI config accessors have been
>>classified to 2 classes: one is used for pci_config_{read,write}_* and another
>>one is eeh_ops->{read,write}. From EEH perspective, the former isn't controlled
>>strictly, but the later one is under control completely. "Not controlled" here
>>means the kernel can't determine when the PCI config is accessed, e.g. PCI
>>config accesses from user land.
>>
>
>Reasonable, will change it.
>
>>>+	/* Disable Completion Timeout */
>>>+	if (pcie_cap) {
>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>>>+		if (cap2 & 0x10) {
>>>+			pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>>>+			cap2 |= 0x10;
>>>+			pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>>>+		}
>>>+	}
>>>+
>>>+	/* Enable SERR and parity checking */
>>>+	pnv_pci_cfg_read(pdn, PCI_COMMAND, 2, &cmd);
>>>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>>>+	pnv_pci_cfg_write(pdn, PCI_COMMAND, 2, cmd);
>>>+
>>>+	/* Enable report various errors */
>>>+	if (pcie_cap) {
>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>>>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>>>+			   PCI_EXP_DEVCTL_FERE |
>>>+			   PCI_EXP_DEVCTL_URRE);
>>>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>>+	}
>>>+
>>>+	/* Enable ECRC generation and check */
>>>+	if (pcie_cap) {
>>>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>>>+		pnv_pci_cfg_read(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>>>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>>>+		pnv_pci_cfg_write(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>>>+	}
>>>+
>>>+	return 0;
>>>+}
>>>+#endif /* CONFIG_PCI_IOV */
>>>+
>>
>>The code is copied over from skiboot firmware. I still dislike the fact that
>>we have to maintain two sets of similar functions in skiboot/kernel. I still
>>believe the way I suggested can help: the firmware exports the error routing
>>rules and kernel has support it based on the rules. With it, the skiboot is
>>the source of the information to avoid mismatching between kernel/firmware.
>
>Yes, it looks we have duplicate code in kernel and skiboot.
>
>As you suggest, if we export some bit map from device node, we still have the
>real logic in kernel, until we remove that part in skiboot.
>
>By removing that part in skiboot, we may have some compatibility problem. For
>example, an old kernel may not run on the new version of skiboot.
>

It's fine to keep two set code which bear with same rule, which is exported
from skiboot. In that case, the rule is the only thing we have to care. We
don't need care the code any more to avoid mismatch between kernel/firmware.

>>
>>> static int pnv_eeh_restore_config(struct pci_dn *pdn)
>>> {
>>> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>>@@ -1599,7 +1652,13 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
>>> 		return -EEXIST;
>>>
>>> 	phb = edev->phb->private_data;
>>>-	ret = opal_pci_reinit(phb->opal_id,
>>>+#ifdef CONFIG_PCI_IOV
>>>+	/* FW is not VF aware, we rely on OS to restore it */
>>>+	if (edev->mode & EEH_DEV_VF)
>>>+		ret = pnv_eeh_vf_restore_config(pdn);
>>>+	else
>>>+#endif
>>
>>You don't even have to have CONFIG_PCI_IOV since it won't save much
>>.text space.
>>
>
>ok
>
>>>+		ret = opal_pci_reinit(phb->opal_id,
>>> 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
>>> 	if (ret) {
>>> 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
>>>@@ -1660,4 +1719,20 @@ static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
>>> 	}
>>> }
>>> DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_eeh);
>>>+
>>>+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
>>>+{
>>
>>As I said before, this function shouldn't be part of this file because:
>>
>>- When CONFIG_EEH=n, this file won't be complied/included.
>>- This function isn't part of EEH naturally.
>>
>
>Moved to arch/powerpc/platform/powernv/pci.c
>
>>Also, pnv_pci_vf_header_fixup() would be better name in case you need
>>apply more fixups for VFs in the function.
>>
>
>Ok.
>
>>>+	struct pci_dn *pdn = pci_get_pdn(pdev);
>>>+	int parent_mps;
>>>+
>>>+	if (!pdev->is_virtfn)
>>>+		return;
>>>+
>>>+	/* Synchronize MPS for VF and PF */
>>>+	parent_mps = pcie_get_mps(pdev->physfn);
>>>+	if ((128 << pdev->pcie_mpss) >= parent_mps)
>>>+		pcie_set_mps(pdev, parent_mps);
>>
>>Hrm, Again, do we have possibility: (128 << pdev->pcie_mpss) < parent_mps ?
>>And why we bother if MPS of PF/VF are equal?
>>
>
>pcie_mpss is the MPS supported, not the MPS itself.
>
>This line means if the pci_dev support the parents mps, apply it.
>Otherwise, just cache the mps.
>

Ok. It then looks good to me.

>>>+	pdn->mps = pcie_get_mps(pdev);
>>>+}
>>>+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);
>>> #endif /* CONFIG_PCI_IOV */

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-12  6:15       ` Gavin Shan
@ 2015-05-12  7:29         ` Wei Yang
  0 siblings, 0 replies; 33+ messages in thread
From: Wei Yang @ 2015-05-12  7:29 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Tue, May 12, 2015 at 04:15:58PM +1000, Gavin Shan wrote:
>On Mon, May 11, 2015 at 01:54:12PM +0800, Wei Yang wrote:
>>On Mon, May 11, 2015 at 12:21:04PM +1000, Gavin Shan wrote:
>>>On Mon, May 04, 2015 at 03:07:31PM +0800, Wei Yang wrote:
>>>>This patch caches the index of a VF in its PF in pci_dn.
>>>>
>>>
>>>At least you can mention the purpose of vf_index to make the commit log
>>>complete. The following message looks better?
>>>
>>>The patch caches the VF index in pci_dn, which can be used to calculate
>>>VF's bus, device and function number. Those information helps to locate
>>>the VF's PCI device instance when doing hotplug during EEH recovery if
>>>necessary.
>>>
>>
>>Thanks, looks better. I added it in the log.
>>
>>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>>---
>>>> arch/powerpc/include/asm/pci-bridge.h |    1 +
>>>> arch/powerpc/kernel/pci_dn.c          |    5 +++--
>>>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>>>index 1811c44..9582aa2 100644
>>>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>>>@@ -199,6 +199,7 @@ struct pci_dn {
>>>> #ifdef CONFIG_PCI_IOV
>>>> 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
>>>> 	u16     num_vfs;		/* number of VFs enabled*/
>>>>+	int     vf_index;		/* Index to PF for VF dev */
>>>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>>					/* VF index in the PF */
>>
>>Ok, changed in the code.
>>
>>>
>>>And I believe it can be "unsigned int", or u16. We should have
>>>non-negative vf_index, no?
>>
>>Take a look in the virtfn_add(), the index in drivers/pci/iov.c is int. So I
>>copy that.
>>
>>>
>>>> 	int     offset;			/* PE# for the first VF PE */
>>>> #define M64_PER_IOV 4
>>>> 	int     m64_per_iov;
>>>>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>>>>index b3b4df9..bf0fb873 100644
>>>>--- a/arch/powerpc/kernel/pci_dn.c
>>>>+++ b/arch/powerpc/kernel/pci_dn.c
>>>>@@ -138,7 +138,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
>>>>
>>>> #ifdef CONFIG_PCI_IOV
>>>> static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>>>>-					   struct pci_dev *pdev,
>>>>+					   struct pci_dev *pdev, int vf_index,
>>>
>>>					   struct pci_dev *pdev,
>>>					   int vf_index;
>>
>>Some reason for this comment?
>>
>>That does not exceed 80 characters.
>>
>
>No, it doesn't exceed 80 characters as you said. You take one of the following
>formats, not the one you're using:
>
>add_one_dev_pci_data(foo1, foo2,          add_one_dev_pci_data(foo1,
>                     foo3, foo4,                               foo2,
>                     foo5, foo6);                                :
>                                                               foo6);
>

Thanks

>>>> 					   int busno, int devfn)
>>>> {
>>>> 	struct pci_dn *pdn;
>>>>@@ -157,6 +157,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>>>> 	pdn->parent = parent;
>>>> 	pdn->busno = busno;
>>>> 	pdn->devfn = devfn;
>>>>+	pdn->vf_index = vf_index;
>>>> #ifdef CONFIG_PPC_POWERNV
>>>> 	pdn->pe_number = IODA_INVALID_PE;
>>>> #endif
>>>>@@ -196,7 +197,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
>>>> 		return NULL;
>>>>
>>>> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>>>>-		pdn = add_one_dev_pci_data(parent, NULL,
>>>>+		pdn = add_one_dev_pci_data(parent, NULL, i,
>>>> 					   pci_iov_virtfn_bus(pdev, i),
>>>> 					   pci_iov_virtfn_devfn(pdev, i));
>>>> 		if (!pdn) {
>
>Thanks,
>Gavin

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-12  6:28       ` Gavin Shan
@ 2015-05-12  7:52         ` Wei Yang
  0 siblings, 0 replies; 33+ messages in thread
From: Wei Yang @ 2015-05-12  7:52 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Tue, May 12, 2015 at 04:28:23PM +1000, Gavin Shan wrote:
>On Mon, May 11, 2015 at 02:25:49PM +0800, Wei Yang wrote:
>>On Mon, May 11, 2015 at 12:37:07PM +1000, Gavin Shan wrote:
>>>On Mon, May 04, 2015 at 03:07:34PM +0800, Wei Yang wrote:
>>>
>>>Please reorder PATCH[6] with this one because the EEH device is expected
>>>to be created before EEH PE.
>>
>>That's a good idea.
>>
>>>
>>>>On powernv platform, VF PE is a special PE which is different from the Bus
>>>>PE.  On the EEH side, it needs a corresponding concept to handle the VF PE
>>>>properly. For example, we need to create VF PE when VF's pci_dev is
>>>>initialized in kernel. And add a flag to mark it is a VF PF.
>>>                                                       ^^^^^
>>>>
>>>
>>>>From above commit log, my understanding is that you're adding a flag to
>>>identify VF PE, which is handled differently from bus PE. You missed the
>>>details on the difference between them and the speical treament to VF PE.
>>>Could you help add those information in the commit log to make it looks
>>>complete?
>>>
>>
>>This patch just introduce the VF PE. For those differences, we have another
>>patch "handle VF PE properly" to cover. In the log of that patch, I listed
>>those differences. Do you think this is fine?
>>
>
>It's fine to me.
>
>>>>This patch introduces the EEH_PE_VF type for VF PE and creates it for a VF.
>>>>At the mean time, it creates the sysfs and address cache for VF PE.
>>>                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>>                   it creates the sysfs and address cache for VF PE at PCI
>>>device final fixup time.
>>>
>>>>
>>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>>---
>>>> arch/powerpc/include/asm/eeh.h               |    1 +
>>>> arch/powerpc/kernel/eeh_pe.c                 |   12 ++++++++++--
>>>> arch/powerpc/platforms/powernv/eeh-powernv.c |   12 ++++++++++++
>>>> 3 files changed, 23 insertions(+), 2 deletions(-)
>>>>
>>>>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>>>index a52db28..56e8cd9 100644
>>>>--- a/arch/powerpc/include/asm/eeh.h
>>>>+++ b/arch/powerpc/include/asm/eeh.h
>>>>@@ -70,6 +70,7 @@ struct pci_dn;
>>>> #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
>>>> #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
>>>> #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
>>>>+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
>>>>
>>>> #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
>>>> #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
>>>>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>>>>index 35f0b62..edfe63a 100644
>>>>--- a/arch/powerpc/kernel/eeh_pe.c
>>>>+++ b/arch/powerpc/kernel/eeh_pe.c
>>>>@@ -299,7 +299,12 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
>>>> 	 * EEH device already having associated PE, but
>>>> 	 * the direct parent EEH device doesn't have yet.
>>>> 	 */
>>>>-	pdn = pdn ? pdn->parent : NULL;
>>>>+#ifdef CONFIG_PCI_IOV
>>>>+	if (edev->mode & EEH_DEV_VF)
>>>>+		pdn = pci_get_pdn(edev->physfn);
>>>>+	else
>>>>+#endif
>>>>+		pdn = pdn ? pdn->parent : NULL;
>>>
>>>[A]
>>>
>>>> 	while (pdn) {
>>>> 		/* We're poking out of PCI territory */
>>>> 		parent = pdn_to_eeh_dev(pdn);
>>>>@@ -382,7 +387,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
>>>> 	}
>>>>
>>>> 	/* Create a new EEH PE */
>>>>-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>>>>+	if (edev->mode & EEH_DEV_VF)
>>>>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
>>>>+	else
>>>>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>>>
>>>You don't have CONFIG_PCI_IOV here to protect the code, but you had
>>>that at [A]. In order to keep the code look consistent, you either
>>>add it or remove it for all places. I prefer to remove it, which
>>>we don't need CONFIG_PCI_IOV.
>>>
>>
>>Ok, that's fine to remove it.
>>
>>BTW, if remove the CONFIG_PCI_IOV, we need to remove it around the physfn in
>>eeh_dev definition. That's fine?
>>
>
>It's fine to me.
>
>>>> 	if (!pe) {
>>>> 		pr_err("%s: out of memory!\n", __func__);
>>>> 		return -ENOMEM;
>>>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>>index 622f08c..5447481 100644
>>>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>>>@@ -1540,3 +1540,15 @@ static int __init eeh_powernv_init(void)
>>>> 	return ret;
>>>> }
>>>> machine_early_initcall(powernv, eeh_powernv_init);
>>>>+
>>>>+#ifdef CONFIG_PCI_IOV
>>>>+static void pnv_pci_fixup_vf_eeh(struct pci_dev *pdev)
>>>>+{
>>>
>>>Please rename it to pnv_eeh_vf_final_fixup(). Names of all functions
>>>in this file expect prefix "pnv_eeh_". With "_final_", it's clearly
>>>to tell it's called on PCI device final fixup time.
>>>
>>
>>ok
>>
>>>>+	/* sysfs files should only be added after devices are added */
>>>
>>>It's nice to explain why here: sysfs for the PCI device isn't populated
>>>and the MMIO resource isn't finalized for the PCI device yet.
>>>
>>
>>Don't get your point.
>>
>>sysfs of the PCI device is populated at this point.
>>
>
>You have two operations here: (A) add the PCI device to EEH address cache;
>(B) add EEH related sysfs entries. (A) requires that the resources (MMIO
>on PHB3) of the VF is finalized. (B) requires VF's sysfs entries have been
>populated. So you need two conditions here to make sure (A) and (B) work
>correctly: sysfs files are created and MMIO resources are populated. I was
>saying your comments is a bit confusing. Could you have something like this:
>
>       /*
>        * The following operations will fail if VF's sysfs files aren't
>        * created or its resources aren't finalized.
>        */ 
>

Will change the comment with this.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF
  2015-05-11  2:48   ` Gavin Shan
@ 2015-05-12  8:06     ` Wei Yang
  2015-05-12 23:09       ` Gavin Shan
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-12  8:06 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Mon, May 11, 2015 at 12:48:56PM +1000, Gavin Shan wrote:
>On Mon, May 04, 2015 at 03:07:35PM +0800, Wei Yang wrote:
>
>Please order this patch and PATCH[5] because EEH device is expected to
>be created before EEH PE.
>
>>EEH on powerpc platform needs eeh_dev structure to track the pci device
>                                                              ^^^
>							      PCI
>
>>status. Since VFs are created/released dynamically, VF's eeh_dev is also
>>dynamically created/released in system.
>>
>>This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
>>and marks it with EEH_DEV_VF type.
>>
>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>---
>> arch/powerpc/include/asm/eeh.h |    7 +++++++
>> arch/powerpc/kernel/eeh.c      |    4 ++++
>> arch/powerpc/kernel/eeh_dev.c  |   20 ++++++++++++++++++++
>> arch/powerpc/kernel/pci_dn.c   |    7 +++++++
>> 4 files changed, 38 insertions(+)
>>
>>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>index 56e8cd9..2067de4 100644
>>--- a/arch/powerpc/include/asm/eeh.h
>>+++ b/arch/powerpc/include/asm/eeh.h
>>@@ -124,6 +124,7 @@ static inline bool eeh_pe_passed(struct eeh_pe *pe)
>> #define EEH_DEV_NO_HANDLER	(1 << 8)	/* No error handler	*/
>> #define EEH_DEV_SYSFS		(1 << 9)	/* Sysfs created	*/
>> #define EEH_DEV_REMOVED		(1 << 10)	/* Removed permanently	*/
>>+#define EEH_DEV_VF		(1 << 11)	/* VF port		*/
>>
>
>Why you need this flag? I guess "edev->physfn" can be used to distinguish
>it's a normal or VF eeh_dev.
>

Just like we have EEH_DEV_BRIDGE and EEH_DEV_DS_PORT, I use the flag
EEH_DEV_VF to mark it a VF eeh_dev.


-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 8/9] powerpc/powernv: Support PCI config restore for VFs
  2015-05-12  6:34       ` Gavin Shan
@ 2015-05-12  8:16         ` Wei Yang
  2015-05-12 23:16           ` Gavin Shan
  0 siblings, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-12  8:16 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Tue, May 12, 2015 at 04:34:03PM +1000, Gavin Shan wrote:
>>
>>>>+	/* Disable Completion Timeout */
>>>>+	if (pcie_cap) {
>>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>>>>+		if (cap2 & 0x10) {
>>>>+			pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>>>>+			cap2 |= 0x10;
>>>>+			pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>>>>+		}
>>>>+	}
>>>>+
>>>>+	/* Enable SERR and parity checking */
>>>>+	pnv_pci_cfg_read(pdn, PCI_COMMAND, 2, &cmd);
>>>>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>>>>+	pnv_pci_cfg_write(pdn, PCI_COMMAND, 2, cmd);
>>>>+
>>>>+	/* Enable report various errors */
>>>>+	if (pcie_cap) {
>>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>>>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>>>>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>>>>+			   PCI_EXP_DEVCTL_FERE |
>>>>+			   PCI_EXP_DEVCTL_URRE);
>>>>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>>>+	}
>>>>+
>>>>+	/* Enable ECRC generation and check */
>>>>+	if (pcie_cap) {
>>>>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>>>>+		pnv_pci_cfg_read(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>>>>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>>>>+		pnv_pci_cfg_write(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>>>>+	}
>>>>+
>>>>+	return 0;
>>>>+}
>>>>+#endif /* CONFIG_PCI_IOV */
>>>>+
>>>
>>>The code is copied over from skiboot firmware. I still dislike the fact that
>>>we have to maintain two sets of similar functions in skiboot/kernel. I still
>>>believe the way I suggested can help: the firmware exports the error routing
>>>rules and kernel has support it based on the rules. With it, the skiboot is
>>>the source of the information to avoid mismatching between kernel/firmware.
>>
>>Yes, it looks we have duplicate code in kernel and skiboot.
>>
>>As you suggest, if we export some bit map from device node, we still have the
>>real logic in kernel, until we remove that part in skiboot.
>>
>>By removing that part in skiboot, we may have some compatibility problem. For
>>example, an old kernel may not run on the new version of skiboot.
>>
>
>It's fine to keep two set code which bear with same rule, which is exported
>from skiboot. In that case, the rule is the only thing we have to care. We
>don't need care the code any more to avoid mismatch between kernel/firmware.
>

Ok, duplication is reasonable, then the major point for this is we need to
have a clear rule for restoring configuration for a device.

Than I suggest we could have another patch set to handle this. Define the rule
clearly and restore the configuration in kernel when skiboot firmware export
such rules.

-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF
  2015-05-12  8:06     ` Wei Yang
@ 2015-05-12 23:09       ` Gavin Shan
  0 siblings, 0 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-12 23:09 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Tue, May 12, 2015 at 04:06:43PM +0800, Wei Yang wrote:
>On Mon, May 11, 2015 at 12:48:56PM +1000, Gavin Shan wrote:
>>On Mon, May 04, 2015 at 03:07:35PM +0800, Wei Yang wrote:
>>
>>Please order this patch and PATCH[5] because EEH device is expected to
>>be created before EEH PE.
>>
>>>EEH on powerpc platform needs eeh_dev structure to track the pci device
>>                                                              ^^^
>>							      PCI
>>
>>>status. Since VFs are created/released dynamically, VF's eeh_dev is also
>>>dynamically created/released in system.
>>>
>>>This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
>>>and marks it with EEH_DEV_VF type.
>>>
>>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>>---
>>> arch/powerpc/include/asm/eeh.h |    7 +++++++
>>> arch/powerpc/kernel/eeh.c      |    4 ++++
>>> arch/powerpc/kernel/eeh_dev.c  |   20 ++++++++++++++++++++
>>> arch/powerpc/kernel/pci_dn.c   |    7 +++++++
>>> 4 files changed, 38 insertions(+)
>>>
>>>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>>index 56e8cd9..2067de4 100644
>>>--- a/arch/powerpc/include/asm/eeh.h
>>>+++ b/arch/powerpc/include/asm/eeh.h
>>>@@ -124,6 +124,7 @@ static inline bool eeh_pe_passed(struct eeh_pe *pe)
>>> #define EEH_DEV_NO_HANDLER	(1 << 8)	/* No error handler	*/
>>> #define EEH_DEV_SYSFS		(1 << 9)	/* Sysfs created	*/
>>> #define EEH_DEV_REMOVED		(1 << 10)	/* Removed permanently	*/
>>>+#define EEH_DEV_VF		(1 << 11)	/* VF port		*/
>>>
>>
>>Why you need this flag? I guess "edev->physfn" can be used to distinguish
>>it's a normal or VF eeh_dev.
>>
>
>Just like we have EEH_DEV_BRIDGE and EEH_DEV_DS_PORT, I use the flag
>EEH_DEV_VF to mark it a VF eeh_dev.
>
>

They're not equal from my point of review. When the VF edev is created
from VF's pdn, edev->physfn is determined. Furthermore, edev->physfn
is only meaningful to VF's edev. It's why I think you needn't introduce
another one flag (EEH_DEV_VF), which is just duplicated to (edev->physfn).
So I think you can use following condition to check if one edev is created
from VF's pdn or others:

	if (edev->physfn)
		/* VF's edev */
	else
		/* All other cases */

In order to use the above condition, edev->phyfn is initialized to dereference
the VF's physical function when it's created from VF's pdn in arch/powerpc/kernel/pdn.c.

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 8/9] powerpc/powernv: Support PCI config restore for VFs
  2015-05-12  8:16         ` Wei Yang
@ 2015-05-12 23:16           ` Gavin Shan
  0 siblings, 0 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-12 23:16 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Tue, May 12, 2015 at 04:16:45PM +0800, Wei Yang wrote:
>On Tue, May 12, 2015 at 04:34:03PM +1000, Gavin Shan wrote:
>>>
>>>>>+	/* Disable Completion Timeout */
>>>>>+	if (pcie_cap) {
>>>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>>>>>+		if (cap2 & 0x10) {
>>>>>+			pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>>>>>+			cap2 |= 0x10;
>>>>>+			pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>>>>>+		}
>>>>>+	}
>>>>>+
>>>>>+	/* Enable SERR and parity checking */
>>>>>+	pnv_pci_cfg_read(pdn, PCI_COMMAND, 2, &cmd);
>>>>>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>>>>>+	pnv_pci_cfg_write(pdn, PCI_COMMAND, 2, cmd);
>>>>>+
>>>>>+	/* Enable report various errors */
>>>>>+	if (pcie_cap) {
>>>>>+		pnv_pci_cfg_read(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>>>>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>>>>>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>>>>>+			   PCI_EXP_DEVCTL_FERE |
>>>>>+			   PCI_EXP_DEVCTL_URRE);
>>>>>+		pnv_pci_cfg_write(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>>>>+	}
>>>>>+
>>>>>+	/* Enable ECRC generation and check */
>>>>>+	if (pcie_cap) {
>>>>>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>>>>>+		pnv_pci_cfg_read(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>>>>>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>>>>>+		pnv_pci_cfg_write(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>>>>>+	}
>>>>>+
>>>>>+	return 0;
>>>>>+}
>>>>>+#endif /* CONFIG_PCI_IOV */
>>>>>+
>>>>
>>>>The code is copied over from skiboot firmware. I still dislike the fact that
>>>>we have to maintain two sets of similar functions in skiboot/kernel. I still
>>>>believe the way I suggested can help: the firmware exports the error routing
>>>>rules and kernel has support it based on the rules. With it, the skiboot is
>>>>the source of the information to avoid mismatching between kernel/firmware.
>>>
>>>Yes, it looks we have duplicate code in kernel and skiboot.
>>>
>>>As you suggest, if we export some bit map from device node, we still have the
>>>real logic in kernel, until we remove that part in skiboot.
>>>
>>>By removing that part in skiboot, we may have some compatibility problem. For
>>>example, an old kernel may not run on the new version of skiboot.
>>>
>>
>>It's fine to keep two set code which bear with same rule, which is exported
>>from skiboot. In that case, the rule is the only thing we have to care. We
>>don't need care the code any more to avoid mismatch between kernel/firmware.
>>
>
>Ok, duplication is reasonable, then the major point for this is we need to
>have a clear rule for restoring configuration for a device.
>

Well, I have to explain a bit more if I didn't make myself clear enough, then
you change the code in another way, which will waste your time.

- From skiboot, each PHB's device node maintains the rules, which *could* be
  described as the data structures I have given in previous replies if you
  can't figure out better data structures.
- Skiboot will reinitialize those devices for the following 3 cases: PCI
  enumeration, PCI config space restore requested from EEH, after PCI hotplug/reset.
  Obviously, the code needs changes to utilize the rules in PHB's device node.
- Kernel will do similiar thing as skiboot will do: Reinitialize VFs according
  to the rules in PHB's device node.

Yes, we have duplicated code, not rules. Hopefully, I make myself clear enough.

>Than I suggest we could have another patch set to handle this. Define the rule
>clearly and restore the configuration in kernel when skiboot firmware export
>such rules.
>

Sure.

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 9/9] powerpc/eeh: handle VF PE properly
  2015-05-04  7:07 ` [PATCH V3 9/9] powerpc/eeh: handle VF PE properly Wei Yang
@ 2015-05-13  1:16   ` Gavin Shan
  2015-05-14  9:35     ` Wei Yang
  2015-05-14 10:02     ` Wei Yang
  0 siblings, 2 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-13  1:16 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, gwshan

On Mon, May 04, 2015 at 03:07:38PM +0800, Wei Yang wrote:
>Compared with Bus PE, VF PE just has one single pci function. This
                                                 ^^^
>introduces the difference of error handling on a VF PE.
>

Lets have simple example to make the discussion easy: Suppose that one
particular PF has only one IOV BAR and we're going to enable 8 VFs. Also
the VF BAR size exceeds 64MB. For this case, we're going to have 2 VF
groups, which have 4 VFs. First of all, there are 8 PE numbers consumed
in total and lets say they're PE#(x) ... PE#(x+7).

For above case, the M64 runs in "single" mode. In other word, we just
need two M64 BARs here and they're owned by PE#x and PE#(x+1) separately.
Equally speaking, the relationship between PE# and VF index becomes as
below from MMIO's view: PE#x (VF#0/1/2/3), PE#x(VF#4/5/6/7). However,
this mapping isn't updated to PELTM. On the other hand, each VF has
separate PE# number as we update to PELTM. Lastly, we have to chain
PE#x/+1/+2/+3 and PE#x+4/5/6/7 because the VF group is sharing M64
resources. Hopefully, I didn't miss anything.

If above description is complete, I think you need expose VF (PE) group
to EEH. Potentially, I guess you need improve it later for VFIO so that
it can pass VF (PE) group to *same* guest. Generally speaking, the VF (PE)
group should be visible to EEH as single PE, which is similar to what I
did for "huge BAR" support where we have master/slave PE. The master PE
is exposed to EEH while the slave PEs are invisible from EEH. However,
the situation for VF (PE) group is different, but not too much: the
slave PEs (PE#x+1/2/3, PE#x+5/6/7 in this case) do contain PCI device.
So think you can something as below if you can't figure out better
solution:

- When the PE group is finalized in pcibios_fixup_sriov_enable(), you
  mark the master/slave PE accordingly, and put the slave PE to the
  master PE's slave list.
- When doing EEH probe on VF's edev, detach it to EEH with master PE#.

With above mechanism, each VF PE can potentially have multiple EEH devices,
not one single PCI function as you said in the commit log, which needs to
be changed accordingly.

>For example in the hotplug case, EEH needs to remove and re-create the VF
>properly. In the case when PF's error_detected() disable SRIOV, this patch
>introduces a flag to mark the eeh_dev of a VF to avoid the slot_reset() and
>resume(). Since the FW is not ware of the VF, this patch handles the VF
>restore/reset in kernel directly.
>
>This patch is to handle the VF PE properly in these cases.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/eeh.h   |    1 +
> arch/powerpc/kernel/eeh.c        |    1 +
> arch/powerpc/kernel/eeh_driver.c |  108 ++++++++++++++++++++++++++++++--------
> arch/powerpc/kernel/eeh_pe.c     |    3 +-
> 4 files changed, 90 insertions(+), 23 deletions(-)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index 78c8bec..43e8a24 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -141,6 +141,7 @@ struct eeh_dev {
> 	struct pci_controller *phb;	/* Associated PHB		*/
> 	struct pci_dn *pdn;		/* Associated PCI device node	*/
> 	struct pci_dev *pdev;		/* Associated PCI device	*/
>+	int    in_error;		/* Error flag for eeh_dev	*/

Begining from "v2" (maybe I'm wrong about the revision), I suggested to
remove "in_error" because we already had similar flag EEH_DEV_NO_HANDLER.
I don't think it's different to merge them and I already told how to do
that. If you still find it's difficult to do it, I'm fine to keep it and
I'll fix it up later by myself.

> #ifdef CONFIG_PCI_IOV
> 	struct pci_dev *physfn;		/* Associated PF PORT		*/
> #endif /* CONFIG_PCI_IOV */
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 221e280..077c3d1 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1226,6 +1226,7 @@ void eeh_remove_device(struct pci_dev *dev)
> 	 * from the parent PE during the BAR resotre.
> 	 */
> 	edev->pdev = NULL;
>+	edev->in_error = 0;
> 	dev->dev.archdata.edev = NULL;
> 	if (!(edev->pe->state & EEH_PE_KEEP))
> 		eeh_rmv_from_parent_pe(edev);
>diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>index 89eb4bc..6e42cad 100644
>--- a/arch/powerpc/kernel/eeh_driver.c
>+++ b/arch/powerpc/kernel/eeh_driver.c
>@@ -211,6 +211,7 @@ static void *eeh_report_error(void *data, void *userdata)
> 	if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
> 	if (*res == PCI_ERS_RESULT_NONE) *res = rc;
>
>+	edev->in_error = 1;
> 	eeh_pcid_put(dev);
> 	return NULL;
> }
>@@ -282,7 +283,8 @@ static void *eeh_report_reset(void *data, void *userdata)
>
> 	if (!driver->err_handler ||
> 	    !driver->err_handler->slot_reset ||
>-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
>+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
>+	    (!edev->in_error)) {
> 		eeh_pcid_put(dev);
> 		return NULL;
> 	}
>@@ -339,14 +341,16 @@ static void *eeh_report_resume(void *data, void *userdata)
>
> 	if (!driver->err_handler ||
> 	    !driver->err_handler->resume ||
>-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
>+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
>+	    (!edev->in_error)) {
> 		edev->mode &= ~EEH_DEV_NO_HANDLER;
>-		eeh_pcid_put(dev);
>-		return NULL;
>+		goto out;
> 	}
>
> 	driver->err_handler->resume(dev);
>
>+out:
>+	edev->in_error = 0;
> 	eeh_pcid_put(dev);
> 	return NULL;
> }
>@@ -386,12 +390,42 @@ static void *eeh_report_failure(void *data, void *userdata)
> 	return NULL;
> }
>
>+#ifdef CONFIG_PCI_IOV
>+static void *eeh_add_virt_device(void *data, void *userdata)
>+{
>+	struct pci_driver *driver;
>+	struct eeh_dev *edev = (struct eeh_dev *)data;
>+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
>+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>+
>+	if (!(edev->mode & EEH_DEV_VF)) {
>+		pr_warn("EEH: eeh_dev(%04x:%02x:%02x:%01x) is not a VF\n",
>+			edev->phb->global_number, pdn->busno,
>+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>+		return NULL;
>+	}

Please don't use logs starting with "EEH:" arbitrarily. It's assumed
for principle EEH logs. This one isn't a major one and print the function
name helps for debugging:

	if (!edev->physfn) {
		pr_warn("%s: EEH dev %04x:%02x:%02x:%01x not for VF\n",
			edev->phb->global_number, pdn->busno,
			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
		return NULL;
	}

>+
>+	driver = eeh_pcid_get(dev);
>+	if (driver) {
>+		eeh_pcid_put(dev);
>+		if (driver->err_handler)
>+			return NULL;
>+	}

dev and driver are NULL for those VFs that have been unplugged. For those
VFs weren't unplugged, driver and err_handler should be valid. The code
looks correct. However, for consistence, please use EEH_DEV_DISCONNECTED
that has been marked to those EEH devices which were unplugged. Do you
think it would be better?

	if (!(dev->flags & EEH_DEV_DISCONNECTED))
		return NULL;

>+
>+	pci_iov_virtfn_add(edev->physfn, pdn->vf_index, 0);
>+	return NULL;
>+}

>+#endif /* CONFIG_PCI_IOV */
>+
> static void *eeh_rmv_device(void *data, void *userdata)
> {
> 	struct pci_driver *driver;
> 	struct eeh_dev *edev = (struct eeh_dev *)data;
> 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
> 	int *removed = (int *)userdata;
>+#ifdef CONFIG_PCI_IOV
>+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>+#endif

You don't have to have CONFIG_PCI_IOV here. I guess you probably remove
all CONFIG_PCI_IOV checks because the flag or conditions introduced
to edev/PE can tell they're VF sensitive or not.

>
> 	/*
> 	 * Actually, we should remove the PCI bridges as well.
>@@ -416,7 +450,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
> 	driver = eeh_pcid_get(dev);
> 	if (driver) {
> 		eeh_pcid_put(dev);
>-		if (driver->err_handler)
>+		if (removed && driver->err_handler)
> 			return NULL;
> 	}
>
>@@ -425,11 +459,21 @@ static void *eeh_rmv_device(void *data, void *userdata)
> 		 pci_name(dev));
> 	edev->bus = dev->bus;
> 	edev->mode |= EEH_DEV_DISCONNECTED;
>-	(*removed)++;
>-
>-	pci_lock_rescan_remove();
>-	pci_stop_and_remove_bus_device(dev);
>-	pci_unlock_rescan_remove();
>+	if (removed)
>+		(*removed)++;
>+
>+#ifdef CONFIG_PCI_IOV
>+	if (edev->mode & EEH_DEV_VF) {
>+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
>+		edev->pdev = NULL;
>+		pdn->pe_number = IODA_INVALID_PE;

Setting the PE number to invalid one seems not correct because the PE
is still consumed by the VF's RID.

>+	} else
>+#endif

You can remove this CONFIG_PCI_IOV here since you already had EEH_DEV_VF
or "edev->physfn" as I said previously.

>+	{
>+		pci_lock_rescan_remove();
>+		pci_stop_and_remove_bus_device(dev);
>+		pci_unlock_rescan_remove();
>+	}
>
> 	return NULL;
> }
>@@ -548,6 +592,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
> 	struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
> 	struct timeval tstamp;
> 	int cnt, rc, removed = 0;
>+	struct eeh_dev *edev;
>
> 	/* pcibios will clear the counter; save the value */
> 	cnt = pe->freeze_count;
>@@ -561,12 +606,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
> 	 */
> 	eeh_pe_state_mark(pe, EEH_PE_KEEP);
> 	if (bus) {
>-		pci_lock_rescan_remove();
>-		pcibios_remove_pci_devices(bus);
>-		pci_unlock_rescan_remove();
>-	} else if (frozen_bus) {
>+		if (pe->type & EEH_PE_VF)
>+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
>+		else {
>+			pci_lock_rescan_remove();
>+			pcibios_remove_pci_devices(bus);
>+			pci_unlock_rescan_remove();
>+		}
>+	} else if (frozen_bus)
> 		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
>-	}
>
> 	/*
> 	 * Reset the pci controller. (Asserts RST#; resets config space).
>@@ -607,14 +655,26 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
> 		 * PE. We should disconnect it so the binding can be
> 		 * rebuilt when adding PCI devices.
> 		 */
>+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);

As above, VF PE can have multiple EEH devices.

> 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
>-		pcibios_add_pci_devices(bus);
>+#ifdef CONFIG_PCI_IOV
>+		if (pe->type & EEH_PE_VF)
>+			eeh_add_virt_device(edev, NULL);
>+		else
>+#endif
>+			pcibios_add_pci_devices(bus);
> 	} else if (frozen_bus && removed) {
> 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
> 		ssleep(5);
>
>+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
> 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
>-		pcibios_add_pci_devices(frozen_bus);
>+#ifdef CONFIG_PCI_IOV
>+		if (pe->type & EEH_PE_VF)
>+			eeh_add_virt_device(edev, NULL);
>+		else
>+#endif
>+			pcibios_add_pci_devices(frozen_bus);
> 	}
> 	eeh_pe_state_clear(pe, EEH_PE_KEEP);
>
>@@ -792,11 +852,15 @@ perm_error:
> 	 * the their PCI config any more.
> 	 */
> 	if (frozen_bus) {
>-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>-
>-		pci_lock_rescan_remove();
>-		pcibios_remove_pci_devices(frozen_bus);
>-		pci_unlock_rescan_remove();
>+		if (pe->type & EEH_PE_VF) {
>+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
>+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>+		} else {
>+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>+			pci_lock_rescan_remove();
>+			pcibios_remove_pci_devices(frozen_bus);
>+			pci_unlock_rescan_remove();
>+		}
> 	}
> }
>
>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>index edfe63a..3b8b32f 100644
>--- a/arch/powerpc/kernel/eeh_pe.c
>+++ b/arch/powerpc/kernel/eeh_pe.c
>@@ -916,7 +916,8 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
> 	if (pe->type & EEH_PE_PHB) {
> 		bus = pe->phb->bus;
> 	} else if (pe->type & EEH_PE_BUS ||
>-		   pe->type & EEH_PE_DEVICE) {
>+		   pe->type & EEH_PE_DEVICE ||
>+		   pe->type & EEH_PE_VF) {

I still have the concern: VF PE isn't supposed to have PE primary bus as the
PE contains single or multiple (virtual) functions. So you can't simply
return the "shared" bus to caller to apply hotplug on it or for other
purpose.

> 		if (pe->bus) {
> 			bus = pe->bus;
> 			goto out;

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 9/9] powerpc/eeh: handle VF PE properly
  2015-05-13  1:16   ` Gavin Shan
@ 2015-05-14  9:35     ` Wei Yang
  2015-05-14 12:15       ` Gavin Shan
  2015-05-14 10:02     ` Wei Yang
  1 sibling, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-14  9:35 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Wed, May 13, 2015 at 11:16:30AM +1000, Gavin Shan wrote:
>On Mon, May 04, 2015 at 03:07:38PM +0800, Wei Yang wrote:
>
>	if (!edev->physfn) {
>		pr_warn("%s: EEH dev %04x:%02x:%02x:%01x not for VF\n",
>			edev->phb->global_number, pdn->busno,
>			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>		return NULL;
>	}
>
>>+
>>+	driver = eeh_pcid_get(dev);
>>+	if (driver) {
>>+		eeh_pcid_put(dev);
>>+		if (driver->err_handler)
>>+			return NULL;
>>+	}
>
>dev and driver are NULL for those VFs that have been unplugged. For those
>VFs weren't unplugged, driver and err_handler should be valid. The code
>looks correct. However, for consistence, please use EEH_DEV_DISCONNECTED
>that has been marked to those EEH devices which were unplugged. Do you
>think it would be better?
>
>	if (!(dev->flags & EEH_DEV_DISCONNECTED))
>		return NULL;
>

Hi, Gavin,

I think this is a nice idea, while this may not work.

We mark the DISCONNECTED flag when remove a PCI device, while before we do the
hot plug we will detach it from the tree and remove this flag in
eeh_pe_detach_dev().

This will leads to the VF not be hot plugged.


-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 9/9] powerpc/eeh: handle VF PE properly
  2015-05-13  1:16   ` Gavin Shan
  2015-05-14  9:35     ` Wei Yang
@ 2015-05-14 10:02     ` Wei Yang
  2015-05-14 12:30       ` Gavin Shan
  1 sibling, 1 reply; 33+ messages in thread
From: Wei Yang @ 2015-05-14 10:02 UTC (permalink / raw)
  To: Gavin Shan; +Cc: bhelgaas, linux-pci, Wei Yang, linuxppc-dev

On Wed, May 13, 2015 at 11:16:30AM +1000, Gavin Shan wrote:
>> 	 * Actually, we should remove the PCI bridges as well.
>>@@ -416,7 +450,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
>> 	driver = eeh_pcid_get(dev);
>> 	if (driver) {
>> 		eeh_pcid_put(dev);
>>-		if (driver->err_handler)
>>+		if (removed && driver->err_handler)
>> 			return NULL;
>> 	}
>>
>>@@ -425,11 +459,21 @@ static void *eeh_rmv_device(void *data, void *userdata)
>> 		 pci_name(dev));
>> 	edev->bus = dev->bus;
>> 	edev->mode |= EEH_DEV_DISCONNECTED;
>>-	(*removed)++;
>>-
>>-	pci_lock_rescan_remove();
>>-	pci_stop_and_remove_bus_device(dev);
>>-	pci_unlock_rescan_remove();
>>+	if (removed)
>>+		(*removed)++;
>>+
>>+#ifdef CONFIG_PCI_IOV
>>+	if (edev->mode & EEH_DEV_VF) {
>>+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
>>+		edev->pdev = NULL;
>>+		pdn->pe_number = IODA_INVALID_PE;
>
>Setting the PE number to invalid one seems not correct because the PE
>is still consumed by the VF's RID.
>

In commit 781a868f3136, we introduce the check of pdn->pe_number in
pnv_pci_dma_dev_setup(). Since VFs are create/released dynamically, we need to
delay the bind between PE and the device. Then to avoid rebind it, we check
the pdn->pe_number. The WARN_ON() is what you suggested.

So if don't clear the pe_number here, we break that rule.


-- 
Richard Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 9/9] powerpc/eeh: handle VF PE properly
  2015-05-14  9:35     ` Wei Yang
@ 2015-05-14 12:15       ` Gavin Shan
  0 siblings, 0 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-14 12:15 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Thu, May 14, 2015 at 05:35:31PM +0800, Wei Yang wrote:
>On Wed, May 13, 2015 at 11:16:30AM +1000, Gavin Shan wrote:
>>On Mon, May 04, 2015 at 03:07:38PM +0800, Wei Yang wrote:
>>
>>	if (!edev->physfn) {
>>		pr_warn("%s: EEH dev %04x:%02x:%02x:%01x not for VF\n",
>>			edev->phb->global_number, pdn->busno,
>>			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
>>		return NULL;
>>	}
>>
>>>+
>>>+	driver = eeh_pcid_get(dev);
>>>+	if (driver) {
>>>+		eeh_pcid_put(dev);
>>>+		if (driver->err_handler)
>>>+			return NULL;
>>>+	}
>>
>>dev and driver are NULL for those VFs that have been unplugged. For those
>>VFs weren't unplugged, driver and err_handler should be valid. The code
>>looks correct. However, for consistence, please use EEH_DEV_DISCONNECTED
>>that has been marked to those EEH devices which were unplugged. Do you
>>think it would be better?
>>
>>	if (!(dev->flags & EEH_DEV_DISCONNECTED))
>>		return NULL;
>>
>
>
>I think this is a nice idea, while this may not work.
>
>We mark the DISCONNECTED flag when remove a PCI device, while before we do the
>hot plug we will detach it from the tree and remove this flag in
>eeh_pe_detach_dev().
>
>This will leads to the VF not be hot plugged.
>

Ok, the following way, with the original idea improved for a bit, would
work for you:

- Don't clear DISCONNECTED flag in eeh_pe_detach_dev().
- Use the flag for your case.
- Clear DISCONNECTED flag after the hogplug is done, right before
  eeh_pe_state_clear(pe, EEH_PE_KEEP) in eeh_reset_device().

Thanks,
Gavin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V3 9/9] powerpc/eeh: handle VF PE properly
  2015-05-14 10:02     ` Wei Yang
@ 2015-05-14 12:30       ` Gavin Shan
  0 siblings, 0 replies; 33+ messages in thread
From: Gavin Shan @ 2015-05-14 12:30 UTC (permalink / raw)
  To: Wei Yang; +Cc: bhelgaas, linux-pci, linuxppc-dev, Gavin Shan

On Thu, May 14, 2015 at 06:02:50PM +0800, Wei Yang wrote:
>On Wed, May 13, 2015 at 11:16:30AM +1000, Gavin Shan wrote:
>>> 	 * Actually, we should remove the PCI bridges as well.
>>>@@ -416,7 +450,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
>>> 	driver = eeh_pcid_get(dev);
>>> 	if (driver) {
>>> 		eeh_pcid_put(dev);
>>>-		if (driver->err_handler)
>>>+		if (removed && driver->err_handler)
>>> 			return NULL;
>>> 	}
>>>
>>>@@ -425,11 +459,21 @@ static void *eeh_rmv_device(void *data, void *userdata)
>>> 		 pci_name(dev));
>>> 	edev->bus = dev->bus;
>>> 	edev->mode |= EEH_DEV_DISCONNECTED;
>>>-	(*removed)++;
>>>-
>>>-	pci_lock_rescan_remove();
>>>-	pci_stop_and_remove_bus_device(dev);
>>>-	pci_unlock_rescan_remove();
>>>+	if (removed)
>>>+		(*removed)++;
>>>+
>>>+#ifdef CONFIG_PCI_IOV
>>>+	if (edev->mode & EEH_DEV_VF) {
>>>+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
>>>+		edev->pdev = NULL;
>>>+		pdn->pe_number = IODA_INVALID_PE;
>>
>>Setting the PE number to invalid one seems not correct because the PE
>>is still consumed by the VF's RID.
>>
>
>In commit 781a868f3136, we introduce the check of pdn->pe_number in
>pnv_pci_dma_dev_setup(). Since VFs are create/released dynamically, we need to
>delay the bind between PE and the device. Then to avoid rebind it, we check
>the pdn->pe_number. The WARN_ON() is what you suggested.
>
>So if don't clear the pe_number here, we break that rule.
>

Ok. You choose one of the following two options:

- Remove WARN_ON() in pnv_pci_dma_dev_setup(). Actually, I don't think
  it's quite correct to update PE# to VF's pdn at this function. Instead,
  it would have been done at the SRIOV enablement backend. But it's not
  related to this patchset. You may improve it later.
- Keep the code you had: set the PE# to invalid one, and put some comments
  here as below.

  /*
   * We have to set the VF PE number to invalid one, which is required
   * to plug the VF successfully.
   */

Thanks,
Gavin
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-05-14 12:31 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-04  7:07 [PATCH V3 0/9] VF EEH on Power8 Wei Yang
2015-05-04  7:07 ` [PATCH V3 1/9] pci/iov: rename and export virtfn_add/virtfn_remove Wei Yang
2015-05-11  2:13   ` Gavin Shan
2015-05-04  7:07 ` [PATCH V3 2/9] powerpc/pci_dn: cache vf_index in pci_dn Wei Yang
2015-05-11  2:21   ` Gavin Shan
2015-05-11  5:54     ` Wei Yang
2015-05-12  6:15       ` Gavin Shan
2015-05-12  7:29         ` Wei Yang
2015-05-04  7:07 ` [PATCH V3 3/9] powerpc/pci: remove PCI devices in reverse order Wei Yang
2015-05-04  7:07 ` [PATCH V3 4/9] powerpc/eeh: cache address range just for normal device Wei Yang
2015-05-04  7:07 ` [PATCH V3 5/9] powerpc/eeh: create EEH_PE_VF for VF PE Wei Yang
2015-05-11  2:37   ` Gavin Shan
2015-05-11  6:25     ` Wei Yang
2015-05-12  6:28       ` Gavin Shan
2015-05-12  7:52         ` Wei Yang
2015-05-04  7:07 ` [PATCH V3 6/9] powerpc/powernv: create/release eeh_dev for VF Wei Yang
2015-05-11  2:48   ` Gavin Shan
2015-05-12  8:06     ` Wei Yang
2015-05-12 23:09       ` Gavin Shan
2015-05-04  7:07 ` [PATCH V3 7/9] powerpc/powernv: Support EEH reset for VFs Wei Yang
2015-05-11  3:03   ` Gavin Shan
2015-05-04  7:07 ` [PATCH V3 8/9] powerpc/powernv: Support PCI config restore " Wei Yang
2015-05-11  4:22   ` Gavin Shan
2015-05-12  1:31     ` Wei Yang
2015-05-12  6:34       ` Gavin Shan
2015-05-12  8:16         ` Wei Yang
2015-05-12 23:16           ` Gavin Shan
2015-05-04  7:07 ` [PATCH V3 9/9] powerpc/eeh: handle VF PE properly Wei Yang
2015-05-13  1:16   ` Gavin Shan
2015-05-14  9:35     ` Wei Yang
2015-05-14 12:15       ` Gavin Shan
2015-05-14 10:02     ` Wei Yang
2015-05-14 12:30       ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).