* [PATCH V4 00/11] VF EEH on Power8
@ 2015-05-15  5:46 ` Wei Yang
  0 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

This patchset enables EEH on SR-IOV VFs. The general idea is to create an
eeh_dev (edev) and a PE for each VF and to handle them during EEH recovery.

Unlike a Bus PE, a VF PE contains just one VF. As a result, EEH error
handling on a VF PE differs from that on a Bus PE in three ways.

First, the removal and re-enumeration of a VF rely on its PF. A VF is
tightly coupled to its PF, so it is not proper to enumerate a VF with the
usual scan procedure. That's why virtfn_add/virtfn_remove are exported in
this patch set.

Second, the reset/restore of a VF is done in kernel space. FW is not aware
of the VF, which means the usual reset function done in FW will not work.
Patches 07 and 08 imitate the reset and restore functions in kernel space.

Third, a VF may be removed during the PF's error_detected function. In this
case, the original error_detected->slot_reset->resume sequence does not
apply to the removed VFs, since the PF re-creates them in a fresh state. A
flag in eeh_dev is introduced to mark that the eeh_dev is in error state. By
doing so, we track whether the device needs to be reset or not.
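
As an illustration of the third point only (this is not code from the
series), the flag idea can be modelled in user space roughly as below. The
flag name MODEL_DEV_IN_ERROR, the struct and the helpers are all made up for
this sketch; the real flag and its handling are defined by the patches
themselves.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_DEV_IN_ERROR	(1u << 0)	/* hypothetical flag name */

/* Stand-in for the part of eeh_dev that matters here. */
struct model_edev {
	unsigned int mode;	/* state flags */
	int resets;		/* how many times this device was reset */
};

/* Error detected on the PE: remember that this device saw the error. */
void model_mark_error(struct model_edev *edev)
{
	assert(edev);
	edev->mode |= MODEL_DEV_IN_ERROR;
}

/* A VF removed and re-created by its PF comes back in a fresh state. */
void model_recreate(struct model_edev *edev)
{
	assert(edev);
	edev->mode &= ~MODEL_DEV_IN_ERROR;
}

/* Reset step: only devices still marked as being in error are reset. */
bool model_reset_if_needed(struct model_edev *edev)
{
	assert(edev);
	if (!(edev->mode & MODEL_DEV_IN_ERROR))
		return false;
	edev->resets++;
	edev->mode &= ~MODEL_DEV_IN_ERROR;
	return true;
}
```

In this model, a VF that went through model_recreate() between the error and
the reset step is skipped, while a device that stayed in place is reset once.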

This has been tested both on the host and in a guest on Power8 with the
latest kernel version.

v4:
   * refine the change logs, comment and code style
   * change pnv_pci_fixup_vf_eeh() to pnv_eeh_vf_final_fixup() and remove the
     CONFIG_PCI_IOV macro
   * reorder patch 5/6 to make the logic more reasonable
   * remove remove_dev_pci_data()
   * remove the EEH_DEV_VF flag, use edev->physfn to identify a VF EEH DEV and
     remove related CONFIG_PCI_IOV macro
   * add the option for VF reset
   * fix the pnv_eeh_cfg_blocked() logic
   * replace pnv_pci_cfg_{read,write} with eeh_ops->{read,write}_config in
     pnv_eeh_vf_restore_config()
   * rename pnv_eeh_vf_restore_config() to pnv_eeh_restore_vf_config()
   * rename pnv_pci_fixup_vf_caps() to pnv_pci_vf_header_fixup() and move it
     to arch/powerpc/platforms/powernv/pci.c
   * add a field compound in pnv_ioda_pe to link compound PEs
   * handle compound PE for VF PEs
v3:
   * add back vf_index in pci_dn to track the VF's index
   * rename ppdev in eeh_dev to physfn for consistency
   * move edev->physfn assignment before dev->dev.archdata.edev is set
   * move pnv_pci_fixup_vf_eeh() and pnv_pci_fixup_vf_caps() to eeh-powernv.c
   * make the commit logs and code comments clearer and more detailed
   * merge eeh_rmv_virt_device() with eeh_rmv_device()
   * move the cfg_blocked check logic from pnv_eeh_read/write_config() to
     pnv_eeh_cfg_blocked()
   * move the vf reset/restore logic into its own patch, two patches are
     created.
     powerpc/powernv: Support PCI config restore for VFs
     powerpc/powernv: Support EEH reset for VFs
   * simplify the vf reset logic
v2:
   * add prefix pci_iov_ to virtfn_add/virtfn_remove
   * use EEH_DEV_VF as a flag for a VF's eeh_dev
   * use eeh_dev instead of edev in change log
   * remove vf_index in eeh_dev, calculate it from pdn->busno and devfn
   * do eeh_add_device_late() and eeh_sysfs_add_device() both after pci_dev is
     well initialized
   * do FLR to reset a VF PE
   * imitate the restore function in FW for VF
   * remove the reverse order patch, since it is still under discussion

Wei Yang (11):
  pci/iov: rename and export virtfn_add/virtfn_remove
  powerpc/pci_dn: cache vf_index in pci_dn
  powerpc/pci: remove PCI devices in reverse order
  powerpc/eeh: cache address range just for normal device
  powerpc/powernv: create/release eeh_dev for VF
  powerpc/eeh: create EEH_PE_VF for VF PE
  powerpc/powernv: Support EEH reset for VFs
  powerpc/powernv: Support PCI config restore for VFs
  powerpc/eeh: handle VF PE properly
  powerpc/powernv: use "compound" as the child's list_head for compound
    PE
  powerpc/powernv: compound PE for VFs

 arch/powerpc/include/asm/eeh.h               |    4 +
 arch/powerpc/include/asm/pci-bridge.h        |    2 +
 arch/powerpc/kernel/eeh.c                    |    5 +
 arch/powerpc/kernel/eeh_cache.c              |    2 +-
 arch/powerpc/kernel/eeh_driver.c             |  103 +++++++++++---
 arch/powerpc/kernel/eeh_pe.c                 |   13 +-
 arch/powerpc/kernel/pci-hotplug.c            |    2 +-
 arch/powerpc/kernel/pci_dn.c                 |   15 +-
 arch/powerpc/platforms/powernv/eeh-powernv.c |  196 +++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci-ioda.c    |   35 ++++-
 arch/powerpc/platforms/powernv/pci.c         |   16 +++
 arch/powerpc/platforms/powernv/pci.h         |    1 +
 drivers/pci/iov.c                            |   10 +-
 include/linux/pci.h                          |    2 +
 14 files changed, 366 insertions(+), 40 deletions(-)

-- 
1.7.9.5


* [PATCH V4 01/11] pci/iov: rename and export virtfn_add/virtfn_remove
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

During EEH recovery, when a device's driver is not EEH aware or no driver
is bound to the device, the EEH core hotplugs the device. The usual hotplug
procedure is not feasible for a VF, however. When a VF is removed, its
virtual bus should be removed as well if necessary, and when the VF is
re-created, pci_scan_slot() doesn't work on it.

This patch renames and exports two functions to handle the hotplug case for
VFs properly. They will be invoked when the EEH core hotplugs VFs.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
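Not part of the patch: a minimal user-space sketch of how a recovery path
might use the two exported functions to hot-remove a VF and re-create it
through its PF. struct model_pf and the model_* helpers are stand-ins
invented for this note; only the names pci_iov_virtfn_add() and
pci_iov_virtfn_remove() come from the patch, and the real functions take a
struct pci_dev and a reset argument.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_VFS	8

/* Stand-in for a PF; not the kernel's struct pci_dev. */
struct model_pf {
	int num_vfs;				/* VFs enabled on this PF */
	bool vf_present[MODEL_MAX_VFS];		/* VF i has a device instance */
};

/* Models pci_iov_virtfn_add(pf, id, reset): create VF 'id' of this PF. */
int model_virtfn_add(struct model_pf *pf, int id)
{
	if (id < 0 || id >= pf->num_vfs || pf->vf_present[id])
		return -1;
	pf->vf_present[id] = true;
	return 0;
}

/* Models pci_iov_virtfn_remove(pf, id, reset): destroy VF 'id'. */
void model_virtfn_remove(struct model_pf *pf, int id)
{
	assert(id >= 0 && id < pf->num_vfs);
	pf->vf_present[id] = false;
}

/*
 * Hypothetical recovery step: the VF is removed and re-created via its
 * PF and VF index, since a slot scan would not find it.
 */
int model_vf_hotplug(struct model_pf *pf, int id)
{
	model_virtfn_remove(pf, id);
	return model_virtfn_add(pf, id);
}
```

The point of the sketch is only that both directions go through the PF and
the VF index, which is why the next patch caches vf_index in pci_dn.
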
 drivers/pci/iov.c   |   10 +++++-----
 include/linux/pci.h |    2 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 47daf2f..f353e6f 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -106,7 +106,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
 }
 
-static int virtfn_add(struct pci_dev *dev, int id, int reset)
+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset)
 {
 	int i;
 	int rc = -ENOMEM;
@@ -181,7 +181,7 @@ failed:
 	return rc;
 }
 
-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset)
 {
 	struct pci_dev *virtfn;
 
@@ -302,7 +302,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	}
 
 	for (i = 0; i < initial; i++) {
-		rc = virtfn_add(dev, i, 0);
+		rc = pci_iov_virtfn_add(dev, i, 0);
 		if (rc)
 			goto failed;
 	}
@@ -314,7 +314,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 failed:
 	for (j = 0; j < i; j++)
-		virtfn_remove(dev, j, 0);
+		pci_iov_virtfn_remove(dev, j, 0);
 
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
 	pci_cfg_access_lock(dev);
@@ -343,7 +343,7 @@ static void sriov_disable(struct pci_dev *dev)
 		return;
 
 	for (i = 0; i < iov->num_VFs; i++)
-		virtfn_remove(dev, i, 0);
+		pci_iov_virtfn_remove(dev, i, 0);
 
 	pcibios_sriov_disable(dev);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 353db8d..94bacfa 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1679,6 +1679,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset);
+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset);
 int pci_num_vf(struct pci_dev *dev);
 int pci_vfs_assigned(struct pci_dev *dev);
 int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
-- 
1.7.9.5


* [PATCH V4 02/11] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

The patch caches the VF index in pci_dn, which can be used to calculate the
VF's bus, device and function number. This information helps to locate the
VF's PCI device instance when doing hotplug during EEH recovery, if
necessary.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
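Not part of the patch: a small user-space sketch of the arithmetic that
turns a VF index into a bus number and devfn. It follows the SR-IOV routing
ID rule (PF devfn + First VF Offset + VF Stride * index), which is my
understanding of what pci_iov_virtfn_bus()/pci_iov_virtfn_devfn() compute.
The struct and the model_* names are invented for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Addressing parameters of a PF; illustrative type, not a kernel struct. */
struct model_pf_addr {
	uint8_t bus;		/* PF bus number */
	uint8_t devfn;		/* PF device/function number */
	uint16_t offset;	/* First VF Offset (SR-IOV capability) */
	uint16_t stride;	/* VF Stride (SR-IOV capability) */
};

/* Routing ID of VF 'vf_index', counted from the start of the PF's bus. */
unsigned int model_vf_rid(const struct model_pf_addr *pf, int vf_index)
{
	assert(vf_index >= 0);
	return pf->devfn + pf->offset + pf->stride * (unsigned int)vf_index;
}

/* Bus number of the VF: every 256 routing IDs spill onto the next bus. */
unsigned int model_vf_bus(const struct model_pf_addr *pf, int vf_index)
{
	return pf->bus + (model_vf_rid(pf, vf_index) >> 8);
}

/* devfn of the VF: the low 8 bits of its routing ID. */
unsigned int model_vf_devfn(const struct model_pf_addr *pf, int vf_index)
{
	return model_vf_rid(pf, vf_index) & 0xff;
}
```

With the index cached in pci_dn, code that only has the pdn at hand can
recover the VF's bus/devfn this way instead of searching for it.
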
 arch/powerpc/include/asm/pci-bridge.h |    1 +
 arch/powerpc/kernel/pci_dn.c          |    4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 1811c44..d78afe4 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -199,6 +199,7 @@ struct pci_dn {
 #ifdef CONFIG_PCI_IOV
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled*/
+	int     vf_index;		/* VF index in the PF */
 	int     offset;			/* PE# for the first VF PE */
 #define M64_PER_IOV 4
 	int     m64_per_iov;
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index b3b4df9..f771130 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -139,6 +139,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
 #ifdef CONFIG_PCI_IOV
 static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 					   struct pci_dev *pdev,
+					   int vf_index,
 					   int busno, int devfn)
 {
 	struct pci_dn *pdn;
@@ -157,6 +158,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 	pdn->parent = parent;
 	pdn->busno = busno;
 	pdn->devfn = devfn;
+	pdn->vf_index = vf_index;
 #ifdef CONFIG_PPC_POWERNV
 	pdn->pe_number = IODA_INVALID_PE;
 #endif
@@ -196,7 +198,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 		return NULL;
 
 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
-		pdn = add_one_dev_pci_data(parent, NULL,
+		pdn = add_one_dev_pci_data(parent, NULL, i,
 					   pci_iov_virtfn_bus(pdev, i),
 					   pci_iov_virtfn_devfn(pdev, i));
 		if (!pdn) {
-- 
1.7.9.5


* [PATCH V4 03/11] powerpc/pci: remove PCI devices in reverse order
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

As commit ac205b7b ("PCI: make sriov work with hotplug remove") indicates,
when removing PCI devices on a bus which has VFs, we need to remove them
in the reverse order.

This patch applies this pattern to the hotplug removal code for the powerpc
arch.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
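Not part of the patch: a simplified user-space model of why the walk
direction matters. It assumes, as I understand the referenced commit, that
removing a PF also tears down its VFs, which sit later on the same bus
list. The array-based "list" and the model_* names are invented for this
note; the kernel walks a real list_head.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NDEV	4

/* One entry on the bus, in the order the devices were added. */
struct model_dev {
	bool is_vf;	/* created by the PF in this model */
	bool present;	/* still on the bus list */
};

/* Removing the PF also removes its VFs, as disabling SR-IOV would. */
void model_remove(struct model_dev *devs, int i)
{
	devs[i].present = false;
	if (!devs[i].is_vf)
		for (int j = 0; j < MODEL_NDEV; j++)
			if (devs[j].is_vf)
				devs[j].present = false;
}

/*
 * Walk the bus with step +1 (forward) or -1 (reverse), removing each
 * device, and count visits to entries that were already gone. In the
 * kernel such a visit would touch freed memory.
 */
int model_walk_stale_visits(struct model_dev *devs, int step)
{
	int stale = 0;
	int i = (step > 0) ? 0 : MODEL_NDEV - 1;

	assert(step == 1 || step == -1);
	for (; i >= 0 && i < MODEL_NDEV; i += step) {
		if (!devs[i].present) {
			stale++;
			continue;
		}
		model_remove(devs, i);
	}
	return stale;
}
```

With the PF at index 0 followed by three VFs, the forward walk runs into
entries the PF already removed, while the reverse walk removes the VFs
first and never does.
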
 arch/powerpc/kernel/pci-hotplug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 7ed85a6..98f84ed 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -50,7 +50,7 @@ void pcibios_remove_pci_devices(struct pci_bus *bus)
 
 	pr_debug("PCI: Removing devices on bus %04x:%02x\n",
 		 pci_domain_nr(bus),  bus->number);
-	list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
+	list_for_each_entry_safe_reverse(dev, tmp, &bus->devices, bus_list) {
 		pr_debug("   Removing %s...\n", pci_name(dev));
 		pci_stop_and_remove_bus_device(dev);
 	}
-- 
1.7.9.5


* [PATCH V4 04/11] powerpc/eeh: cache address range just for normal device
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

The address cache is used to find the related eeh_dev for a given MMIO
address. By the definition of pci_dev.resource[], MMIO addresses are kept in
the following order: 6 normal BARs, ROM BAR, 6 IOV BARs, 4 bridge windows.

The address cache doesn't cache bridge devices, and the IOV BAR ranges
should map to their own VFs separately. This means it only needs to cache
the first 7 BARs of a normal device.

This patch restricts the address cache to the first 7 BARs of a PCI
device.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
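Not part of the patch: a user-space sketch of the resource[] layout
described above, to show that stopping at the ROM slot covers exactly the
first 7 entries. The MODEL_* names are local to this note and only mirror
the order given in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Slot layout of resource[], in the order the commit message lists it. */
enum {
	MODEL_STD_RESOURCES = 0,	/* 6 normal BARs: slots 0..5 */
	MODEL_STD_RESOURCE_END = 5,
	MODEL_ROM_RESOURCE,		/* ROM BAR: slot 6 */
	MODEL_IOV_RESOURCES,		/* 6 IOV BARs: slots 7..12 */
	MODEL_IOV_RESOURCE_END = MODEL_IOV_RESOURCES + 5,
	MODEL_BRIDGE_RESOURCES,		/* 4 bridge windows: slots 13..16 */
	MODEL_BRIDGE_RESOURCE_END = MODEL_BRIDGE_RESOURCES + 3,
	MODEL_NUM_RESOURCES		/* total number of slots */
};

/* True if the slot is one the address cache still walks after the patch. */
bool model_slot_cached(int resno)
{
	return resno >= 0 && resno <= MODEL_ROM_RESOURCE;
}

/* Count the cached slots by walking the whole array once. */
int model_count_cached(void)
{
	int n = 0;

	for (int i = 0; i < MODEL_NUM_RESOURCES; i++)
		if (model_slot_cached(i))
			n++;
	assert(n <= MODEL_NUM_RESOURCES);
	return n;
}
```

The IOV BARs (slots 7..12) and bridge windows (slots 13..16) fall outside
the walked range.
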
 arch/powerpc/kernel/eeh_cache.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index a1e86e1..f0ce2a3 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -196,7 +196,7 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev *dev)
 	}
 
 	/* Walk resources on this device, poke them into the tree */
-	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
 		resource_size_t start = pci_resource_start(dev,i);
 		resource_size_t end = pci_resource_end(dev,i);
 		unsigned long flags = pci_resource_flags(dev,i);
-- 
1.7.9.5


* [PATCH V4 05/11] powerpc/powernv: create/release eeh_dev for VF
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

EEH on the powerpc platform needs the eeh_dev structure to track PCI device
status. Since VFs are created/released dynamically, a VF's eeh_dev is also
created/released dynamically.

This patch creates/removes the eeh_dev when the pci_dn is created/removed
for VFs, and records the PF in edev->physfn so that a VF's eeh_dev can be
identified.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
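Not part of the patch: a user-space sketch of how a non-NULL physfn can
serve as the "this eeh_dev belongs to a VF" test. The model_* types and
helpers are invented for this note. Unlike the hunk in
eeh_add_device_late(), the model clears physfn explicitly for a non-VF
rather than relying on the structure being zero-initialised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the fields of pci_dev and eeh_dev used here. */
struct model_pci_dev {
	bool is_virtfn;			/* device is a VF */
	struct model_pci_dev *physfn;	/* its PF, when it is a VF */
};

struct model_eeh_dev {
	struct model_pci_dev *pdev;	/* associated PCI device */
	struct model_pci_dev *physfn;	/* associated PF, VFs only */
};

/* Bind an eeh_dev to its PCI device, recording the PF only for a VF. */
void model_eeh_attach(struct model_eeh_dev *edev, struct model_pci_dev *dev)
{
	assert(edev && dev);
	edev->pdev = dev;
	edev->physfn = dev->is_virtfn ? dev->physfn : NULL;
}

/* A VF's eeh_dev is the one that carries a PF pointer. */
bool model_edev_is_vf(const struct model_eeh_dev *edev)
{
	return edev->physfn != NULL;
}
```

Later code can then branch on model_edev_is_vf()-style checks without a
separate VF flag in eeh_dev.
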
 arch/powerpc/include/asm/eeh.h |    1 +
 arch/powerpc/kernel/eeh.c      |    4 ++++
 arch/powerpc/kernel/pci_dn.c   |   11 +++++++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index a52db28..1b3614d 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -138,6 +138,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	struct pci_dev *physfn;		/* Associated PF PORT		*/
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
 
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 6c7ce1b..221e280 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1135,6 +1135,10 @@ void eeh_add_device_late(struct pci_dev *dev)
 	}
 
 	edev->pdev = dev;
+#ifdef CONFIG_PCI_IOV
+	if (dev->is_virtfn)
+		edev->physfn = dev->physfn;
+#endif
 	dev->dev.archdata.edev = edev;
 
 	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index f771130..94806a4 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -180,7 +180,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 {
 #ifdef CONFIG_PCI_IOV
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pci_dn *parent, *pdn;
+	struct eeh_dev *edev;
 	int i;
 
 	/* Only support IOV for now */
@@ -206,6 +208,8 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 				 __func__, i);
 			return NULL;
 		}
+		eeh_dev_init(pdn, hose);
+		edev = pdn_to_eeh_dev(pdn);
 	}
 #endif /* CONFIG_PCI_IOV */
 
@@ -254,10 +258,17 @@ void remove_dev_pci_data(struct pci_dev *pdev)
 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
 		list_for_each_entry_safe(pdn, tmp,
 			&parent->child_list, list) {
+			struct eeh_dev *edev;
 			if (pdn->busno != pci_iov_virtfn_bus(pdev, i) ||
 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
 				continue;
 
+			edev = pdn_to_eeh_dev(pdn);
+			if (edev) {
+				pdn->edev = NULL;
+				kfree(edev);
+			}
+
 			if (!list_empty(&pdn->list))
 				list_del(&pdn->list);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH V4 05/11] powerpc/powernv: create/release eeh_dev for VF
@ 2015-05-15  5:46   ` Wei Yang
  0 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

EEH on powerpc platform needs eeh_dev structure to track the PCI device
status. Since VFs are created/released dynamically, VF's eeh_dev is also
dynamically created/released in system.

This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
and marks it with EEH_DEV_VF type.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h |    1 +
 arch/powerpc/kernel/eeh.c      |    4 ++++
 arch/powerpc/kernel/pci_dn.c   |   11 +++++++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index a52db28..1b3614d 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -138,6 +138,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	struct pci_dev *physfn;		/* Associated PF PORT		*/
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
 
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 6c7ce1b..221e280 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1135,6 +1135,10 @@ void eeh_add_device_late(struct pci_dev *dev)
 	}
 
 	edev->pdev = dev;
+#ifdef CONFIG_PCI_IOV
+	if (dev->is_virtfn)
+		edev->physfn = dev->physfn;
+#endif
 	dev->dev.archdata.edev = edev;
 
 	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index f771130..94806a4 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -180,7 +180,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
 struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 {
 #ifdef CONFIG_PCI_IOV
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	struct pci_dn *parent, *pdn;
+	struct eeh_dev *edev;
 	int i;
 
 	/* Only support IOV for now */
@@ -206,6 +208,8 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
 				 __func__, i);
 			return NULL;
 		}
+		eeh_dev_init(pdn, hose);
+		edev = pdn_to_eeh_dev(pdn);
 	}
 #endif /* CONFIG_PCI_IOV */
 
@@ -254,10 +258,17 @@ void remove_dev_pci_data(struct pci_dev *pdev)
 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
 		list_for_each_entry_safe(pdn, tmp,
 			&parent->child_list, list) {
+			struct eeh_dev *edev;
 			if (pdn->busno != pci_iov_virtfn_bus(pdev, i) ||
 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
 				continue;
 
+			edev = pdn_to_eeh_dev(pdn);
+			if (edev) {
+				pdn->edev = NULL;
+				kfree(edev);
+			}
+
 			if (!list_empty(&pdn->list))
 				list_del(&pdn->list);
 
-- 
1.7.9.5

* [PATCH V4 06/11] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

On the powernv platform, a VF PE is a special PE which differs from a Bus
PE. The EEH side needs a corresponding concept to handle the VF PE
properly. For example, the VF PE must be created when the VF's pci_dev is
initialized in the kernel, and it needs a flag marking it as a VF PE.

This patch introduces the EEH_PE_VF type and creates a PE of that type for
each VF. It also creates the sysfs entries and address cache for the VF at
PCI device final fixup time.
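
The type selection can be sketched in userspace as follows. This is only an
illustration under stated assumptions: the two EEH_PE_* values are copied
from the eeh.h hunk, while the mock_* structures and pe_type_for() are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Values as defined in asm/eeh.h by this series. */
#define EEH_PE_DEVICE	(1 << 2)	/* Device PE */
#define EEH_PE_VF	(1 << 4)	/* VF PE     */

/* Hypothetical stand-ins for struct pci_dev and struct eeh_dev. */
struct mock_pci_dev { int id; };
struct mock_eeh_dev {
	struct mock_pci_dev *physfn;	/* PF of a VF, NULL otherwise */
};

/* A non-NULL physfn is what marks the eeh_dev as belonging to a VF. */
static int pe_type_for(const struct mock_eeh_dev *edev)
{
	assert(edev != NULL);
	return edev->physfn ? EEH_PE_VF : EEH_PE_DEVICE;
}
```

Keying on physfn avoids a separate "is VF" flag in the eeh_dev, at the cost
of requiring physfn to be set before the PE is allocated.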

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |    1 +
 arch/powerpc/kernel/eeh_pe.c                 |   10 ++++++++--
 arch/powerpc/platforms/powernv/eeh-powernv.c |   14 ++++++++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 1b3614d..c1fde48 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -70,6 +70,7 @@ struct pci_dn;
 #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
 #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
 #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
 
 #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
 #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 35f0b62..260a701 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -299,7 +299,10 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
 	 * EEH device already having associated PE, but
 	 * the direct parent EEH device doesn't have yet.
 	 */
-	pdn = pdn ? pdn->parent : NULL;
+	if (edev->physfn)
+		pdn = pci_get_pdn(edev->physfn);
+	else
+		pdn = pdn ? pdn->parent : NULL;
 	while (pdn) {
 		/* We're poking out of PCI territory */
 		parent = pdn_to_eeh_dev(pdn);
@@ -382,7 +385,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	}
 
 	/* Create a new EEH PE */
-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
+	if (edev->physfn)
+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
+	else
+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
 	if (!pe) {
 		pr_err("%s: out of memory!\n", __func__);
 		return -ENOMEM;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 622f08c..31344a4 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1540,3 +1540,17 @@ static int __init eeh_powernv_init(void)
 	return ret;
 }
 machine_early_initcall(powernv, eeh_powernv_init);
+
+static void pnv_eeh_vf_final_fixup(struct pci_dev *pdev)
+{
+	/*
+	 * The following operations will fail if VF's sysfs files aren't
+	 * created or its resources aren't finalized.
+	 */
+	if (!pdev->is_virtfn)
+		return;
+
+	eeh_add_device_late(pdev);
+	eeh_sysfs_add_device(pdev);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_eeh_vf_final_fixup);
-- 
1.7.9.5


* [PATCH V4 07/11] powerpc/powernv: Support EEH reset for VFs
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

Before the VF PE was introduced, there was no method to reset an individual
PCI function. Since skiboot firmware is not aware of VFs, the VF reset has
to be done in the kernel.

This patch introduces pnv_eeh_vf_pe_reset(), which resets a VF with a PCIe
FLR or, if that is not supported, an AF FLR.
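
The control flow can be sketched in userspace as below: the methods are
tried in order, -ENOTTY means "not supported, try the next one", and the
Transaction Pending poll backs off by 0, 100, 200 and 400 ms. This is an
illustration only; reset_fn, the mock_* methods and both helper names are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reset method: returns 0 on success or -ENOTTY. */
typedef int (*reset_fn)(int option);

/* Mock methods, used only to exercise the fallback logic. */
static int mock_flr_unsupported(int option)
{
	(void)option;
	return -ENOTTY;
}

static int mock_af_flr_ok(int option)
{
	(void)option;
	return 0;
}

/* Delay in ms before poll attempt i (0-based): 0, 100, 200, 400. */
static int pending_poll_delay_ms(int attempt)
{
	assert(attempt >= 0 && attempt < 4);
	return attempt ? (1 << (attempt - 1)) * 100 : 0;
}

/* Try each method in turn; stop at the first result other than -ENOTTY. */
static int reset_with_fallback(const reset_fn *methods, int n, int option)
{
	int i, rc = -ENOTTY;

	for (i = 0; i < n; i++) {
		rc = methods[i](option);
		if (rc != -ENOTTY)
			break;
	}
	return rc;
}
```

With this schedule the poll gives up after at most 700 ms in total, after
which the patch logs an error and issues the reset anyway.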

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |  123 +++++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index c1fde48..3d64cf3 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -134,6 +134,7 @@ struct eeh_dev {
 	int pcix_cap;			/* Saved PCIx capability	*/
 	int pcie_cap;			/* Saved PCIe capability	*/
 	int aer_cap;			/* Saved AER capability		*/
+	int af_cap;			/* Saved AF capability		*/
 	struct eeh_pe *pe;		/* Associated PE		*/
 	struct list_head list;		/* Form link list in the PE	*/
 	struct pci_controller *phb;	/* Associated PHB		*/
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 31344a4..61f1a55 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -402,6 +402,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
 	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
 	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
+	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
 		edev->mode |= EEH_DEV_BRIDGE;
 		if (edev->pcie_cap) {
@@ -891,6 +892,117 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
 	return 0;
 }
 
+static int pnv_pci_wait_for_pending(struct pci_dn *pdn, int pos, u16 mask)
+{
+	int i;
+
+	/* Wait for the Transaction Pending bit to clear */
+	for (i = 0; i < 4; i++) {
+		u32 status;
+		if (i)
+			msleep((1 << (i - 1)) * 100);
+
+		eeh_ops->read_config(pdn, pos, 2, &status);
+		if (!(status & mask))
+			return 1;
+	}
+
+	return 0;
+}
+
+static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
+{
+	u32 cap;
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+
+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &cap);
+	if (!(cap & PCI_EXP_DEVCAP_FLR))
+		return -ENOTTY;
+
+	if (!pnv_pci_wait_for_pending(pdn, edev->pcie_cap + PCI_EXP_DEVSTA,
+			PCI_EXP_DEVSTA_TRPND))
+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending "
+		       "transaction; performing function level reset anyway\n",
+			edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+
+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, &cap);
+	if (option == EEH_RESET_DEACTIVATE)
+		cap &= ~PCI_EXP_DEVCTL_BCR_FLR;
+	else
+		cap |= PCI_EXP_DEVCTL_BCR_FLR;
+	eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, cap);
+	msleep(100);
+	return 0;
+}
+
+static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
+{
+	u32 cap;
+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
+
+	if (!edev->af_cap)
+		return -ENOTTY;
+
+	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
+	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
+		return -ENOTTY;
+
+	/*
+	 * Wait for Transaction Pending bit to clear.  A word-aligned test
+	 * is used, so we use the control offset rather than status and shift
+	 * the test bit to match.
+	 */
+	if (!pnv_pci_wait_for_pending(pdn, edev->af_cap + PCI_AF_CTRL,
+				 PCI_AF_STATUS_TP << 8))
+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending "
+		    "transaction; performing AF function level reset anyway\n",
+			edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+
+	if (option == EEH_RESET_DEACTIVATE)
+		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, 0);
+	else
+		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1,
+				PCI_AF_CTRL_FLR);
+	msleep(100);
+	return 0;
+}
+
+static int pnv_eeh_reset_vf(struct pci_dn *pdn, int option)
+{
+	int rc;
+
+	might_sleep();
+
+	rc = pnv_eeh_do_flr(pdn, option);
+	if (rc != -ENOTTY)
+		goto done;
+
+	rc = pnv_eeh_do_af_flr(pdn, option);
+	if (rc != -ENOTTY)
+		goto done;
+
+done:
+	return rc;
+}
+
+static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
+{
+	struct eeh_dev *edev, *tmp;
+	struct pci_dn *pdn;
+	int ret = 0;
+
+	eeh_pe_for_each_dev(pe, edev, tmp) {
+		pdn = eeh_dev_to_pdn(edev);
+		ret |= pnv_eeh_reset_vf(pdn, option);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
 void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	struct pci_controller *hose;
@@ -966,7 +1078,9 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 		}
 
 		bus = eeh_pe_bus_get(pe);
-		if (pci_is_root_bus(bus) ||
+		if (pe->type & EEH_PE_VF)
+			ret = pnv_eeh_vf_pe_reset(pe, option);
+		else if (pci_is_root_bus(bus) ||
 			pci_is_root_bus(bus->parent))
 			ret = pnv_eeh_root_reset(hose, option);
 		else
@@ -1106,6 +1220,13 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
 	if (!edev || !edev->pe)
 		return false;
 
+	/*
+	 * For VF's reset operation, we need to rely on the kernel to
+	 * do those PCI config operations since firmware isn't aware of VFs.
+	 */
+	if ((edev->physfn) && (edev->pe->state & EEH_PE_RESET))
+		return false;
+
 	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
 		return true;
 
-- 
1.7.9.5


* [PATCH V4 08/11] powerpc/powernv: Support PCI config restore for VFs
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

Since skiboot firmware is not aware of VFs, the restore action for VF
should be done in kernel.

The patch introduces function pnv_eeh_restore_vf_config() to restore PCI
config space for VFs after reset.
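
As a worked example of the MPS arithmetic used in the restore path, the
userspace sketch below encodes a payload size in bytes into bits 7:5 of the
Device Control register. Since 128 = 1 << 7, ffs(128) is 8 and the encoded
value is 0. DEVCTL_PAYLOAD_MASK, mps_to_devctl() and devctl_set_mps() are
hypothetical names for this illustration; the mask value matches
PCI_EXP_DEVCTL_PAYLOAD (0x00e0).

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/* Bits 7:5 of PCIe Device Control hold the Max Payload Size encoding. */
#define DEVCTL_PAYLOAD_MASK	0x00e0

/* Encode an MPS in bytes (128, 256, ..., 4096) into the DEVCTL field. */
static unsigned int mps_to_devctl(int mps_bytes)
{
	/* Illustrative precondition: a power of two in the valid range. */
	assert(mps_bytes >= 128 && mps_bytes <= 4096 &&
	       (mps_bytes & (mps_bytes - 1)) == 0);
	return (unsigned int)(ffs(mps_bytes) - 8) << 5;
}

/* Replace only the MPS field of a Device Control value. */
static unsigned int devctl_set_mps(unsigned int devctl, int mps_bytes)
{
	devctl &= ~DEVCTL_PAYLOAD_MASK;
	return devctl | mps_to_devctl(mps_bytes);
}
```

For instance, 256 bytes encodes to 0x20 and 4096 bytes to 0xa0, while the
other Device Control bits are left untouched.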

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci-bridge.h        |    1 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |   59 +++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci.c         |   16 +++++++
 3 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index d78afe4..168b991 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -205,6 +205,7 @@ struct pci_dn {
 	int     m64_per_iov;
 #define IODA_INVALID_M64        (-1)
 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+	int	mps;
 #endif /* CONFIG_PCI_IOV */
 #endif
 	struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 61f1a55..e200ed1 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1601,6 +1601,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 	return ret;
 }
 
+#ifdef CONFIG_PCI_IOV
+static int pnv_eeh_restore_vf_config(struct pci_dn *pdn)
+{
+	int pcie_cap, aer_cap, old_mps;
+	u32 devctl, cmd, cap2, aer_capctl;
+
+	/* Restore MPS */
+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
+	if (pcie_cap) {
+		old_mps = (ffs(pdn->mps) - 8) << 5;
+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
+		devctl |= old_mps;
+		eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
+	}
+
+	/* Disable Completion Timeout */
+	if (pcie_cap) {
+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
+		if (cap2 & 0x10) {
+			eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
+			cap2 |= 0x10;
+			eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
+		}
+	}
+
+	/* Enable SERR and parity checking */
+	eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
+	eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
+
+	/* Enable report various errors */
+	if (pcie_cap) {
+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
+		devctl &= ~PCI_EXP_DEVCTL_CERE;
+		devctl |= (PCI_EXP_DEVCTL_NFERE |
+			   PCI_EXP_DEVCTL_FERE |
+			   PCI_EXP_DEVCTL_URRE);
+		eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
+	}
+
+	/* Enable ECRC generation and check */
+	aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
+	if (pcie_cap && aer_cap) {
+		eeh_ops->read_config(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
+		eeh_ops->write_config(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
+	}
+
+	return 0;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static int pnv_eeh_restore_config(struct pci_dn *pdn)
 {
 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
@@ -1611,7 +1664,11 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
 		return -EEXIST;
 
 	phb = edev->phb->private_data;
-	ret = opal_pci_reinit(phb->opal_id,
+	/* FW is not VF aware, we rely on OS to restore it */
+	if (edev->physfn)
+		ret = pnv_eeh_restore_vf_config(pdn);
+	else
+		ret = opal_pci_reinit(phb->opal_id,
 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
 	if (ret) {
 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index bca2aeb..31d0258 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -781,3 +781,19 @@ machine_subsys_initcall_sync(powernv, tce_iommu_bus_notifier_init);
 struct pci_controller_ops pnv_pci_controller_ops = {
 	.dma_dev_setup = pnv_pci_dma_dev_setup,
 };
+
+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
+{
+	struct pci_dn *pdn = pci_get_pdn(pdev);
+	int parent_mps;
+
+	if (!pdev->is_virtfn)
+		return;
+
+	/* Synchronize MPS for VF and PF */
+	parent_mps = pcie_get_mps(pdev->physfn);
+	if ((128 << pdev->pcie_mpss) >= parent_mps)
+		pcie_set_mps(pdev, parent_mps);
+	pdn->mps = pcie_get_mps(pdev);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);
-- 
1.7.9.5


* [PATCH V4 09/11] powerpc/eeh: handle VF PE properly
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

Compared with a Bus PE, a VF PE contains just a single PCI function. This
changes how errors are handled on a VF PE.

For example, in the hotplug case EEH needs to remove and re-create the VF
properly. When the PF's error_detected() disables SR-IOV, the VFs are
re-created in a fresh state, so this patch adds a flag to the VF's eeh_dev
that lets EEH skip slot_reset() and resume() for them. Since the FW is not
aware of VFs, this patch handles the VF restore/reset in the kernel directly.

This patch handles the VF PE properly in these cases.
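
The effect of the flag can be modelled in userspace as below. This is a
simplified sketch, not the kernel code: mock_edev, the counters and all
function names are hypothetical, and the driver callbacks are reduced to
counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an eeh_dev carrying the new error flag. */
struct mock_edev {
	bool in_error;	/* set once error_detected() has been reported */
	int resets;	/* how many times slot_reset() would have run */
	int resumes;	/* how many times resume() would have run */
};

/* error_detected() phase: mark the device as being in error. */
static void mock_report_error(struct mock_edev *e)
{
	e->in_error = true;
}

/* Models a VF being removed and re-created in a fresh state. */
static void mock_recreate(struct mock_edev *e)
{
	e->in_error = false;
}

/* slot_reset() phase: skipped for devices that are not in error. */
static void mock_report_reset(struct mock_edev *e)
{
	if (!e->in_error)
		return;
	e->resets++;
}

/* resume() phase: skipped likewise; the flag is cleared in either case. */
static void mock_report_resume(struct mock_edev *e)
{
	if (e->in_error)
		e->resumes++;
	e->in_error = false;
}
```

A device that stays present sees the full error, reset, resume sequence,
whereas a VF re-created in between sees neither slot_reset() nor resume().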

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/eeh.h   |    1 +
 arch/powerpc/kernel/eeh.c        |    1 +
 arch/powerpc/kernel/eeh_driver.c |  103 ++++++++++++++++++++++++++++++--------
 arch/powerpc/kernel/eeh_pe.c     |    3 +-
 4 files changed, 85 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 3d64cf3..d24382c 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -140,6 +140,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	int    in_error;		/* Error flag for eeh_dev	*/
 	struct pci_dev *physfn;		/* Associated PF PORT		*/
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 221e280..077c3d1 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1226,6 +1226,7 @@ void eeh_remove_device(struct pci_dev *dev)
 	 * from the parent PE during the BAR resotre.
 	 */
 	edev->pdev = NULL;
+	edev->in_error = 0;
 	dev->dev.archdata.edev = NULL;
 	if (!(edev->pe->state & EEH_PE_KEEP))
 		eeh_rmv_from_parent_pe(edev);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 89eb4bc..292089e 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -211,6 +211,7 @@ static void *eeh_report_error(void *data, void *userdata)
 	if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
 	if (*res == PCI_ERS_RESULT_NONE) *res = rc;
 
+	edev->in_error = 1;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -282,7 +283,8 @@ static void *eeh_report_reset(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->slot_reset ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		eeh_pcid_put(dev);
 		return NULL;
 	}
@@ -339,14 +341,16 @@ static void *eeh_report_resume(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->resume ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		edev->mode &= ~EEH_DEV_NO_HANDLER;
-		eeh_pcid_put(dev);
-		return NULL;
+		goto out;
 	}
 
 	driver->err_handler->resume(dev);
 
+out:
+	edev->in_error = 0;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -386,12 +390,40 @@ static void *eeh_report_failure(void *data, void *userdata)
 	return NULL;
 }
 
+#ifdef CONFIG_PCI_IOV
+static void *eeh_add_virt_device(void *data, void *userdata)
+{
+	struct pci_driver *driver;
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	if (!(edev->physfn)) {
+		pr_warn("%s: EEH dev %04x:%02x:%02x:%01x not for VF\n",
+			__func__, edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+		return NULL;
+	}
+
+	driver = eeh_pcid_get(dev);
+	if (driver) {
+		eeh_pcid_put(dev);
+		if (driver->err_handler)
+			return NULL;
+	}
+
+	pci_iov_virtfn_add(edev->physfn, pdn->vf_index, 0);
+	return NULL;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static void *eeh_rmv_device(void *data, void *userdata)
 {
 	struct pci_driver *driver;
 	struct eeh_dev *edev = (struct eeh_dev *)data;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	int *removed = (int *)userdata;
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 
 	/*
 	 * Actually, we should remove the PCI bridges as well.
@@ -416,7 +448,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
 	driver = eeh_pcid_get(dev);
 	if (driver) {
 		eeh_pcid_put(dev);
-		if (driver->err_handler)
+		if (removed && driver->err_handler)
 			return NULL;
 	}
 
@@ -425,11 +457,18 @@ static void *eeh_rmv_device(void *data, void *userdata)
 		 pci_name(dev));
 	edev->bus = dev->bus;
 	edev->mode |= EEH_DEV_DISCONNECTED;
-	(*removed)++;
-
-	pci_lock_rescan_remove();
-	pci_stop_and_remove_bus_device(dev);
-	pci_unlock_rescan_remove();
+	if (removed)
+		(*removed)++;
+
+	if (edev->physfn) {
+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
+		edev->pdev = NULL;
+		pdn->pe_number = IODA_INVALID_PE;
+	} else {
+		pci_lock_rescan_remove();
+		pci_stop_and_remove_bus_device(dev);
+		pci_unlock_rescan_remove();
+	}
 
 	return NULL;
 }
@@ -548,6 +587,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
 	struct timeval tstamp;
 	int cnt, rc, removed = 0;
+	struct eeh_dev *edev;
 
 	/* pcibios will clear the counter; save the value */
 	cnt = pe->freeze_count;
@@ -561,12 +601,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 */
 	eeh_pe_state_mark(pe, EEH_PE_KEEP);
 	if (bus) {
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(bus);
-		pci_unlock_rescan_remove();
-	} else if (frozen_bus) {
+		if (pe->type & EEH_PE_VF)
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+		else {
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(bus);
+			pci_unlock_rescan_remove();
+		}
+	} else if (frozen_bus)
 		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
-	}
 
 	/*
 	 * Reset the pci controller. (Asserts RST#; resets config space).
@@ -607,14 +650,26 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 		 * PE. We should disconnect it so the binding can be
 		 * rebuilt when adding PCI devices.
 		 */
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(bus);
+#ifdef CONFIG_PCI_IOV
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+#endif
+			pcibios_add_pci_devices(bus);
 	} else if (frozen_bus && removed) {
 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
 		ssleep(5);
 
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(frozen_bus);
+#ifdef CONFIG_PCI_IOV
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+#endif
+			pcibios_add_pci_devices(frozen_bus);
 	}
 	eeh_pe_state_clear(pe, EEH_PE_KEEP);
 
@@ -792,11 +847,15 @@ perm_error:
 	 * the their PCI config any more.
 	 */
 	if (frozen_bus) {
-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
-
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(frozen_bus);
-		pci_unlock_rescan_remove();
+		if (pe->type & EEH_PE_VF) {
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+		} else {
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(frozen_bus);
+			pci_unlock_rescan_remove();
+		}
 	}
 }
 
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 260a701..5cde950 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -914,7 +914,8 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
 	if (pe->type & EEH_PE_PHB) {
 		bus = pe->phb->bus;
 	} else if (pe->type & EEH_PE_BUS ||
-		   pe->type & EEH_PE_DEVICE) {
+		   pe->type & EEH_PE_DEVICE ||
+		   pe->type & EEH_PE_VF) {
 		if (pe->bus) {
 			bus = pe->bus;
 			goto out;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH V4 09/11] powerpc/eeh: handle VF PE properly
@ 2015-05-15  5:46   ` Wei Yang
  0 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linux-pci, Wei Yang, linuxppc-dev

Unlike a Bus PE, a VF PE contains just a single PCI function, so error
handling on a VF PE differs from that on a Bus PE.

For example, in the hotplug case EEH needs to remove and re-create the VF
properly. When the PF's error_detected() disables SRIOV, this patch
introduces a flag that marks the eeh_dev of a VF so that slot_reset() and
resume() are skipped for it. Since the FW is not aware of the VF, this patch
handles the VF restore/reset in the kernel directly.

This patch handles the VF PE properly in these cases.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
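For review purposes, the in_error bookkeeping described above can be modelled
outside the kernel. The following is only a simplified user-space sketch:
struct model_edev and the model_*() helpers are hypothetical names, not kernel
API, and the model assumes that the only state that matters is the flag set in
eeh_report_error(), cleared in eeh_remove_device(), and tested in
eeh_report_reset() and eeh_report_resume().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct eeh_dev; only the new flag is modelled. */
struct model_edev {
	int in_error;		/* mirrors edev->in_error */
	int slot_reset_calls;	/* how often the driver's slot_reset() ran */
	int resume_calls;	/* how often the driver's resume() ran */
};

/* Mirrors eeh_report_error(): the device has seen error_detected(). */
static void model_report_error(struct model_edev *edev)
{
	assert(edev != NULL);
	edev->in_error = 1;
}

/* Mirrors eeh_remove_device(): a removed device loses its error state. */
static void model_remove_device(struct model_edev *edev)
{
	assert(edev != NULL);
	edev->in_error = 0;
}

/* Mirrors eeh_report_reset(): devices not in error are skipped. */
static void model_report_reset(struct model_edev *edev)
{
	assert(edev != NULL);
	if (!edev->in_error)
		return;
	edev->slot_reset_calls++;
}

/* Mirrors eeh_report_resume(): skip if not in error, always clear the flag. */
static void model_report_resume(struct model_edev *edev)
{
	assert(edev != NULL);
	if (edev->in_error)
		edev->resume_calls++;
	edev->in_error = 0;
}
```

In this model, a VF that goes through model_report_error() and is then removed
(as happens when the PF's error_detected() disables SRIOV) sees neither
slot_reset() nor resume(), while a device that stays present sees both exactly
once and ends with the flag cleared.
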
 arch/powerpc/include/asm/eeh.h   |    1 +
 arch/powerpc/kernel/eeh.c        |    1 +
 arch/powerpc/kernel/eeh_driver.c |  103 ++++++++++++++++++++++++++++++--------
 arch/powerpc/kernel/eeh_pe.c     |    3 +-
 4 files changed, 85 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 3d64cf3..d24382c 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -140,6 +140,7 @@ struct eeh_dev {
 	struct pci_controller *phb;	/* Associated PHB		*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
+	int    in_error;		/* Error flag for eeh_dev	*/
 	struct pci_dev *physfn;		/* Associated PF PORT		*/
 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 221e280..077c3d1 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1226,6 +1226,7 @@ void eeh_remove_device(struct pci_dev *dev)
 	 * from the parent PE during the BAR resotre.
 	 */
 	edev->pdev = NULL;
+	edev->in_error = 0;
 	dev->dev.archdata.edev = NULL;
 	if (!(edev->pe->state & EEH_PE_KEEP))
 		eeh_rmv_from_parent_pe(edev);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 89eb4bc..292089e 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -211,6 +211,7 @@ static void *eeh_report_error(void *data, void *userdata)
 	if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
 	if (*res == PCI_ERS_RESULT_NONE) *res = rc;
 
+	edev->in_error = 1;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -282,7 +283,8 @@ static void *eeh_report_reset(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->slot_reset ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		eeh_pcid_put(dev);
 		return NULL;
 	}
@@ -339,14 +341,16 @@ static void *eeh_report_resume(void *data, void *userdata)
 
 	if (!driver->err_handler ||
 	    !driver->err_handler->resume ||
-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
+	    (!edev->in_error)) {
 		edev->mode &= ~EEH_DEV_NO_HANDLER;
-		eeh_pcid_put(dev);
-		return NULL;
+		goto out;
 	}
 
 	driver->err_handler->resume(dev);
 
+out:
+	edev->in_error = 0;
 	eeh_pcid_put(dev);
 	return NULL;
 }
@@ -386,12 +390,40 @@ static void *eeh_report_failure(void *data, void *userdata)
 	return NULL;
 }
 
+#ifdef CONFIG_PCI_IOV
+static void *eeh_add_virt_device(void *data, void *userdata)
+{
+	struct pci_driver *driver;
+	struct eeh_dev *edev = (struct eeh_dev *)data;
+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
+
+	if (!(edev->physfn)) {
+		pr_warn("%s: EEH dev %04x:%02x:%02x:%01x not for VF\n",
+			__func__, edev->phb->global_number, pdn->busno,
+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
+		return NULL;
+	}
+
+	driver = eeh_pcid_get(dev);
+	if (driver) {
+		eeh_pcid_put(dev);
+		if (driver->err_handler)
+			return NULL;
+	}
+
+	pci_iov_virtfn_add(edev->physfn, pdn->vf_index, 0);
+	return NULL;
+}
+#endif /* CONFIG_PCI_IOV */
+
 static void *eeh_rmv_device(void *data, void *userdata)
 {
 	struct pci_driver *driver;
 	struct eeh_dev *edev = (struct eeh_dev *)data;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	int *removed = (int *)userdata;
+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 
 	/*
 	 * Actually, we should remove the PCI bridges as well.
@@ -416,7 +448,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
 	driver = eeh_pcid_get(dev);
 	if (driver) {
 		eeh_pcid_put(dev);
-		if (driver->err_handler)
+		if (removed && driver->err_handler)
 			return NULL;
 	}
 
@@ -425,11 +457,18 @@ static void *eeh_rmv_device(void *data, void *userdata)
 		 pci_name(dev));
 	edev->bus = dev->bus;
 	edev->mode |= EEH_DEV_DISCONNECTED;
-	(*removed)++;
-
-	pci_lock_rescan_remove();
-	pci_stop_and_remove_bus_device(dev);
-	pci_unlock_rescan_remove();
+	if (removed)
+		(*removed)++;
+
+	if (edev->physfn) {
+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
+		edev->pdev = NULL;
+		pdn->pe_number = IODA_INVALID_PE;
+	} else {
+		pci_lock_rescan_remove();
+		pci_stop_and_remove_bus_device(dev);
+		pci_unlock_rescan_remove();
+	}
 
 	return NULL;
 }
@@ -548,6 +587,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
 	struct timeval tstamp;
 	int cnt, rc, removed = 0;
+	struct eeh_dev *edev;
 
 	/* pcibios will clear the counter; save the value */
 	cnt = pe->freeze_count;
@@ -561,12 +601,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 	 */
 	eeh_pe_state_mark(pe, EEH_PE_KEEP);
 	if (bus) {
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(bus);
-		pci_unlock_rescan_remove();
-	} else if (frozen_bus) {
+		if (pe->type & EEH_PE_VF)
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+		else {
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(bus);
+			pci_unlock_rescan_remove();
+		}
+	} else if (frozen_bus)
 		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
-	}
 
 	/*
 	 * Reset the pci controller. (Asserts RST#; resets config space).
@@ -607,14 +650,26 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
 		 * PE. We should disconnect it so the binding can be
 		 * rebuilt when adding PCI devices.
 		 */
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(bus);
+#ifdef CONFIG_PCI_IOV
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+#endif
+			pcibios_add_pci_devices(bus);
 	} else if (frozen_bus && removed) {
 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
 		ssleep(5);
 
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		pcibios_add_pci_devices(frozen_bus);
+#ifdef CONFIG_PCI_IOV
+		if (pe->type & EEH_PE_VF)
+			eeh_add_virt_device(edev, NULL);
+		else
+#endif
+			pcibios_add_pci_devices(frozen_bus);
 	}
 	eeh_pe_state_clear(pe, EEH_PE_KEEP);
 
@@ -792,11 +847,15 @@ perm_error:
 	 * the their PCI config any more.
 	 */
 	if (frozen_bus) {
-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
-
-		pci_lock_rescan_remove();
-		pcibios_remove_pci_devices(frozen_bus);
-		pci_unlock_rescan_remove();
+		if (pe->type & EEH_PE_VF) {
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+		} else {
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+			pci_lock_rescan_remove();
+			pcibios_remove_pci_devices(frozen_bus);
+			pci_unlock_rescan_remove();
+		}
 	}
 }
 
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 260a701..5cde950 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -914,7 +914,8 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
 	if (pe->type & EEH_PE_PHB) {
 		bus = pe->phb->bus;
 	} else if (pe->type & EEH_PE_BUS ||
-		   pe->type & EEH_PE_DEVICE) {
+		   pe->type & EEH_PE_DEVICE ||
+		   pe->type & EEH_PE_VF) {
 		if (pe->bus) {
 			bus = pe->bus;
 			goto out;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH V4 10/11] powerpc/powernv: use "compound" as the child's list_head for compound PE
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

Commit 262af557dd75 ("powerpc/powernv: Enable M64 aperatus for PHB3")
introduced the concept of a compound PE, whose slave PEs are linked onto the
master PE's "slaves" list_head through their "list" field. However, that
field is normally used to link a PE onto phb->ioda.pe_list to indicate that
the PE is in use.

This patch introduces a separate field, "compound", to link those compound
PEs.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
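To illustrate why a second link field is needed: a single list_head can sit on
only one list at a time, so a PE that must appear both on the PHB-wide list
and on its master's "slaves" list needs one link per list. The following is a
user-space sketch under that assumption; struct model_list, struct model_pe
and the model_*() helpers are hypothetical, cut-down stand-ins for the kernel's
list_head, struct pnv_ioda_pe and list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space imitation of the kernel's circular doubly linked list. */
struct model_list {
	struct model_list *next, *prev;
};

static void model_list_init(struct model_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Append node before head, i.e. at the tail, like list_add_tail(). */
static void model_list_add_tail(struct model_list *node,
				struct model_list *head)
{
	assert(node != NULL && head != NULL);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Count the entries on a list, excluding the head itself. */
static size_t model_list_count(const struct model_list *head)
{
	const struct model_list *p;
	size_t n = 0;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Hypothetical, cut-down PE with one link per list it can be on. */
struct model_pe {
	struct model_list list;		/* link in the PHB-wide list of PEs */
	struct model_list compound;	/* link in the master's "slaves" list */
	struct model_list slaves;	/* list head, used by a master PE only */
};
```

With separate links, adding a slave through "compound" to the master's list
leaves its membership in the PHB-wide list, through "list", untouched.
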
 arch/powerpc/platforms/powernv/pci-ioda.c |    8 ++++----
 arch/powerpc/platforms/powernv/pci.h      |    1 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 920c252..843457b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -344,7 +344,7 @@ done:
 		} else {
 			pe->flags |= PNV_IODA_PE_SLAVE;
 			pe->master = master_pe;
-			list_add_tail(&pe->list, &master_pe->slaves);
+			list_add_tail(&pe->compound, &master_pe->slaves);
 		}
 	}
 
@@ -428,7 +428,7 @@ static void pnv_ioda_freeze_pe(struct pnv_phb *phb, int pe_no)
 	if (!(pe->flags & PNV_IODA_PE_MASTER))
 		return;
 
-	list_for_each_entry(slave, &pe->slaves, list) {
+	list_for_each_entry(slave, &pe->slaves, compound) {
 		rc = opal_pci_eeh_freeze_set(phb->opal_id,
 					     slave->pe_number,
 					     OPAL_EEH_ACTION_SET_FREEZE_ALL);
@@ -464,7 +464,7 @@ static int pnv_ioda_unfreeze_pe(struct pnv_phb *phb, int pe_no, int opt)
 		return 0;
 
 	/* Clear frozen state for slave PEs */
-	list_for_each_entry(slave, &pe->slaves, list) {
+	list_for_each_entry(slave, &pe->slaves, compound) {
 		rc = opal_pci_eeh_freeze_clear(phb->opal_id,
 					     slave->pe_number,
 					     opt);
@@ -516,7 +516,7 @@ static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
 	if (!(pe->flags & PNV_IODA_PE_MASTER))
 		return state;
 
-	list_for_each_entry(slave, &pe->slaves, list) {
+	list_for_each_entry(slave, &pe->slaves, compound) {
 		rc = opal_pci_eeh_freeze_status(phb->opal_id,
 						slave->pe_number,
 						&fstate,
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 070ee88..540ab1e 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -73,6 +73,7 @@ struct pnv_ioda_pe {
 	/* PEs in compound case */
 	struct pnv_ioda_pe	*master;
 	struct list_head	slaves;
+	struct list_head	compound;
 
 	/* Link in list of PE#s */
 	struct list_head	dma_link;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH V4 11/11] powerpc/powernv: compound PE for VFs
  2015-05-15  5:46 ` Wei Yang
@ 2015-05-15  5:46   ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  5:46 UTC (permalink / raw)
  To: gwshan, bhelgaas; +Cc: linuxppc-dev, linux-pci, Wei Yang

When the VF BAR size is larger than 64MB, VFs are grouped per M64 BAR, which
means the VFs in a group should form a compound PE.

This patch links those VF PEs into a compound PE in this case.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
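As a worked example of the grouping, the arithmetic used in
pnv_ioda_setup_vf_PE() can be reproduced in user space. This is only a sketch:
the model_*() helpers and MODEL_M64_PER_IOV are hypothetical names, and it
assumes that pdn->m64_per_iov equals M64_PER_IOV (4) in the large-BAR case and
that the first VF of each group becomes the master PE, as in the diff below.

```c
#include <assert.h>

#define MODEL_M64_PER_IOV 4	/* assumed equal to M64_PER_IOV in the series */

/* User-space stand-in for the kernel's roundup_pow_of_two(). */
static unsigned int model_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Number of VFs covered by one M64 BAR, i.e. the size of one group. */
static unsigned int model_vf_per_group(unsigned int num_vfs)
{
	return model_roundup_pow_of_two(num_vfs) / MODEL_M64_PER_IOV;
}

/* Index of the VF whose PE is the master of the group holding vf_index. */
static unsigned int model_master_vf(unsigned int num_vfs,
				    unsigned int vf_index)
{
	unsigned int per_group = model_vf_per_group(num_vfs);

	/* The sketch does not model num_vfs smaller than the group count. */
	assert(per_group > 0 && vf_index < num_vfs);
	return (vf_index / per_group) * per_group;
}
```

For instance, with 16 VFs each group holds 4 VFs, so VF 5 is a slave of the PE
of VF 4; with 6 VFs the count is rounded up to 8, giving groups of 2.
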
 arch/powerpc/platforms/powernv/pci-ioda.c |   27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 843457b..157305f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1345,6 +1345,7 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 				vf_index < (vf_group + 1) * vf_per_group &&
 				vf_index < num_vfs;
 				vf_index++)
+
 				for (vf_index1 = vf_group * vf_per_group;
 					vf_index1 < (vf_group + 1) * vf_per_group &&
 					vf_index1 < num_vfs;
@@ -1360,6 +1361,11 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 						__func__,
 						pdn->offset + vf_index1, rc);
 				}
+
+				/* Remove a Slave PE from Master PE */
+				pe = &phb->ioda.pe_array[pdn->offset + vf_index];
+				if (pe->flags & PNV_IODA_PE_SLAVE)
+					list_del(&pe->compound);
 	}
 
 	list_for_each_entry_safe(pe, pe_n, &phb->ioda.pe_list, list) {
@@ -1418,7 +1424,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 	struct pci_bus        *bus;
 	struct pci_controller *hose;
 	struct pnv_phb        *phb;
-	struct pnv_ioda_pe    *pe;
+	struct pnv_ioda_pe    *pe, *master_pe;
 	int                    pe_num;
 	u16                    vf_index;
 	struct pci_dn         *pdn;
@@ -1480,10 +1486,29 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 		vf_per_group = roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
 
 		for (vf_group = 0; vf_group < M64_PER_IOV; vf_group++) {
+			master_pe = NULL;
+
 			for (vf_index = vf_group * vf_per_group;
 			     vf_index < (vf_group + 1) * vf_per_group &&
 			     vf_index < num_vfs;
 			     vf_index++) {
+
+				/*
+				 * Figure out the master PE and put all slave
+				 * PEs to master PE's list.
+				 */
+				pe = &phb->ioda.pe_array[pdn->offset + vf_index];
+				if (!master_pe) {
+					pe->flags |= PNV_IODA_PE_MASTER;
+					INIT_LIST_HEAD(&pe->slaves);
+					master_pe = pe;
+				} else {
+					pe->flags |= PNV_IODA_PE_SLAVE;
+					pe->master = master_pe;
+					list_add_tail(&pe->compound,
+						&master_pe->slaves);
+				}
+
 				for (vf_index1 = vf_group * vf_per_group;
 				     vf_index1 < (vf_group + 1) * vf_per_group &&
 				     vf_index1 < num_vfs;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 01/11] pci/iov: rename and export virtfn_add/virtfn_remove
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  5:56     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  5:56 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:16PM +0800, Wei Yang wrote:
>During the EEH recovery, when a device's driver is not EEH aware or no
>driver is bound with a device, EEH core would do hotplug on this device.
>While it isn't feasible for a VF with usual hotplug procedure. During
>removal of a VF, virtual bus should be removed if necessary. During the
>re-creation, the pci_scan_slot() doesn't work on a VF.
>
>This patch exports two functions to handle the hotplug case for VF
>properly. They will be invoked when the EEH core does the hotplug case for
>VFs.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>

Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Thanks,
Gavin

>---
> drivers/pci/iov.c   |   10 +++++-----
> include/linux/pci.h |    2 ++
> 2 files changed, 7 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index 47daf2f..f353e6f 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -106,7 +106,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
> }
>
>-static int virtfn_add(struct pci_dev *dev, int id, int reset)
>+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset)
> {
> 	int i;
> 	int rc = -ENOMEM;
>@@ -181,7 +181,7 @@ failed:
> 	return rc;
> }
>
>-static void virtfn_remove(struct pci_dev *dev, int id, int reset)
>+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset)
> {
> 	struct pci_dev *virtfn;
>
>@@ -302,7 +302,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	}
>
> 	for (i = 0; i < initial; i++) {
>-		rc = virtfn_add(dev, i, 0);
>+		rc = pci_iov_virtfn_add(dev, i, 0);
> 		if (rc)
> 			goto failed;
> 	}
>@@ -314,7 +314,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>
> failed:
> 	for (j = 0; j < i; j++)
>-		virtfn_remove(dev, j, 0);
>+		pci_iov_virtfn_remove(dev, j, 0);
>
> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> 	pci_cfg_access_lock(dev);
>@@ -343,7 +343,7 @@ static void sriov_disable(struct pci_dev *dev)
> 		return;
>
> 	for (i = 0; i < iov->num_VFs; i++)
>-		virtfn_remove(dev, i, 0);
>+		pci_iov_virtfn_remove(dev, i, 0);
>
> 	pcibios_sriov_disable(dev);
>
>diff --git a/include/linux/pci.h b/include/linux/pci.h
>index 353db8d..94bacfa 100644
>--- a/include/linux/pci.h
>+++ b/include/linux/pci.h
>@@ -1679,6 +1679,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
>
> int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
> void pci_disable_sriov(struct pci_dev *dev);
>+int pci_iov_virtfn_add(struct pci_dev *dev, int id, int reset);
>+void pci_iov_virtfn_remove(struct pci_dev *dev, int id, int reset);
> int pci_num_vf(struct pci_dev *dev);
> int pci_vfs_assigned(struct pci_dev *dev);
> int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
>-- 
>1.7.9.5
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 02/11] powerpc/pci_dn: cache vf_index in pci_dn
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  5:57     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  5:57 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:17PM +0800, Wei Yang wrote:
>The patch caches the VF index in pci_dn, which can be used to calculate
>VF's bus, device and function number. Those information helps to locate
>the VF's PCI device instance when doing hotplug during EEH recovery if
>necessary.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>

Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Thanks,
Gavin


>---
> arch/powerpc/include/asm/pci-bridge.h |    1 +
> arch/powerpc/kernel/pci_dn.c          |    4 +++-
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>index 1811c44..d78afe4 100644
>--- a/arch/powerpc/include/asm/pci-bridge.h
>+++ b/arch/powerpc/include/asm/pci-bridge.h
>@@ -199,6 +199,7 @@ struct pci_dn {
> #ifdef CONFIG_PCI_IOV
> 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
> 	u16     num_vfs;		/* number of VFs enabled*/
>+	int     vf_index;		/* VF index in the PF */
> 	int     offset;			/* PE# for the first VF PE */
> #define M64_PER_IOV 4
> 	int     m64_per_iov;
>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>index b3b4df9..f771130 100644
>--- a/arch/powerpc/kernel/pci_dn.c
>+++ b/arch/powerpc/kernel/pci_dn.c
>@@ -139,6 +139,7 @@ struct pci_dn *pci_get_pdn(struct pci_dev *pdev)
> #ifdef CONFIG_PCI_IOV
> static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
> 					   struct pci_dev *pdev,
>+					   int vf_index,
> 					   int busno, int devfn)
> {
> 	struct pci_dn *pdn;
>@@ -157,6 +158,7 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
> 	pdn->parent = parent;
> 	pdn->busno = busno;
> 	pdn->devfn = devfn;
>+	pdn->vf_index = vf_index;
> #ifdef CONFIG_PPC_POWERNV
> 	pdn->pe_number = IODA_INVALID_PE;
> #endif
>@@ -196,7 +198,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
> 		return NULL;
>
> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>-		pdn = add_one_dev_pci_data(parent, NULL,
>+		pdn = add_one_dev_pci_data(parent, NULL, i,
> 					   pci_iov_virtfn_bus(pdev, i),
> 					   pci_iov_virtfn_devfn(pdev, i));
> 		if (!pdn) {
>-- 
>1.7.9.5
>
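
For reference, a minimal sketch of how a cached VF index can be turned back
into a bus/devfn pair, assuming the usual SR-IOV rule that a VF's routing ID
is the PF's routing ID plus First VF Offset plus index times VF Stride. The
struct and function names here are hypothetical stand-ins for illustration,
not the kernel's pci_iov_virtfn_bus()/pci_iov_virtfn_devfn().

```c
#include <stdint.h>

/* Hypothetical stand-in for the PF fields the calculation needs. */
struct pf_info {
	uint8_t  busno;      /* bus number of the PF */
	uint8_t  devfn;      /* device/function number of the PF */
	uint16_t vf_offset;  /* First VF Offset from the SR-IOV capability */
	uint16_t vf_stride;  /* VF Stride from the SR-IOV capability */
};

/* Routing ID of VF number vf_index, relative to the PF's bus. */
static unsigned int vf_rid(const struct pf_info *pf, int vf_index)
{
	return pf->devfn + pf->vf_offset + pf->vf_stride * vf_index;
}

/* Bus number: the PF's bus plus whatever overflows the 8-bit devfn. */
static uint8_t vf_busno(const struct pf_info *pf, int vf_index)
{
	return pf->busno + (vf_rid(pf, vf_index) >> 8);
}

/* Device/function number: the low 8 bits of the routing ID. */
static uint8_t vf_devfn(const struct pf_info *pf, int vf_index)
{
	return vf_rid(pf, vf_index) & 0xff;
}
```

With the index cached in pci_dn, both values can be recomputed from the PF
at recovery time without a live pci_dev for the VF.
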


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 05/11] powerpc/powernv: create/release eeh_dev for VF
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  6:19     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  6:19 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:20PM +0800, Wei Yang wrote:
>EEH on the powerpc platform needs an eeh_dev structure to track the PCI
>device status. Since VFs are created/released dynamically, a VF's eeh_dev is
>also created/released dynamically.
>
>This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
>and records the PF in edev->physfn so that the eeh_dev can be identified as a VF.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>

Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Provided the unnecessary line of code pointed out below is removed.

>---
> arch/powerpc/include/asm/eeh.h |    1 +
> arch/powerpc/kernel/eeh.c      |    4 ++++
> arch/powerpc/kernel/pci_dn.c   |   11 +++++++++++
> 3 files changed, 16 insertions(+)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index a52db28..1b3614d 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -138,6 +138,7 @@ struct eeh_dev {
> 	struct pci_controller *phb;	/* Associated PHB		*/
> 	struct pci_dn *pdn;		/* Associated PCI device node	*/
> 	struct pci_dev *pdev;		/* Associated PCI device	*/
>+	struct pci_dev *physfn;		/* Associated PF PORT		*/
> 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
> };
>
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 6c7ce1b..221e280 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1135,6 +1135,10 @@ void eeh_add_device_late(struct pci_dev *dev)
> 	}
>
> 	edev->pdev = dev;
>+#ifdef CONFIG_PCI_IOV
>+	if (dev->is_virtfn)
>+		edev->physfn = dev->physfn;
>+#endif
> 	dev->dev.archdata.edev = edev;
>
> 	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>index f771130..94806a4 100644
>--- a/arch/powerpc/kernel/pci_dn.c
>+++ b/arch/powerpc/kernel/pci_dn.c
>@@ -180,7 +180,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
> struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
> {
> #ifdef CONFIG_PCI_IOV
>+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
> 	struct pci_dn *parent, *pdn;
>+	struct eeh_dev *edev;
> 	int i;
>
> 	/* Only support IOV for now */
>@@ -206,6 +208,8 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
> 				 __func__, i);
> 			return NULL;
> 		}
>+		eeh_dev_init(pdn, hose);
>+		edev = pdn_to_eeh_dev(pdn);

Nothing is done with edev after it is fetched, so I think the last line of
this change isn't needed. Could you check and remove it if I'm correct?

Thanks,
Gavin

> 	}
> #endif /* CONFIG_PCI_IOV */
>
>@@ -254,10 +258,17 @@ void remove_dev_pci_data(struct pci_dev *pdev)
> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
> 		list_for_each_entry_safe(pdn, tmp,
> 			&parent->child_list, list) {
>+			struct eeh_dev *edev;
> 			if (pdn->busno != pci_iov_virtfn_bus(pdev, i) ||
> 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
> 				continue;
>
>+			edev = pdn_to_eeh_dev(pdn);
>+			if (edev) {
>+				pdn->edev = NULL;
>+				kfree(edev);
>+			}
>+
> 			if (!list_empty(&pdn->list))
> 				list_del(&pdn->list);
>
>-- 
>1.7.9.5
>
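
As an illustration of the create/release pairing discussed here, a minimal
user-space sketch follows. The *_sketch types and the edev_attach() and
edev_detach() helpers are hypothetical; they only model the back-pointer
handling (clear pdn->edev before freeing), not the real eeh_dev_init().

```c
#include <stdlib.h>

struct eeh_dev_sketch;

/* Hypothetical, reduced pci_dn: only the link to its eeh_dev. */
struct pci_dn_sketch {
	struct eeh_dev_sketch *edev;
};

/* Hypothetical, reduced eeh_dev: only the link back to its pci_dn. */
struct eeh_dev_sketch {
	struct pci_dn_sketch *pdn;
};

/* Allocate an eeh_dev and link it to the pci_dn in both directions. */
static struct eeh_dev_sketch *edev_attach(struct pci_dn_sketch *pdn)
{
	struct eeh_dev_sketch *edev = calloc(1, sizeof(*edev));

	if (!edev)
		return NULL;
	edev->pdn = pdn;
	pdn->edev = edev;
	return edev;
}

/* Unlink first, then free, so no dangling pointer is left in the pci_dn. */
static void edev_detach(struct pci_dn_sketch *pdn)
{
	struct eeh_dev_sketch *edev = pdn->edev;

	if (!edev)
		return;
	pdn->edev = NULL;
	free(edev);
}
```

Calling edev_detach() twice on the same node is harmless, which matches the
NULL check in the removal path of the patch.
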


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 06/11] powerpc/eeh: create EEH_PE_VF for VF PE
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  6:26     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  6:26 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:21PM +0800, Wei Yang wrote:
>On the powernv platform, a VF PE is a special PE which is different from the
>Bus PE.  On the EEH side, a corresponding concept is needed to handle the VF
>PE properly. For example, we need to create the VF PE when the VF's pci_dev is
>initialized in the kernel, and add a flag to mark it as a VF PE.
>
>This patch introduces the EEH_PE_VF type for VF PE and creates it for a VF.
>At the same time, it creates the sysfs and address cache for VF PE at PCI
>device final fixup time.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>

Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

With one thing fixed as below.

>---
> arch/powerpc/include/asm/eeh.h               |    1 +
> arch/powerpc/kernel/eeh_pe.c                 |   10 ++++++++--
> arch/powerpc/platforms/powernv/eeh-powernv.c |   14 ++++++++++++++
> 3 files changed, 23 insertions(+), 2 deletions(-)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index 1b3614d..c1fde48 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -70,6 +70,7 @@ struct pci_dn;
> #define EEH_PE_PHB	(1 << 1)	/* PHB PE    */
> #define EEH_PE_DEVICE 	(1 << 2)	/* Device PE */
> #define EEH_PE_BUS	(1 << 3)	/* Bus PE    */
>+#define EEH_PE_VF	(1 << 4)	/* VF PE     */
>
> #define EEH_PE_ISOLATED		(1 << 0)	/* Isolated PE		*/
> #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>index 35f0b62..260a701 100644
>--- a/arch/powerpc/kernel/eeh_pe.c
>+++ b/arch/powerpc/kernel/eeh_pe.c
>@@ -299,7 +299,10 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev)
> 	 * EEH device already having associated PE, but
> 	 * the direct parent EEH device doesn't have yet.
> 	 */
>-	pdn = pdn ? pdn->parent : NULL;
>+	if (edev->physfn)
>+		pdn = pci_get_pdn(edev->physfn);
>+	else
>+		pdn = pdn ? pdn->parent : NULL;
> 	while (pdn) {
> 		/* We're poking out of PCI territory */
> 		parent = pdn_to_eeh_dev(pdn);
>@@ -382,7 +385,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
> 	}
>
> 	/* Create a new EEH PE */
>-	pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
>+	if (edev->physfn)
>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_VF);
>+	else
>+		pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE);
> 	if (!pe) {
> 		pr_err("%s: out of memory!\n", __func__);
> 		return -ENOMEM;
>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>index 622f08c..31344a4 100644
>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>@@ -1540,3 +1540,17 @@ static int __init eeh_powernv_init(void)
> 	return ret;
> }
> machine_early_initcall(powernv, eeh_powernv_init);
>+
>+static void pnv_eeh_vf_final_fixup(struct pci_dev *pdev)
>+{
>+	/*
>+	 * The following operations will fail if VF's sysfs files aren't
>+	 * created or its resources aren't finalized.
>+	 */
>+	if (!pdev->is_virtfn)
>+		return;
>+
>+	eeh_add_device_late(pdev);
>+	eeh_sysfs_add_device(pdev);

It's worth having the following code to make the logic here complete. Otherwise,
we will run into problems quickly once eeh_add_device_{early,late}() are
changed in eeh.c:

	eeh_add_device_early(pdn);
	eeh_add_device_late(pdev);
	eeh_sysfs_add_device(pdev);

Thanks,
Gavin

>+}
>+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, pnv_eeh_vf_final_fixup);
>-- 
>1.7.9.5
>
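
The PE type selection in this patch can be sketched as follows. This is a
simplified user-space model: struct edev_sketch and pe_type_for() are
hypothetical names, and the two constants only mirror the EEH_PE_DEVICE and
EEH_PE_VF bit values shown in the diff above.

```c
#include <stddef.h>

/* Mirror the bit values of EEH_PE_DEVICE and EEH_PE_VF from the diff. */
#define PE_TYPE_DEVICE	(1 << 2)
#define PE_TYPE_VF	(1 << 4)

/* Hypothetical, reduced eeh_dev: physfn is non-NULL only for a VF. */
struct edev_sketch {
	const void *physfn;
};

/* A device that has an associated PF gets a VF PE, anything else a Device PE. */
static int pe_type_for(const struct edev_sketch *edev)
{
	return edev->physfn ? PE_TYPE_VF : PE_TYPE_DEVICE;
}
```

Keying the decision on physfn rather than on a separate flag means the VF
test and the PF lookup share one field.
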


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 07/11] powerpc/powernv: Support EEH reset for VFs
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  7:12     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  7:12 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:22PM +0800, Wei Yang wrote:
>Before the VF PE was introduced, there was no method to reset an individual
>PCI function. And since skiboot firmware is not aware of VFs, the VF's reset
>should be done in the kernel.
>
>This patch introduces a function pnv_eeh_vf_pe_reset() to issue an FLR or
>AF FLR to a VF.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/eeh.h               |    1 +
> arch/powerpc/platforms/powernv/eeh-powernv.c |  123 +++++++++++++++++++++++++-
> 2 files changed, 123 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index c1fde48..3d64cf3 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -134,6 +134,7 @@ struct eeh_dev {
> 	int pcix_cap;			/* Saved PCIx capability	*/
> 	int pcie_cap;			/* Saved PCIe capability	*/
> 	int aer_cap;			/* Saved AER capability		*/
>+	int af_cap;			/* Saved AF capability		*/
> 	struct eeh_pe *pe;		/* Associated PE		*/
> 	struct list_head list;		/* Form link list in the PE	*/
> 	struct pci_controller *phb;	/* Associated PHB		*/
>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>index 31344a4..61f1a55 100644
>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>@@ -402,6 +402,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
> 	edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX);
> 	edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
> 	edev->aer_cap  = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>+	edev->af_cap   = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF);
> 	if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) {
> 		edev->mode |= EEH_DEV_BRIDGE;
> 		if (edev->pcie_cap) {
>@@ -891,6 +892,117 @@ static int pnv_eeh_bridge_reset(struct pci_dev *dev, int option)
> 	return 0;
> }
>
>+static int pnv_pci_wait_for_pending(struct pci_dn *pdn, int pos, u16 mask)

Could you rename this function and have it return bool, as below?

static bool pnv_eeh_wait_for_pending(struct pci_dn *pdn, int pos, u16 mask)

>+{
>+	int i;

	u32 status;
	int i;

With "status" declared here, the "u32 status" inside the loop below isn't needed.

>+
>+	/* Wait for Transaction Pending bit to clear */
>+	for (i = 0; i < 4; i++) {
>+		u32 status;
>+		if (i)
>+			msleep((1 << (i - 1)) * 100);
>+
>+		eeh_ops->read_config(pdn, pos, 2, &status);
>+		if (!(status & mask))
>+			return 1;
>+	}
>+
>+	return 0;
>+}
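
A minimal sketch of the polling loop with the bool return type suggested
above, written so it can be exercised in user space. The callback types,
struct fake_dev and the fake_*() helpers are hypothetical; they stand in for
eeh_ops->read_config() and msleep(), which are not available outside the
kernel.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t (*read_status_fn)(void *ctx);
typedef void (*sleep_ms_fn)(void *ctx, unsigned int ms);

/*
 * Poll up to four times, sleeping 100, 200 and 400 ms between polls.
 * Returns true once no bit in mask is set, false if it never clears.
 */
static bool wait_for_pending(read_status_fn read_status, sleep_ms_fn sleep_ms,
			     void *ctx, uint16_t mask)
{
	uint32_t status;
	int i;

	for (i = 0; i < 4; i++) {
		if (i)
			sleep_ms(ctx, (1u << (i - 1)) * 100);

		status = read_status(ctx);
		if (!(status & mask))
			return true;
	}

	return false;
}

/* Fake device: reports bit 0x20 as pending for a given number of reads. */
struct fake_dev {
	int reads_until_clear;
	unsigned int slept_ms;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->reads_until_clear > 0) {
		d->reads_until_clear--;
		return 0x20;
	}
	return 0;
}

/* Record the requested delay instead of actually sleeping. */
static void fake_sleep(void *ctx, unsigned int ms)
{
	struct fake_dev *d = ctx;

	d->slept_ms += ms;
}
```

The worst case is four reads and 700 ms of accumulated delay before the
caller is told that the transaction is still pending.
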
>+
>+static int pnv_eeh_do_flr(struct pci_dn *pdn, int option)
>+{
>+	u32 cap;
>+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>+

It's worth checking whether the device has the PCIe capability. This function
is used by VFs, which always have it, but it could still be called for a
device without one.

>+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &cap);
>+	if (!(cap & PCI_EXP_DEVCAP_FLR))
>+		return -ENOTTY;
>+
>+	if (!pnv_pci_wait_for_pending(pdn, edev->pcie_cap + PCI_EXP_DEVSTA,
>+			PCI_EXP_DEVSTA_TRPND))
>+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending "
>+		       "transaction; performing function level reset anyway\n",
>+			edev->phb->global_number, pdn->busno,
>+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));

Please print the function name and simplify the log into the following one.
Also, the separator between the device and function numbers should be ".",
not ":".

		pr_warn("%s: Pending transaction while issuing FLR to "
			"%04x:%02x:%02x.%01x\n",
			__func__, .....);
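
To illustrate the requested address format, a minimal user-space sketch
follows. format_bdf() and the SLOT()/FUNC() macros are hypothetical names;
the macros mirror what PCI_SLOT() and PCI_FUNC() compute, and snprintf()
stands in for pr_warn().

```c
#include <stdio.h>

/* Device number is bits 7:3 of devfn, function number is bits 2:0. */
#define SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define FUNC(devfn)	((devfn) & 0x07)

/* Format as domain:bus:device.function, with "." before the function. */
static int format_bdf(char *buf, size_t len, unsigned int domain,
		      unsigned int busno, unsigned int devfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%01x",
			domain, busno, SLOT(devfn), FUNC(devfn));
}
```

For domain 1, bus 2 and devfn 0x1c (device 3, function 4) this yields
"0001:02:03.4", the same notation lspci uses.
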
>+
>+	eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, &cap);
>+	if (option == EEH_RESET_DEACTIVATE)
>+		cap &= ~PCI_EXP_DEVCTL_BCR_FLR;
>+	else
>+		cap |= PCI_EXP_DEVCTL_BCR_FLR;
>+	eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, 4, cap);
>+	msleep(100);

The hold and stabilization delays have been standardized in EEH as below, so
please use them instead of the fixed 100ms:

EEH_PE_RST_HOLD_TIME         - After asserting reset
EEH_PE_RST_SETTLE_TIME       - After deasserting reset
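
A sketch of how the delay could be picked from the reset option. The enum,
the RST_*_MS macros and reset_delay_ms() are hypothetical names for
illustration, and the millisecond values are placeholders; the real
constants are the EEH_PE_RST_* definitions in asm/eeh.h.

```c
/* Placeholder values; see EEH_PE_RST_HOLD_TIME/SETTLE_TIME in asm/eeh.h. */
#define RST_HOLD_MS	250
#define RST_SETTLE_MS	1800

/* Hypothetical stand-ins for the EEH_RESET_* options. */
enum reset_option {
	RESET_DEACTIVATE = 0,
	RESET_HOT = 1,
	RESET_FUNDAMENTAL = 3,
};

/* Hold time after asserting a reset, settle time after deasserting it. */
static unsigned int reset_delay_ms(enum reset_option option)
{
	return option == RESET_DEACTIVATE ? RST_SETTLE_MS : RST_HOLD_MS;
}
```

In the patch, the result would replace the argument of the two msleep(100)
calls.
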

>+	return 0;
>+}
>+
>+static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
>+{
>+	u32 cap;
>+	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>+
>+	if (!edev->af_cap)
>+		return -ENOTTY;
>+
>+	eeh_ops->read_config(pdn, edev->af_cap + PCI_AF_CAP, 1, &cap);
>+	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
>+		return -ENOTTY;
>+
>+	/*
>+	 * Wait for Transaction Pending bit to clear.  A word-aligned test
>+	 * is used, so we use the control offset rather than status and shift
>+	 * the test bit to match.
>+	 */
>+	if (!pnv_pci_wait_for_pending(pdn, edev->af_cap + PCI_AF_CTRL,
>+				 PCI_AF_STATUS_TP << 8))
>+		pr_err("%04x:%02x:%02x:%01x timed out waiting for pending "
>+		    "transaction; performing AF function level reset anyway\n",
>+			edev->phb->global_number, pdn->busno,
>+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));

Same as above.

>+
>+	if (option == EEH_RESET_DEACTIVATE)
>+		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1, 0);
>+	else
>+		eeh_ops->write_config(pdn, edev->af_cap + PCI_AF_CTRL, 1,
>+				PCI_AF_CTRL_FLR);
>+	msleep(100);

Same as above.

>+	return 0;
>+}
>+
>+static int pnv_eeh_reset_vf(struct pci_dn *pdn, int option)
>+{
>+	int rc;
>+
>+	might_sleep();

This can be removed safely. This function should only be called from the
"eehd" kernel thread, so we don't need the check.

>+
>+	rc = pnv_eeh_do_flr(pdn, option);
>+	if (rc != -ENOTTY)
>+		goto done;
>+
>+	rc = pnv_eeh_do_af_flr(pdn, option);
>+	if (rc != -ENOTTY)
>+		goto done;
>+
>+done:
>+	return rc;
>+}

You can avoid using the unnecessary label:

	rc = pnv_eeh_do_flr();
	if (!rc)
		return rc;

	rc = pnv_eeh_do_af_flr();
	return rc;
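
A self-contained sketch of the label-free fallback, under the assumption
that each reset method returns 0 on success and -ENOTTY when the device does
not support it. reset_vf(), reset_fn and the two stub methods are
hypothetical. The sketch keeps the patch's "rc != -ENOTTY" test, which
behaves the same as "!rc" as long as those are the only two return values.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*reset_fn)(void *dev, int option);

/* Try the PCIe FLR first and fall back to AF FLR only if it is unsupported. */
static int reset_vf(reset_fn do_flr, reset_fn do_af_flr, void *dev, int option)
{
	int rc;

	rc = do_flr(dev, option);
	if (rc != -ENOTTY)
		return rc;

	return do_af_flr(dev, option);
}

/* Stub method that always succeeds. */
static int flr_ok(void *dev, int option)
{
	(void)dev;
	(void)option;
	return 0;
}

/* Stub method that reports the reset type as unsupported. */
static int flr_unsupported(void *dev, int option)
{
	(void)dev;
	(void)option;
	return -ENOTTY;
}
```

If neither method is supported, the caller sees -ENOTTY from the second
attempt.
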

>+
>+static int pnv_eeh_vf_pe_reset(struct eeh_pe *pe, int option)
>+{
>+	struct eeh_dev *edev, *tmp;
>+	struct pci_dn *pdn;
>+	int ret = 0;
>+
>+	eeh_pe_for_each_dev(pe, edev, tmp) {
>+		pdn = eeh_dev_to_pdn(edev);
>+		ret |= pnv_eeh_reset_vf(pdn, option);
>+		if (ret)
>+			return ret;
>+	}
>+
>+	return ret;
>+}
>+
> void pnv_pci_reset_secondary_bus(struct pci_dev *dev)
> {
> 	struct pci_controller *hose;
>@@ -966,7 +1078,9 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
> 		}
>
> 		bus = eeh_pe_bus_get(pe);
>-		if (pci_is_root_bus(bus) ||
>+		if (pe->type & EEH_PE_VF)
>+			ret = pnv_eeh_vf_pe_reset(pe, option);
>+		else if (pci_is_root_bus(bus) ||
> 			pci_is_root_bus(bus->parent))
> 			ret = pnv_eeh_root_reset(hose, option);
> 		else
>@@ -1106,6 +1220,13 @@ static inline bool pnv_eeh_cfg_blocked(struct pci_dn *pdn)
> 	if (!edev || !edev->pe)
> 		return false;
>
>+	/*
>+	 * For VF's reset operation, we need to rely on the kernel to
>+	 * do those PCI config operations since firmware isn't aware of VFs.
>+	 */
>+	if ((edev->physfn) && (edev->pe->state & EEH_PE_RESET))
>+		return false;
>+
> 	if (edev->pe->state & EEH_PE_CFG_BLOCKED)
> 		return true;
>

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 08/11] powerpc/powernv: Support PCI config restore for VFs
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  7:27     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  7:27 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:23PM +0800, Wei Yang wrote:
>Since skiboot firmware is not aware of VFs, the restore action for VF
>should be done in kernel.
>
>The patch introduces function pnv_eeh_restore_vf_config() to restore PCI
>config space for VFs after reset.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/include/asm/pci-bridge.h        |    1 +
> arch/powerpc/platforms/powernv/eeh-powernv.c |   59 +++++++++++++++++++++++++-
> arch/powerpc/platforms/powernv/pci.c         |   16 +++++++
> 3 files changed, 75 insertions(+), 1 deletion(-)
>
>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>index d78afe4..168b991 100644
>--- a/arch/powerpc/include/asm/pci-bridge.h
>+++ b/arch/powerpc/include/asm/pci-bridge.h
>@@ -205,6 +205,7 @@ struct pci_dn {
> 	int     m64_per_iov;
> #define IODA_INVALID_M64        (-1)
> 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>+	int	mps;
> #endif /* CONFIG_PCI_IOV */
> #endif
> 	struct list_head child_list;
>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>index 61f1a55..e200ed1 100644
>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>@@ -1601,6 +1601,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
> 	return ret;
> }
>
>+#ifdef CONFIG_PCI_IOV
>+static int pnv_eeh_restore_vf_config(struct pci_dn *pdn)
>+{
>+	int pcie_cap, aer_cap, old_mps;
>+	u32 devctl, cmd, cap2, aer_capctl;
>+

It's worth checking whether the PCIe capability is valid.
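
For reference, the MPS restore in this function turns a byte count into the 3-bit Max_Payload_Size field of Device Control (bits 7:5). A rough userspace sketch of that arithmetic, with a guard for a missing PCIe capability, might look as follows. encode_mps() and restore_mps() are hypothetical helpers, and DEVCTL_PAYLOAD_MASK is a local copy of the PCI_EXP_DEVCTL_PAYLOAD value:

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

#define DEVCTL_PAYLOAD_MASK	0x00e0	/* mirrors PCI_EXP_DEVCTL_PAYLOAD */

/* Encode an MPS in bytes (128, 256, ..., 4096) into DEVCTL bits 7:5. */
static uint16_t encode_mps(int mps_bytes)
{
	/* ffs(128) == 8, so 128 bytes maps to field value 0. */
	return (uint16_t)((ffs(mps_bytes) - 8) << 5);
}

/*
 * Return the updated DEVCTL value, or the old value untouched when the
 * device has no PCIe capability (pcie_cap == 0).
 */
static uint16_t restore_mps(int pcie_cap, uint16_t devctl, int mps_bytes)
{
	if (!pcie_cap)
		return devctl;

	devctl &= (uint16_t)~DEVCTL_PAYLOAD_MASK;
	devctl |= encode_mps(mps_bytes);
	return devctl;
}
```

So 128 bytes encodes as 0x00, 256 as 0x20, and 4096 as 0xa0, all within the 0x00e0 mask.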

>+	/* Restore MPS */
>+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
>+	if (pcie_cap) {
>+		old_mps = (ffs(pdn->mps) - 8) << 5;
>+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>+		devctl |= old_mps;
>+		eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>+	}
>+
>+	/* Disable Completion Timeout */
>+	if (pcie_cap) {
>+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>+		if (cap2 & 0x10) {

There should be a macro for "0x10" in pci_regs.h. If so, please use it.

>+			eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>+			cap2 |= 0x10;
>+			eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>+		}
>+	}
>+
>+	/* Enable SERR and parity checking */
>+	eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>+	eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
>+
>+	/* Enable report various errors */
>+	if (pcie_cap) {
>+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>+			   PCI_EXP_DEVCTL_FERE |
>+			   PCI_EXP_DEVCTL_URRE);
>+		eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>+	}
>+
>+	/* Enable ECRC generation and check */
>+	if (pcie_cap) {
>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);

The AER cap should have been cached in eeh-powernv.c::pnv_eeh_probe(). As
with the PCIe cap, you need to check whether the AER cap is valid.
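
To show why the check matters: with aer_cap == 0, the read-modify-write would land on offset 0x18 of the standard config header instead of an AER register. Below is a minimal sketch against a toy config space. cfg[], cfg_read32(), cfg_write32() and enable_ecrc() are made up for illustration and only stand in for eeh_ops->read_config()/write_config(); the constants are local copies of the pci_regs.h values:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ERR_CAP_OFF	0x18	/* mirrors PCI_ERR_CAP */
#define ECRC_GENE	0x040	/* mirrors PCI_ERR_CAP_ECRC_GENE */
#define ECRC_CHKE	0x100	/* mirrors PCI_ERR_CAP_ECRC_CHKE */

/* Toy 4KB config space standing in for the real config accessors. */
static uint8_t cfg[4096];

static uint32_t cfg_read32(int off)
{
	uint32_t v;

	memcpy(&v, &cfg[off], sizeof(v));
	return v;
}

static void cfg_write32(int off, uint32_t v)
{
	memcpy(&cfg[off], &v, sizeof(v));
}

/* Returns 1 if ECRC was enabled, 0 if skipped because a cap is absent. */
static int enable_ecrc(int pcie_cap, int aer_cap)
{
	uint32_t ctl;

	if (!pcie_cap || !aer_cap)
		return 0;

	ctl = cfg_read32(aer_cap + ERR_CAP_OFF);
	ctl |= ECRC_GENE | ECRC_CHKE;
	cfg_write32(aer_cap + ERR_CAP_OFF, ctl);
	return 1;
}
```

In the real function the same guard would simply skip the ECRC block when either capability offset is zero.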

>+		eeh_ops->read_config(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>+		eeh_ops->write_config(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>+	}
>+
>+	return 0;
>+}
>+#endif /* CONFIG_PCI_IOV */
>+
> static int pnv_eeh_restore_config(struct pci_dn *pdn)
> {
> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>@@ -1611,7 +1664,11 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
> 		return -EEXIST;
>
> 	phb = edev->phb->private_data;
>-	ret = opal_pci_reinit(phb->opal_id,
>+	/* FW is not VF aware, we rely on OS to restore it */

Please change the comment to:

	/*
	 * We have to restore the PCI config space after reset since
	 * the firmware can't see SRIOV VFs.
	 */

>+	if (edev->physfn)
>+		ret = pnv_eeh_restore_vf_config(pdn);
>+	else
>+		ret = opal_pci_reinit(phb->opal_id,
> 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
> 	if (ret) {
> 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
>index bca2aeb..31d0258 100644
>--- a/arch/powerpc/platforms/powernv/pci.c
>+++ b/arch/powerpc/platforms/powernv/pci.c
>@@ -781,3 +781,19 @@ machine_subsys_initcall_sync(powernv, tce_iommu_bus_notifier_init);
> struct pci_controller_ops pnv_pci_controller_ops = {
> 	.dma_dev_setup = pnv_pci_dma_dev_setup,
> };
>+
>+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
>+{
>+	struct pci_dn *pdn = pci_get_pdn(pdev);
>+	int parent_mps;
>+
>+	if (!pdev->is_virtfn)
>+		return;
>+
>+	/* Synchronize MPS for VF and PF */
>+	parent_mps = pcie_get_mps(pdev->physfn);
>+	if ((128 << pdev->pcie_mpss) >= parent_mps)
>+		pcie_set_mps(pdev, parent_mps);
>+	pdn->mps = pcie_get_mps(pdev);
>+}
>+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 09/11] powerpc/eeh: handle VF PE properly
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  7:31     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  7:31 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:24PM +0800, Wei Yang wrote:
>Compared with Bus PE, VF PE just has one single pci function. This
>introduces the difference of error handling on a VF PE.
>
>For example in the hotplug case, EEH needs to remove and re-create the VF
>properly. In the case when PF's error_detected() disables SRIOV, this patch
>introduces a flag to mark the eeh_dev of a VF to avoid the slot_reset() and
>resume(). Since the FW is not aware of the VF, this patch handles the VF
>restore/reset in kernel directly.
>
>This patch is to handle the VF PE properly in these cases.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>

Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

With the following things fixed:

>---
> arch/powerpc/include/asm/eeh.h   |    1 +
> arch/powerpc/kernel/eeh.c        |    1 +
> arch/powerpc/kernel/eeh_driver.c |  103 ++++++++++++++++++++++++++++++--------
> arch/powerpc/kernel/eeh_pe.c     |    3 +-
> 4 files changed, 85 insertions(+), 23 deletions(-)
>
>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>index 3d64cf3..d24382c 100644
>--- a/arch/powerpc/include/asm/eeh.h
>+++ b/arch/powerpc/include/asm/eeh.h
>@@ -140,6 +140,7 @@ struct eeh_dev {
> 	struct pci_controller *phb;	/* Associated PHB		*/
> 	struct pci_dn *pdn;		/* Associated PCI device node	*/
> 	struct pci_dev *pdev;		/* Associated PCI device	*/
>+	int    in_error;		/* Error flag for eeh_dev	*/
> 	struct pci_dev *physfn;		/* Associated PF PORT		*/
> 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
> };
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index 221e280..077c3d1 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1226,6 +1226,7 @@ void eeh_remove_device(struct pci_dev *dev)
> 	 * from the parent PE during the BAR resotre.
> 	 */
> 	edev->pdev = NULL;
>+	edev->in_error = 0;

Could you please add detailed comments about the usage of "in_error" here?
I may look into removing it later. For now, you don't need to do that since
we're almost out of time.
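
As a starting point for such a comment, the lifecycle of the flag as it reads from the hunks in this patch is: set in eeh_report_error(), cleared in eeh_remove_device() and at the end of eeh_report_resume(), and checked before calling slot_reset() and resume(). A toy model of that, where struct toy_edev and the helpers are purely illustrative and not kernel code:

```c
#include <assert.h>

struct toy_edev {
	int in_error;	/* set while this device is being recovered */
	int resets;	/* number of slot_reset() callbacks delivered */
	int resumes;	/* number of resume() callbacks delivered */
};

/* error_detected() was reported: the device is now in error state. */
static void report_error(struct toy_edev *e)
{
	e->in_error = 1;
}

/* The device was removed (e.g. the PF disabled SRIOV): state is fresh. */
static void remove_device(struct toy_edev *e)
{
	e->in_error = 0;
}

/* slot_reset() is only delivered to devices still in error state. */
static void report_reset(struct toy_edev *e)
{
	if (!e->in_error)
		return;
	e->resets++;
}

/* resume() likewise; the flag is always cleared on the way out. */
static void report_resume(struct toy_edev *e)
{
	if (e->in_error)
		e->resumes++;
	e->in_error = 0;
}
```

A VF that was removed and re-created between error_detected() and slot_reset() therefore sees neither callback, while a device that stayed in place sees both.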

> 	dev->dev.archdata.edev = NULL;
> 	if (!(edev->pe->state & EEH_PE_KEEP))
> 		eeh_rmv_from_parent_pe(edev);
>diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>index 89eb4bc..292089e 100644
>--- a/arch/powerpc/kernel/eeh_driver.c
>+++ b/arch/powerpc/kernel/eeh_driver.c
>@@ -211,6 +211,7 @@ static void *eeh_report_error(void *data, void *userdata)
> 	if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc;
> 	if (*res == PCI_ERS_RESULT_NONE) *res = rc;
>
>+	edev->in_error = 1;
> 	eeh_pcid_put(dev);
> 	return NULL;
> }
>@@ -282,7 +283,8 @@ static void *eeh_report_reset(void *data, void *userdata)
>
> 	if (!driver->err_handler ||
> 	    !driver->err_handler->slot_reset ||
>-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
>+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
>+	    (!edev->in_error)) {
> 		eeh_pcid_put(dev);
> 		return NULL;
> 	}
>@@ -339,14 +341,16 @@ static void *eeh_report_resume(void *data, void *userdata)
>
> 	if (!driver->err_handler ||
> 	    !driver->err_handler->resume ||
>-	    (edev->mode & EEH_DEV_NO_HANDLER)) {
>+	    (edev->mode & EEH_DEV_NO_HANDLER) ||
>+	    (!edev->in_error)) {
> 		edev->mode &= ~EEH_DEV_NO_HANDLER;
>-		eeh_pcid_put(dev);
>-		return NULL;
>+		goto out;
> 	}
>
> 	driver->err_handler->resume(dev);
>
>+out:
>+	edev->in_error = 0;
> 	eeh_pcid_put(dev);
> 	return NULL;
> }
>@@ -386,12 +390,40 @@ static void *eeh_report_failure(void *data, void *userdata)
> 	return NULL;
> }
>
>+#ifdef CONFIG_PCI_IOV
>+static void *eeh_add_virt_device(void *data, void *userdata)
>+{
>+	struct pci_driver *driver;
>+	struct eeh_dev *edev = (struct eeh_dev *)data;
>+	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
>+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>+
>+	if (!(edev->physfn)) {
>+		pr_warn("%s: EEH dev %04x:%02x:%02x:%01x not for VF\n",
>+			__func__, edev->phb->global_number, pdn->busno,
>+			PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));

			    %04x:%02x:%02x.%01x

The separator between the device and function numbers is ".", not ":".
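
A quick userspace check of the suggested format string. format_bdf() is a hypothetical helper, and SLOT()/FUNC() are local macros using the same bit layout as the kernel's PCI_SLOT()/PCI_FUNC():

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same bit layout as the kernel's PCI_SLOT()/PCI_FUNC() macros. */
#define SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define FUNC(devfn)	((devfn) & 0x07)

/* Format a PCI address as domain:bus:device.function. */
static int format_bdf(char *buf, size_t len, unsigned int domain,
		      unsigned int bus, unsigned int devfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%01x",
			domain, bus, SLOT(devfn), FUNC(devfn));
}
```

For domain 1, bus 2, device 3, function 4 this yields "0001:02:03.4", which matches the form that pci_name() and lspci use.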

>+		return NULL;
>+	}
>+
>+	driver = eeh_pcid_get(dev);
>+	if (driver) {
>+		eeh_pcid_put(dev);
>+		if (driver->err_handler)
>+			return NULL;
>+	}
>+
>+	pci_iov_virtfn_add(edev->physfn, pdn->vf_index, 0);
>+	return NULL;
>+}
>+#endif /* CONFIG_PCI_IOV */
>+
> static void *eeh_rmv_device(void *data, void *userdata)
> {
> 	struct pci_driver *driver;
> 	struct eeh_dev *edev = (struct eeh_dev *)data;
> 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
> 	int *removed = (int *)userdata;
>+	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>
> 	/*
> 	 * Actually, we should remove the PCI bridges as well.
>@@ -416,7 +448,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
> 	driver = eeh_pcid_get(dev);
> 	if (driver) {
> 		eeh_pcid_put(dev);
>-		if (driver->err_handler)
>+		if (removed && driver->err_handler)
> 			return NULL;
> 	}
>
>@@ -425,11 +457,18 @@ static void *eeh_rmv_device(void *data, void *userdata)
> 		 pci_name(dev));
> 	edev->bus = dev->bus;
> 	edev->mode |= EEH_DEV_DISCONNECTED;
>-	(*removed)++;
>-
>-	pci_lock_rescan_remove();
>-	pci_stop_and_remove_bus_device(dev);
>-	pci_unlock_rescan_remove();
>+	if (removed)
>+		(*removed)++;
>+
>+	if (edev->physfn) {
>+		pci_iov_virtfn_remove(edev->physfn, pdn->vf_index, 0);
>+		edev->pdev = NULL;
>+		pdn->pe_number = IODA_INVALID_PE;
>+	} else {
>+		pci_lock_rescan_remove();
>+		pci_stop_and_remove_bus_device(dev);
>+		pci_unlock_rescan_remove();
>+	}
>
> 	return NULL;
> }
>@@ -548,6 +587,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
> 	struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
> 	struct timeval tstamp;
> 	int cnt, rc, removed = 0;
>+	struct eeh_dev *edev;
>
> 	/* pcibios will clear the counter; save the value */
> 	cnt = pe->freeze_count;
>@@ -561,12 +601,15 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
> 	 */
> 	eeh_pe_state_mark(pe, EEH_PE_KEEP);
> 	if (bus) {
>-		pci_lock_rescan_remove();
>-		pcibios_remove_pci_devices(bus);
>-		pci_unlock_rescan_remove();
>-	} else if (frozen_bus) {
>+		if (pe->type & EEH_PE_VF)
>+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
>+		else {
>+			pci_lock_rescan_remove();
>+			pcibios_remove_pci_devices(bus);
>+			pci_unlock_rescan_remove();
>+		}
>+	} else if (frozen_bus)
> 		eeh_pe_dev_traverse(pe, eeh_rmv_device, &removed);
>-	}
>
> 	/*
> 	 * Reset the pci controller. (Asserts RST#; resets config space).
>@@ -607,14 +650,26 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus)
> 		 * PE. We should disconnect it so the binding can be
> 		 * rebuilt when adding PCI devices.
> 		 */
>+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
> 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
>-		pcibios_add_pci_devices(bus);
>+#ifdef CONFIG_PCI_IOV
>+		if (pe->type & EEH_PE_VF)
>+			eeh_add_virt_device(edev, NULL);
>+		else
>+#endif
>+			pcibios_add_pci_devices(bus);
> 	} else if (frozen_bus && removed) {
> 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
> 		ssleep(5);
>
>+		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
> 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
>-		pcibios_add_pci_devices(frozen_bus);
>+#ifdef CONFIG_PCI_IOV
>+		if (pe->type & EEH_PE_VF)
>+			eeh_add_virt_device(edev, NULL);
>+		else
>+#endif
>+			pcibios_add_pci_devices(frozen_bus);
> 	}
> 	eeh_pe_state_clear(pe, EEH_PE_KEEP);
>
>@@ -792,11 +847,15 @@ perm_error:
> 	 * the their PCI config any more.
> 	 */
> 	if (frozen_bus) {
>-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>-
>-		pci_lock_rescan_remove();
>-		pcibios_remove_pci_devices(frozen_bus);
>-		pci_unlock_rescan_remove();
>+		if (pe->type & EEH_PE_VF) {
>+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
>+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>+		} else {
>+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>+			pci_lock_rescan_remove();
>+			pcibios_remove_pci_devices(frozen_bus);
>+			pci_unlock_rescan_remove();
>+		}
> 	}
> }
>
>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>index 260a701..5cde950 100644
>--- a/arch/powerpc/kernel/eeh_pe.c
>+++ b/arch/powerpc/kernel/eeh_pe.c
>@@ -914,7 +914,8 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
> 	if (pe->type & EEH_PE_PHB) {
> 		bus = pe->phb->bus;
> 	} else if (pe->type & EEH_PE_BUS ||
>-		   pe->type & EEH_PE_DEVICE) {
>+		   pe->type & EEH_PE_DEVICE ||
>+		   pe->type & EEH_PE_VF) {
> 		if (pe->bus) {
> 			bus = pe->bus;
> 			goto out;
>-- 
>1.7.9.5
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 10/11] powerpc/powernv: use "compound" as the child's list_head for compound PE
  2015-05-15  5:46   ` Wei Yang
@ 2015-05-15  7:37     ` Gavin Shan
  -1 siblings, 0 replies; 44+ messages in thread
From: Gavin Shan @ 2015-05-15  7:37 UTC (permalink / raw)
  To: Wei Yang; +Cc: gwshan, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 01:46:25PM +0800, Wei Yang wrote:
>Commit 262af557dd75(powerpc/powernv: Enable M64 aperatus for PHB3)
>introduces the concept of compound PE. The slave PEs are linked to the
>master PE's slaves list_head through the list field, while this field is
>usually used to link a PE to phb->ioda.pe_list to indicate that the PE
>is in use.
>
>This patch introduces a field "compound" to link those compound PEs.
>

I don't think we need it, given the following:

- VF PEs are classified into master and slave PEs.
- Master PEs are linked to phb->list.
- Slave PEs are linked to master->slaves.
- When iterating over all PEs, you have to check each PE's flags to cover
  all (master & slave) VF PEs:

  for_each_pe_in_phb_list {
         /* Things we're doing */
         if (pe_is_not_vf_pe ||
             pe_is_not_master_pe)
             continue;

         for_each_pe_in_master_vf_pe_list {
               /* slave VF PEs */
         }
  }
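
The iteration above can be sketched in plain C with a single link field per PE, which is the point of the argument: a slave PE is only ever on its master's slave list, never on the PHB list, so one link is enough. struct toy_pe, the PE_VF/PE_MASTER flags and count_vf_pes() are hypothetical and only model the idea with singly linked lists:

```c
#include <assert.h>
#include <stddef.h>

#define PE_VF		0x1	/* hypothetical flag: PE belongs to a VF */
#define PE_MASTER	0x2	/* hypothetical flag: PE heads a compound PE */

struct toy_pe {
	int flags;
	/*
	 * Single link: on the PHB list for master and standalone PEs,
	 * on the master's slave list for slave PEs.
	 */
	struct toy_pe *next;
	struct toy_pe *slaves;	/* first slave, masters only */
};

/* Count every VF PE reachable from the PHB list, slaves included. */
static int count_vf_pes(struct toy_pe *phb_list)
{
	struct toy_pe *pe, *slave;
	int n = 0;

	for (pe = phb_list; pe; pe = pe->next) {
		if (!(pe->flags & PE_VF))
			continue;
		n++;	/* the VF PE on the PHB list itself */

		if (!(pe->flags & PE_MASTER))
			continue;

		/* Slave VF PEs hang off the master only. */
		for (slave = pe->slaves; slave; slave = slave->next)
			n++;
	}
	return n;
}
```

Whether this maps cleanly onto struct pnv_ioda_pe depends on the slave PEs really never being added to phb->ioda.pe_list, which is the assumption made here.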

Thanks,
Gavin

>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/platforms/powernv/pci-ioda.c |    8 ++++----
> arch/powerpc/platforms/powernv/pci.h      |    1 +
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>index 920c252..843457b 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -344,7 +344,7 @@ done:
> 		} else {
> 			pe->flags |= PNV_IODA_PE_SLAVE;
> 			pe->master = master_pe;
>-			list_add_tail(&pe->list, &master_pe->slaves);
>+			list_add_tail(&pe->compound, &master_pe->slaves);
> 		}
> 	}
>
>@@ -428,7 +428,7 @@ static void pnv_ioda_freeze_pe(struct pnv_phb *phb, int pe_no)
> 	if (!(pe->flags & PNV_IODA_PE_MASTER))
> 		return;
>
>-	list_for_each_entry(slave, &pe->slaves, list) {
>+	list_for_each_entry(slave, &pe->slaves, compound) {
> 		rc = opal_pci_eeh_freeze_set(phb->opal_id,
> 					     slave->pe_number,
> 					     OPAL_EEH_ACTION_SET_FREEZE_ALL);
>@@ -464,7 +464,7 @@ static int pnv_ioda_unfreeze_pe(struct pnv_phb *phb, int pe_no, int opt)
> 		return 0;
>
> 	/* Clear frozen state for slave PEs */
>-	list_for_each_entry(slave, &pe->slaves, list) {
>+	list_for_each_entry(slave, &pe->slaves, compound) {
> 		rc = opal_pci_eeh_freeze_clear(phb->opal_id,
> 					     slave->pe_number,
> 					     opt);
>@@ -516,7 +516,7 @@ static int pnv_ioda_get_pe_state(struct pnv_phb *phb, int pe_no)
> 	if (!(pe->flags & PNV_IODA_PE_MASTER))
> 		return state;
>
>-	list_for_each_entry(slave, &pe->slaves, list) {
>+	list_for_each_entry(slave, &pe->slaves, compound) {
> 		rc = opal_pci_eeh_freeze_status(phb->opal_id,
> 						slave->pe_number,
> 						&fstate,
>diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>index 070ee88..540ab1e 100644
>--- a/arch/powerpc/platforms/powernv/pci.h
>+++ b/arch/powerpc/platforms/powernv/pci.h
>@@ -73,6 +73,7 @@ struct pnv_ioda_pe {
> 	/* PEs in compound case */
> 	struct pnv_ioda_pe	*master;
> 	struct list_head	slaves;
>+	struct list_head	compound;
>
> 	/* Link in list of PE#s */
> 	struct list_head	dma_link;
>-- 
>1.7.9.5
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 05/11] powerpc/powernv: create/release eeh_dev for VF
  2015-05-15  6:19     ` Gavin Shan
@ 2015-05-15  8:53       ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  8:53 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Wei Yang, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 04:19:16PM +1000, Gavin Shan wrote:
>On Fri, May 15, 2015 at 01:46:20PM +0800, Wei Yang wrote:
>>EEH on powerpc platform needs eeh_dev structure to track the PCI device
>>status. Since VFs are created/released dynamically, VF's eeh_dev is also
>>dynamically created/released in system.
>>
>>This patch creates/removes eeh_dev when pci_dn is created/removed for VFs,
>>and marks it with EEH_DEV_VF type.
>>
>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>
>Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
>After removing the unnecessary line of code as below.
>
>>---
>> arch/powerpc/include/asm/eeh.h |    1 +
>> arch/powerpc/kernel/eeh.c      |    4 ++++
>> arch/powerpc/kernel/pci_dn.c   |   11 +++++++++++
>> 3 files changed, 16 insertions(+)
>>
>>diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>>index a52db28..1b3614d 100644
>>--- a/arch/powerpc/include/asm/eeh.h
>>+++ b/arch/powerpc/include/asm/eeh.h
>>@@ -138,6 +138,7 @@ struct eeh_dev {
>> 	struct pci_controller *phb;	/* Associated PHB		*/
>> 	struct pci_dn *pdn;		/* Associated PCI device node	*/
>> 	struct pci_dev *pdev;		/* Associated PCI device	*/
>>+	struct pci_dev *physfn;		/* Associated PF PORT		*/
>> 	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
>> };
>>
>>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>>index 6c7ce1b..221e280 100644
>>--- a/arch/powerpc/kernel/eeh.c
>>+++ b/arch/powerpc/kernel/eeh.c
>>@@ -1135,6 +1135,10 @@ void eeh_add_device_late(struct pci_dev *dev)
>> 	}
>>
>> 	edev->pdev = dev;
>>+#ifdef CONFIG_PCI_IOV
>>+	if (dev->is_virtfn)
>>+		edev->physfn = dev->physfn;
>>+#endif
>> 	dev->dev.archdata.edev = edev;
>>
>> 	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>>diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
>>index f771130..94806a4 100644
>>--- a/arch/powerpc/kernel/pci_dn.c
>>+++ b/arch/powerpc/kernel/pci_dn.c
>>@@ -180,7 +180,9 @@ static struct pci_dn *add_one_dev_pci_data(struct pci_dn *parent,
>> struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
>> {
>> #ifdef CONFIG_PCI_IOV
>>+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
>> 	struct pci_dn *parent, *pdn;
>>+	struct eeh_dev *edev;
>> 	int i;
>>
>> 	/* Only support IOV for now */
>>@@ -206,6 +208,8 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev)
>> 				 __func__, i);
>> 			return NULL;
>> 		}
>>+		eeh_dev_init(pdn, hose);
>>+		edev = pdn_to_eeh_dev(pdn);
>
>Nothing is done to edev after getting it. So I think the last line of changes
>here isn't needed. Could you check and remove it if I'm correct?

You are right, removed.

>
>Thanks,
>Gavin
>
>> 	}
>> #endif /* CONFIG_PCI_IOV */
>>
>>@@ -254,10 +258,17 @@ void remove_dev_pci_data(struct pci_dev *pdev)
>> 	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>> 		list_for_each_entry_safe(pdn, tmp,
>> 			&parent->child_list, list) {
>>+			struct eeh_dev *edev;
>> 			if (pdn->busno != pci_iov_virtfn_bus(pdev, i) ||
>> 			    pdn->devfn != pci_iov_virtfn_devfn(pdev, i))
>> 				continue;
>>
>>+			edev = pdn_to_eeh_dev(pdn);
>>+			if (edev) {
>>+				pdn->edev = NULL;
>>+				kfree(edev);
>>+			}
>>+
>> 			if (!list_empty(&pdn->list))
>> 				list_del(&pdn->list);
>>
>>-- 
>>1.7.9.5
>>

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH V4 08/11] powerpc/powernv: Support PCI config restore for VFs
  2015-05-15  7:27     ` Gavin Shan
@ 2015-05-15  9:18       ` Wei Yang
  -1 siblings, 0 replies; 44+ messages in thread
From: Wei Yang @ 2015-05-15  9:18 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Wei Yang, bhelgaas, linuxppc-dev, linux-pci

On Fri, May 15, 2015 at 05:27:52PM +1000, Gavin Shan wrote:
>On Fri, May 15, 2015 at 01:46:23PM +0800, Wei Yang wrote:
>>Since skiboot firmware is not aware of VFs, the restore action for VF
>>should be done in kernel.
>>
>>The patch introduces function pnv_eeh_restore_vf_config() to restore PCI
>>config space for VFs after reset.
>>
>>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>>---
>> arch/powerpc/include/asm/pci-bridge.h        |    1 +
>> arch/powerpc/platforms/powernv/eeh-powernv.c |   59 +++++++++++++++++++++++++-
>> arch/powerpc/platforms/powernv/pci.c         |   16 +++++++
>> 3 files changed, 75 insertions(+), 1 deletion(-)
>>
>>diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>>index d78afe4..168b991 100644
>>--- a/arch/powerpc/include/asm/pci-bridge.h
>>+++ b/arch/powerpc/include/asm/pci-bridge.h
>>@@ -205,6 +205,7 @@ struct pci_dn {
>> 	int     m64_per_iov;
>> #define IODA_INVALID_M64        (-1)
>> 	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
>>+	int	mps;
>> #endif /* CONFIG_PCI_IOV */
>> #endif
>> 	struct list_head child_list;
>>diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>index 61f1a55..e200ed1 100644
>>--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
>>+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
>>@@ -1601,6 +1601,59 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
>> 	return ret;
>> }
>>
>>+#ifdef CONFIG_PCI_IOV
>>+static int pnv_eeh_restore_vf_config(struct pci_dn *pdn)
>>+{
>>+	int pcie_cap, aer_cap, old_mps;
>>+	u32 devctl, cmd, cap2, aer_capctl;
>>+
>
>It's worth checking whether the PCIe cap is valid.
>
>>+	/* Restore MPS */
>>+	pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP);
>>+	if (pcie_cap) {
>>+		old_mps = (ffs(pdn->mps) - 8) << 5;
>>+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>+		devctl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>>+		devctl |= old_mps;
>>+		eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>+	}
>>+
>>+	/* Disable Completion Timeout */
>>+	if (pcie_cap) {
>>+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCAP2, 4, &cap2);
>>+		if (cap2 & 0x10) {
>
>There should be a macro for "0x10" in pci_regs.h. If so, please use it.

Actually, no.

I have checked the kernel and it doesn't have a macro for this bit. That's
why I put the literal value here.
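
One way to avoid the bare 0x10 while no shared macro exists is to name the bit
locally. The sketch below is only an illustration in plain C: the two `VF_*`
macro names and `disable_completion_timeout()` are hypothetical and do not
come from pci_regs.h or from this patch. It relies on bit 4 of Device
Capabilities 2 meaning "Completion Timeout Disable Supported" and bit 4 of
Device Control 2 meaning "Completion Timeout Disable":

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical local names for bit 4 of the PCIe Device Capabilities 2 and
 * Device Control 2 registers; chosen for this sketch only.
 */
#define VF_DEVCAP2_COMP_TMOUT_DIS_SUP	0x00000010u	/* disable is supported */
#define VF_DEVCTL2_COMP_TMOUT_DIS	0x0010u		/* disable is requested */

/* Return the new Device Control 2 value given DEVCAP2 and current DEVCTL2. */
static uint16_t disable_completion_timeout(uint32_t devcap2, uint16_t devctl2)
{
	/* Only set the control bit when the capability bit advertises it. */
	if (devcap2 & VF_DEVCAP2_COMP_TMOUT_DIS_SUP)
		devctl2 |= VF_DEVCTL2_COMP_TMOUT_DIS;
	return devctl2;
}
```

Keeping the capability and control bits under separate names also makes it
harder to mix up which register a value was read from.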

>
>>+			eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, &cap2);
>>+			cap2 |= 0x10;
>>+			eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL2, 4, cap2);
>>+		}
>>+	}
>>+
>>+	/* Enable SERR and parity checking */
>>+	eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd);
>>+	cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR);
>>+	eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd);
>>+
>>+	/* Enable report various errors */
>>+	if (pcie_cap) {
>>+		eeh_ops->read_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, &devctl);
>>+		devctl &= ~PCI_EXP_DEVCTL_CERE;
>>+		devctl |= (PCI_EXP_DEVCTL_NFERE |
>>+			   PCI_EXP_DEVCTL_FERE |
>>+			   PCI_EXP_DEVCTL_URRE);
>>+		eeh_ops->write_config(pdn, pcie_cap + PCI_EXP_DEVCTL, 2, devctl);
>>+	}
>>+
>>+	/* Enable ECRC generation and check */
>>+	if (pcie_cap) {
>>+		aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR);
>
>The AER cap should have been cached in eeh-powernv.c::pnv_eeh_probe(). As with
>the PCIe cap, you need to check whether the AER cap is valid.
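
The validity check asked for here could look roughly like the following
userspace sketch. `struct fake_vf`, its dword-indexed `cfg` array and
`enable_ecrc()` are invented for the example and only model config space in
memory; the `AER_*` values are local copies of the AER Capabilities and
Control register offset (0x18) and its ECRC generation/check enable bits.
Returning -1 when a capability is missing is an assumption of the sketch, not
something the patch specifies:

```c
#include <assert.h>
#include <stdint.h>

#define AER_CAP_CTRL	0x18		/* AER Capabilities and Control offset */
#define AER_ECRC_GENE	0x00000040u	/* ECRC Generation Enable */
#define AER_ECRC_CHKE	0x00000100u	/* ECRC Check Enable */

/* Hypothetical in-memory model of one VF's config space. */
struct fake_vf {
	int pcie_cap;		/* 0 means "capability not found" */
	int aer_cap;		/* 0 means "capability not found" */
	uint32_t cfg[1024];	/* 4 KiB of config space, dword-indexed */
};

/* Enable ECRC generation and checking, but only if both caps were found. */
static int enable_ecrc(struct fake_vf *vf)
{
	uint32_t *reg;

	/* Without this guard, aer_cap == 0 would make us write offset 0x18. */
	if (!vf->pcie_cap || !vf->aer_cap)
		return -1;

	reg = &vf->cfg[(vf->aer_cap + AER_CAP_CTRL) / 4];
	*reg |= AER_ECRC_GENE | AER_ECRC_CHKE;
	return 0;
}
```

The point of the guard is that an offset of 0 is a valid address in config
space, so an unchecked lookup failure would silently modify an unrelated
register in the standard header.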
>
>>+		eeh_ops->read_config(pdn, aer_cap + PCI_ERR_CAP, 4, &aer_capctl);
>>+		aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE);
>>+		eeh_ops->write_config(pdn, aer_cap + PCI_ERR_CAP, 4, aer_capctl);
>>+	}
>>+
>>+	return 0;
>>+}
>>+#endif /* CONFIG_PCI_IOV */
>>+
>> static int pnv_eeh_restore_config(struct pci_dn *pdn)
>> {
>> 	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>>@@ -1611,7 +1664,11 @@ static int pnv_eeh_restore_config(struct pci_dn *pdn)
>> 		return -EEXIST;
>>
>> 	phb = edev->phb->private_data;
>>-	ret = opal_pci_reinit(phb->opal_id,
>>+	/* FW is not VF aware, we rely on OS to restore it */
>
>Please change the comment to:
>
>	/*
>	 * We have to restore the PCI config space after reset since
>	 * the firmware can't see SRIOV VFs.
>	 */
>
>>+	if (edev->physfn)
>>+		ret = pnv_eeh_restore_vf_config(pdn);
>>+	else
>>+		ret = opal_pci_reinit(phb->opal_id,
>> 			      OPAL_REINIT_PCI_DEV, edev->config_addr);
>> 	if (ret) {
>> 		pr_warn("%s: Can't reinit PCI dev 0x%x (%lld)\n",
>>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
>>index bca2aeb..31d0258 100644
>>--- a/arch/powerpc/platforms/powernv/pci.c
>>+++ b/arch/powerpc/platforms/powernv/pci.c
>>@@ -781,3 +781,19 @@ machine_subsys_initcall_sync(powernv, tce_iommu_bus_notifier_init);
>> struct pci_controller_ops pnv_pci_controller_ops = {
>> 	.dma_dev_setup = pnv_pci_dma_dev_setup,
>> };
>>+
>>+static void pnv_pci_fixup_vf_caps(struct pci_dev *pdev)
>>+{
>>+	struct pci_dn *pdn = pci_get_pdn(pdev);
>>+	int parent_mps;
>>+
>>+	if (!pdev->is_virtfn)
>>+		return;
>>+
>>+	/* Synchronize MPS for VF and PF */
>>+	parent_mps = pcie_get_mps(pdev->physfn);
>>+	if ((128 << pdev->pcie_mpss) >= parent_mps)
>>+		pcie_set_mps(pdev, parent_mps);
>>+	pdn->mps = pcie_get_mps(pdev);
>>+}
>>+DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pnv_pci_fixup_vf_caps);
>
>Thanks,
>Gavin
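
For reference, the MPS arithmetic used in the patch can be checked in
isolation. This is a small userspace sketch, not kernel code: the helper names
and `DEVCTL_PAYLOAD_MASK` are made up for the example, and `mps` is assumed to
be a power of two between 128 and 4096 bytes. The payload size is encoded in
bits 7:5 of Device Control as log2(mps) - 7, which is what
`(ffs(mps) - 8) << 5` computes:

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

#define DEVCTL_PAYLOAD_MASK	0x00e0u	/* Max_Payload_Size, bits 7:5 */

/* Encode an MPS in bytes (power of two, 128..4096) as a DEVCTL field value. */
static unsigned int mps_to_devctl(int mps)
{
	return (unsigned int)(ffs(mps) - 8) << 5;
}

/* Decode the MPS in bytes from a DEVCTL register value. */
static int devctl_to_mps(unsigned int devctl)
{
	return 128 << ((devctl & DEVCTL_PAYLOAD_MASK) >> 5);
}

/*
 * Choose the VF's MPS: adopt the PF's value only when the VF's supported
 * maximum (128 << mpss) is large enough, otherwise keep the current one.
 */
static int sync_vf_mps(int vf_mps, int vf_mpss, int pf_mps)
{
	if ((128 << vf_mpss) >= pf_mps)
		return pf_mps;
	return vf_mps;
}
```

For example, 128 bytes encodes as 0 and 256 bytes as 0x20, and a VF whose
supported maximum is only 128 bytes keeps its own value when the PF runs at
256.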

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2015-05-15  9:19 UTC | newest]

Thread overview: 44+ messages
2015-05-15  5:46 [PATCH V4 00/11] VF EEH on Power8 Wei Yang
2015-05-15  5:46 ` Wei Yang
2015-05-15  5:46 ` [PATCH V4 01/11] pci/iov: rename and export virtfn_add/virtfn_remove Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  5:56   ` Gavin Shan
2015-05-15  5:56     ` Gavin Shan
2015-05-15  5:46 ` [PATCH V4 02/11] powerpc/pci_dn: cache vf_index in pci_dn Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  5:57   ` Gavin Shan
2015-05-15  5:57     ` Gavin Shan
2015-05-15  5:46 ` [PATCH V4 03/11] powerpc/pci: remove PCI devices in reverse order Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  5:46 ` [PATCH V4 04/11] powerpc/eeh: cache address range just for normal device Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  5:46 ` [PATCH V4 05/11] powerpc/powernv: create/release eeh_dev for VF Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  6:19   ` Gavin Shan
2015-05-15  6:19     ` Gavin Shan
2015-05-15  8:53     ` Wei Yang
2015-05-15  8:53       ` Wei Yang
2015-05-15  5:46 ` [PATCH V4 06/11] powerpc/eeh: create EEH_PE_VF for VF PE Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  6:26   ` Gavin Shan
2015-05-15  6:26     ` Gavin Shan
2015-05-15  5:46 ` [PATCH V4 07/11] powerpc/powernv: Support EEH reset for VFs Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  7:12   ` Gavin Shan
2015-05-15  7:12     ` Gavin Shan
2015-05-15  5:46 ` [PATCH V4 08/11] powerpc/powernv: Support PCI config restore " Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  7:27   ` Gavin Shan
2015-05-15  7:27     ` Gavin Shan
2015-05-15  9:18     ` Wei Yang
2015-05-15  9:18       ` Wei Yang
2015-05-15  5:46 ` [PATCH V4 09/11] powerpc/eeh: handle VF PE properly Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  7:31   ` Gavin Shan
2015-05-15  7:31     ` Gavin Shan
2015-05-15  5:46 ` [PATCH V4 10/11] powerpc/powernv: use "compound" as the child's list_head for compound PE Wei Yang
2015-05-15  5:46   ` Wei Yang
2015-05-15  7:37   ` Gavin Shan
2015-05-15  7:37     ` Gavin Shan
2015-05-15  5:46 ` [PATCH V4 11/11] powerpc/powernv: compound PE for VFs Wei Yang
2015-05-15  5:46   ` Wei Yang
