linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/7] SR-IOV fixes and cleanup
@ 2015-10-29 22:22 Bjorn Helgaas
  2015-10-29 22:22 ` [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration Bjorn Helgaas
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:22 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

This is basically a revision of Alex's recent series:
http://lkml.kernel.org/r/20151027204607.14626.59671.stgit@localhost.localdomain

I squashed some fixes into the original patch, dropped two, and split
the rest up a little differently:

  iov: Update virtfn_max_buses to validate offset and stride
    Folded into the patch it fixed

  iov: Reset resources to 0 if totalVFs increases after enabling ARI
    Dropped on the theory that the resource allocation code
    should already handle this case

  iov: Fix sriov_enable exception handling path
    Split into a couple patches to make it easier to review

  iov: Variable and loop cleanup for sriov_disable and sriov_enable
    Dropped for readability

  iov: Update sriov_enable to correctly handle offset and stride
    Split into the piece that removes code made redundant by "PCI: Set
    SR-IOV NumVFs to zero after enumeration" and the piece that fixes
    the bug (using offset/stride before setting numVFs)

This is all on my pci/virtualization branch, and I'm still hoping to
merge this for v4.4.

---

Alexander Duyck (7):
      PCI: Set SR-IOV NumVFs to zero after enumeration
      PCI: Remove redundant validation of SR-IOV offset/stride registers
      PCI: Remove VFs in reverse order if virtfn_add() fails
      PCI: Reorder pcibios_sriov_disable()
      PCI: Wait 1 second between disabling VFs and clearing NumVFs
      PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
      PCI: Set NumVFs before computing how many buses VFs require


 drivers/pci/iov.c |  100 ++++++++++++++++++++++++++---------------------------
 1 file changed, 49 insertions(+), 51 deletions(-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
@ 2015-10-29 22:22 ` Bjorn Helgaas
  2015-10-30  3:48   ` Wei Yang
  2015-10-29 22:23 ` [PATCH v2 2/7] PCI: Remove redundant validation of SR-IOV offset/stride registers Bjorn Helgaas
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:22 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

The enumeration path should leave NumVFs set to zero.  But after
4449f079722c ("PCI: Calculate maximum number of buses required for VFs"),
we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
This NumVFs change is visible via lspci and sysfs until a driver enables
SR-IOV.

Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
computing the maximum number of buses.  Validate offset and stride in
the loop, so we can test it at every possible NumVFs setting.  Rename
virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
side effect of updating iov->max_VF_buses.

[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
reverse sense of error path]
Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs")
Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |   41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 0cdb2d1..1b1acc2 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
  * determine how many additional bus numbers will be consumed by VFs.
  *
- * Iterate over all valid NumVFs and calculate the maximum number of bus
- * numbers that could ever be required.
+ * Iterate over all valid NumVFs, validate offset and stride, and calculate
+ * the maximum number of bus numbers that could ever be required.
  */
-static inline u8 virtfn_max_buses(struct pci_dev *dev)
+static int compute_max_vf_buses(struct pci_dev *dev)
 {
 	struct pci_sriov *iov = dev->sriov;
-	int nr_virtfn;
-	u8 max = 0;
-	int busnr;
+	int nr_virtfn, busnr, rc = 0;
 
-	for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) {
+	for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {
 		pci_iov_set_numvfs(dev, nr_virtfn);
+		if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
+			rc = -EIO;
+			goto out;
+		}
+
 		busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
-		if (busnr > max)
-			max = busnr;
+		if (busnr > iov->max_VF_buses)
+			iov->max_VF_buses = busnr;
 	}
 
-	return max;
+out:
+	pci_iov_set_numvfs(dev, 0);
+	return rc;
 }
 
 static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
@@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
 	int rc;
 	int nres;
 	u32 pgsz;
-	u16 ctrl, total, offset, stride;
+	u16 ctrl, total;
 	struct pci_sriov *iov;
 	struct resource *res;
 	struct pci_dev *pdev;
@@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos)
 
 found:
 	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
-	pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0);
-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
-	if (!offset || (total > 1 && !stride))
-		return -EIO;
 
 	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
 	if (!total)
@@ -456,8 +456,6 @@ found:
 	iov->nres = nres;
 	iov->ctrl = ctrl;
 	iov->total_VFs = total;
-	iov->offset = offset;
-	iov->stride = stride;
 	iov->pgsz = pgsz;
 	iov->self = dev;
 	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
@@ -474,10 +472,15 @@ found:
 
 	dev->sriov = iov;
 	dev->is_physfn = 1;
-	iov->max_VF_buses = virtfn_max_buses(dev);
+	rc = compute_max_vf_buses(dev);
+	if (rc)
+		goto fail_max_buses;
 
 	return 0;
 
+fail_max_buses:
+	dev->sriov = NULL;
+	dev->is_physfn = 0;
 failed:
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
 		res = &dev->resource[i + PCI_IOV_RESOURCES];


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/7] PCI: Remove redundant validation of SR-IOV offset/stride registers
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
  2015-10-29 22:22 ` [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration Bjorn Helgaas
@ 2015-10-29 22:23 ` Bjorn Helgaas
  2015-10-30  5:08   ` Wei Yang
  2015-10-29 22:23 ` [PATCH v2 3/7] PCI: Remove VFs in reverse order if virtfn_add() fails Bjorn Helgaas
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

Previously, we read, validated, and cached PCI_SRIOV_VF_OFFSET and
PCI_SRIOV_VF_STRIDE in sriov_enable().  But sriov_init() now does
that via compute_max_vf_buses(), so we don't need to do it again.

Remove the PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE config reads from
sriov_enable().  The pci_sriov structure already contains the offset and
stride corresponding to the current NumVFs.

[bhelgaas: split to separate patch for reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |   10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 1b1acc2..ca400a9 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -235,7 +235,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	int rc;
 	int i, j;
 	int nres;
-	u16 offset, stride, initial;
+	u16 initial;
 	struct resource *res;
 	struct pci_dev *pdev;
 	struct pci_sriov *iov = dev->sriov;
@@ -258,11 +258,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	    (!(iov->cap & PCI_SRIOV_CAP_VFM) && (nr_virtfn > initial)))
 		return -EINVAL;
 
-	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &offset);
-	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &stride);
-	if (!offset || (nr_virtfn > 1 && !stride))
-		return -EIO;
-
 	nres = 0;
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
 		bars |= (1 << (i + PCI_IOV_RESOURCES));
@@ -275,9 +270,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 		return -ENOMEM;
 	}
 
-	iov->offset = offset;
-	iov->stride = stride;
-
 	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
 	if (bus > dev->bus->busn_res.end) {
 		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/7] PCI: Remove VFs in reverse order if virtfn_add() fails
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
  2015-10-29 22:22 ` [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration Bjorn Helgaas
  2015-10-29 22:23 ` [PATCH v2 2/7] PCI: Remove redundant validation of SR-IOV offset/stride registers Bjorn Helgaas
@ 2015-10-29 22:23 ` Bjorn Helgaas
  2015-10-30  5:09   ` Wei Yang
  2015-10-29 22:23 ` [PATCH v2 4/7] PCI: Reorder pcibios_sriov_disable() Bjorn Helgaas
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

If virtfn_add() fails, we call virtfn_remove() for any previously added
devices.  Remove the devices in reverse order (first-added is
last-removed), which is more natural and doesn't require an additional
variable.

[bhelgaas: changelog, split to separate patch for reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ca400a9..c86d94c 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -233,7 +233,7 @@ int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 {
 	int rc;
-	int i, j;
+	int i;
 	int nres;
 	u16 initial;
 	struct resource *res;
@@ -328,8 +328,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	return 0;
 
 failed:
-	for (j = 0; j < i; j++)
-		virtfn_remove(dev, j, 0);
+	while (i--)
+		virtfn_remove(dev, i, 0);
 
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
 	pci_cfg_access_lock(dev);


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/7] PCI: Reorder pcibios_sriov_disable()
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2015-10-29 22:23 ` [PATCH v2 3/7] PCI: Remove VFs in reverse order if virtfn_add() fails Bjorn Helgaas
@ 2015-10-29 22:23 ` Bjorn Helgaas
  2015-10-30  5:09   ` Wei Yang
  2015-10-29 22:23 ` [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs Bjorn Helgaas
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

Move pcibios_sriov_disable() up so it's defined before a future use.

[bhelgaas: split to separate patch for reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index c86d94c..fada98d 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -227,7 +227,12 @@ static void virtfn_remove(struct pci_dev *dev, int id, int reset)
 
 int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 {
-       return 0;
+	return 0;
+}
+
+int __weak pcibios_sriov_disable(struct pci_dev *pdev)
+{
+	return 0;
 }
 
 static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
@@ -344,11 +349,6 @@ failed:
 	return rc;
 }
 
-int __weak pcibios_sriov_disable(struct pci_dev *pdev)
-{
-       return 0;
-}
-
 static void sriov_disable(struct pci_dev *dev)
 {
 	int i;


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2015-10-29 22:23 ` [PATCH v2 4/7] PCI: Reorder pcibios_sriov_disable() Bjorn Helgaas
@ 2015-10-29 22:23 ` Bjorn Helgaas
  2015-10-30  5:14   ` Wei Yang
  2015-11-03  2:04   ` Wei Yang
  2015-10-29 22:23 ` [PATCH v2 6/7] PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures Bjorn Helgaas
  2015-10-29 22:23 ` [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require Bjorn Helgaas
  6 siblings, 2 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after clearing
VF Enable before reading any field in the SR-IOV Extended Capability.

Wait 1 second before calling pci_iov_set_numvfs(), which reads
PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets PCI_SRIOV_NUM_VF.

[bhelgaas: split to separate patch for reviewability, add spec reference]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index fada98d..24428d5 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -339,13 +339,13 @@ failed:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
-	pci_iov_set_numvfs(dev, 0);
 	ssleep(1);
 	pci_cfg_access_unlock(dev);
 
 	if (iov->link != dev->devfn)
 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
 
+	pci_iov_set_numvfs(dev, 0);
 	return rc;
 }
 


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 6/7] PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
                   ` (4 preceding siblings ...)
  2015-10-29 22:23 ` [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs Bjorn Helgaas
@ 2015-10-29 22:23 ` Bjorn Helgaas
  2015-10-30  5:18   ` Wei Yang
  2015-10-29 22:23 ` [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require Bjorn Helgaas
  6 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

Disable VFs if pcibios_enable_sriov() fails, just like we do for other
errors in sriov_enable().  Call pcibios_sriov_disable() if virtfn_add()
fails.

[bhelgaas: changelog, split to separate patch for reviewability]
Fixes: 995df527f399 ("PCI: Add pcibios_sriov_enable() and pcibios_sriov_disable()")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 24428d5..bd1c4fa 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -246,7 +246,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	struct pci_sriov *iov = dev->sriov;
 	int bars = 0;
 	int bus;
-	int retval;
 
 	if (!nr_virtfn)
 		return 0;
@@ -315,10 +314,10 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	if (nr_virtfn < initial)
 		initial = nr_virtfn;
 
-	if ((retval = pcibios_sriov_enable(dev, initial))) {
-		dev_err(&dev->dev, "failure %d from pcibios_sriov_enable()\n",
-			retval);
-		return retval;
+	rc = pcibios_sriov_enable(dev, initial);
+	if (rc) {
+		dev_err(&dev->dev, "failure %d from pcibios_sriov_enable()\n", rc);
+		goto err_pcibios;
 	}
 
 	for (i = 0; i < initial; i++) {
@@ -336,6 +335,8 @@ failed:
 	while (i--)
 		virtfn_remove(dev, i, 0);
 
+	pcibios_sriov_disable(dev);
+err_pcibios:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require
  2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
                   ` (5 preceding siblings ...)
  2015-10-29 22:23 ` [PATCH v2 6/7] PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures Bjorn Helgaas
@ 2015-10-29 22:23 ` Bjorn Helgaas
  2015-10-30  5:22   ` Wei Yang
  6 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2015-10-29 22:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-pci, Ethan Zhao, Wei Yang, linux-kernel

From: Alexander Duyck <aduyck@mirantis.com>

VF bus numbers depend on the First VF Offset and VF Stride, and per
sections 3.3.9 and 3.3.10 of the SR-IOV spec r1.1, these depend on the
NumVF value.

Wait until after we set NumVFs to compute and validate the bus number of
the last VF.

[bhelgaas: changelog, add spec reference, split to separate patch for
reviewability]
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/iov.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index bd1c4fa..9d29712 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -274,13 +274,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 		return -ENOMEM;
 	}
 
-	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
-	if (bus > dev->bus->busn_res.end) {
-		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
-			nr_virtfn, bus, &dev->bus->busn_res);
-		return -ENOMEM;
-	}
-
 	if (pci_enable_resources(dev, bars)) {
 		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
 		return -ENOMEM;
@@ -304,6 +297,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	}
 
 	pci_iov_set_numvfs(dev, nr_virtfn);
+
+	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
+	if (bus > dev->bus->busn_res.end) {
+		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
+			nr_virtfn, bus, &dev->bus->busn_res);
+		rc = -ENOMEM;
+		goto err_bus;
+	}
+
 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
@@ -342,7 +344,7 @@ err_pcibios:
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
 	pci_cfg_access_unlock(dev);
-
+err_bus:
 	if (iov->link != dev->devfn)
 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
 


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration
  2015-10-29 22:22 ` [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration Bjorn Helgaas
@ 2015-10-30  3:48   ` Wei Yang
  2015-10-30  5:25     ` Wei Yang
  2015-10-30 15:40     ` Alexander Duyck
  0 siblings, 2 replies; 28+ messages in thread
From: Wei Yang @ 2015-10-30  3:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:22:54PM -0500, Bjorn Helgaas wrote:
>ines: 115
>
>From: Alexander Duyck <aduyck@mirantis.com>
>
>The enumeration path should leave NumVFs set to zero.  But after
>4449f079722c ("PCI: Calculate maximum number of buses required for VFs"),
>we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
>This NumVFs change is visible via lspci and sysfs until a driver enables
>SR-IOV.
>
>Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
>computing the maximum number of buses.  Validate offset and stride in
>the loop, so we can test it at every possible NumVFs setting.  Rename
>virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
>side effect of updating iov->max_VF_buses.
>
>[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
>reverse sense of error path]
>Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs")
>Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com>
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>---
> drivers/pci/iov.c |   41 ++++++++++++++++++++++-------------------
> 1 file changed, 22 insertions(+), 19 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index 0cdb2d1..1b1acc2 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
>  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
>  * determine how many additional bus numbers will be consumed by VFs.
>  *
>- * Iterate over all valid NumVFs and calculate the maximum number of bus
>- * numbers that could ever be required.
>+ * Iterate over all valid NumVFs, validate offset and stride, and calculate
>+ * the maximum number of bus numbers that could ever be required.
>  */
>-static inline u8 virtfn_max_buses(struct pci_dev *dev)
>+static int compute_max_vf_buses(struct pci_dev *dev)
> {
> 	struct pci_sriov *iov = dev->sriov;
>-	int nr_virtfn;
>-	u8 max = 0;
>-	int busnr;
>+	int nr_virtfn, busnr, rc = 0;
>
>-	for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) {
>+	for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {

Hmm... I agree to iterate the nr_virtfn from total_VFs to 0, which keep the
original logic, before my patch.


> 		pci_iov_set_numvfs(dev, nr_virtfn);
>+		if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
                                     ^^^

Should be this?
                if (!iov->offset || (iov->total_VFs > 1 && !iov->stride))


>+			rc = -EIO;
>+			goto out;
>+		}
>+
> 		busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>-		if (busnr > max)
>-			max = busnr;
>+		if (busnr > iov->max_VF_buses)
>+			iov->max_VF_buses = busnr;
> 	}
>
>-	return max;
>+out:
>+	pci_iov_set_numvfs(dev, 0);
>+	return rc;
> }
>
> static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
>@@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
> 	int rc;
> 	int nres;
> 	u32 pgsz;
>-	u16 ctrl, total, offset, stride;
>+	u16 ctrl, total;
> 	struct pci_sriov *iov;
> 	struct resource *res;
> 	struct pci_dev *pdev;
>@@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos)
>
> found:
> 	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
>-	pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0);
>-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
>-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
>-	if (!offset || (total > 1 && !stride))
>-		return -EIO;
>

Original code will return when it found this condition, so that we don't need
to bother to do those initialization and then fall back.

So I suggest to keep it here.

> 	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
> 	if (!total)
>@@ -456,8 +456,6 @@ found:
> 	iov->nres = nres;
> 	iov->ctrl = ctrl;
> 	iov->total_VFs = total;
>-	iov->offset = offset;
>-	iov->stride = stride;
> 	iov->pgsz = pgsz;
> 	iov->self = dev;
> 	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
>@@ -474,10 +472,15 @@ found:
>
> 	dev->sriov = iov;
> 	dev->is_physfn = 1;
>-	iov->max_VF_buses = virtfn_max_buses(dev);
>+	rc = compute_max_vf_buses(dev);
>+	if (rc)
>+		goto fail_max_buses;
>
> 	return 0;
>
>+fail_max_buses:
>+	dev->sriov = NULL;

The dev->sriov is allocated with kzalloc(), seems we forget to release it?

>+	dev->is_physfn = 0;
> failed:
> 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> 		res = &dev->resource[i + PCI_IOV_R

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/7] PCI: Remove redundant validation of SR-IOV offset/stride registers
  2015-10-29 22:23 ` [PATCH v2 2/7] PCI: Remove redundant validation of SR-IOV offset/stride registers Bjorn Helgaas
@ 2015-10-30  5:08   ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:08 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:01PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>Previously, we read, validated, and cached PCI_SRIOV_VF_OFFSET and
>PCI_SRIOV_VF_STRIDE in sriov_enable().  But sriov_init() now does
>that via compute_max_vf_buses(), so we don't need to do it again.
>
>Remove the PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE config reads from
>sriov_enable().  The pci_sriov structure already contains the offset and
>stride corresponding to the current NumVFs.
>
>[bhelgaas: split to separate patch for reviewability]
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>

>---
> drivers/pci/iov.c |   10 +---------
> 1 file changed, 1 insertion(+), 9 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index 1b1acc2..ca400a9 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -235,7 +235,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	int rc;
> 	int i, j;
> 	int nres;
>-	u16 offset, stride, initial;
>+	u16 initial;
> 	struct resource *res;
> 	struct pci_dev *pdev;
> 	struct pci_sriov *iov = dev->sriov;
>@@ -258,11 +258,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	    (!(iov->cap & PCI_SRIOV_CAP_VFM) && (nr_virtfn > initial)))
> 		return -EINVAL;
>
>-	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &offset);
>-	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &stride);
>-	if (!offset || (nr_virtfn > 1 && !stride))
>-		return -EIO;
>-
> 	nres = 0;
> 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> 		bars |= (1 << (i + PCI_IOV_RESOURCES));
>@@ -275,9 +270,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 		return -ENOMEM;
> 	}
>
>-	iov->offset = offset;
>-	iov->stride = stride;
>-
> 	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
> 	if (bus > dev->bus->busn_res.end) {
> 		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 3/7] PCI: Remove VFs in reverse order if virtfn_add() fails
  2015-10-29 22:23 ` [PATCH v2 3/7] PCI: Remove VFs in reverse order if virtfn_add() fails Bjorn Helgaas
@ 2015-10-30  5:09   ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:08PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>If virtfn_add() fails, we call virtfn_remove() for any previously added
>devices.  Remove the devices in reverse order (first-added is
>last-removed), which is more natural and doesn't require an additional
>variable.
>
>[bhelgaas: changelog, split to separate patch for reviewability]
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Interesting.

Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>

>---
> drivers/pci/iov.c |    6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index ca400a9..c86d94c 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -233,7 +233,7 @@ int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
> static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> {
> 	int rc;
>-	int i, j;
>+	int i;
> 	int nres;
> 	u16 initial;
> 	struct resource *res;
>@@ -328,8 +328,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	return 0;
>
> failed:
>-	for (j = 0; j < i; j++)
>-		virtfn_remove(dev, j, 0);
>+	while (i--)
>+		virtfn_remove(dev, i, 0);
>
> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> 	pci_cfg_access_lock(dev);

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 4/7] PCI: Reorder pcibios_sriov_disable()
  2015-10-29 22:23 ` [PATCH v2 4/7] PCI: Reorder pcibios_sriov_disable() Bjorn Helgaas
@ 2015-10-30  5:09   ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:15PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>Move pcibios_sriov_disable() up so it's defined before a future use.
>
>[bhelgaas: split to separate patch for reviewability]
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>

>---
> drivers/pci/iov.c |   12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index c86d94c..fada98d 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -227,7 +227,12 @@ static void virtfn_remove(struct pci_dev *dev, int id, int reset)
>
> int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
> {
>-       return 0;
>+	return 0;
>+}
>+
>+int __weak pcibios_sriov_disable(struct pci_dev *pdev)
>+{
>+	return 0;
> }
>
> static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>@@ -344,11 +349,6 @@ failed:
> 	return rc;
> }
>
>-int __weak pcibios_sriov_disable(struct pci_dev *pdev)
>-{
>-       return 0;
>-}
>-
> static void sriov_disable(struct pci_dev *dev)
> {
> 	int i;

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-10-29 22:23 ` [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs Bjorn Helgaas
@ 2015-10-30  5:14   ` Wei Yang
  2015-10-30  6:00     ` ethan zhao
  2015-11-03  2:04   ` Wei Yang
  1 sibling, 1 reply; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:14 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after clearing
>VF Enable before reading any field in the SR-IOV Extended Capability.
>
>Wait 1 second before calling pci_iov_set_numvfs(), which reads
>PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets PCI_SRIOV_NUM_VF.
>
>[bhelgaas: split to separate patch for reviewability, add spec reference]
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>---
> drivers/pci/iov.c |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index fada98d..24428d5 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -339,13 +339,13 @@ failed:
> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> 	pci_cfg_access_lock(dev);
> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>-	pci_iov_set_numvfs(dev, 0);
> 	ssleep(1);
> 	pci_cfg_access_unlock(dev);
>
> 	if (iov->link != dev->devfn)
> 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
>
>+	pci_iov_set_numvfs(dev, 0);

One small question, any specific reason put it here instead of just after
sleep()?

> 	return rc;
> }
>

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 6/7] PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
  2015-10-29 22:23 ` [PATCH v2 6/7] PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures Bjorn Helgaas
@ 2015-10-30  5:18   ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:18 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:29PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>Disable VFs if pcibios_enable_sriov() fails, just like we do for other
>errors in sriov_enable().  Call pcibios_sriov_disable() if virtfn_add()
>fails.
>
>[bhelgaas: changelog, split to separate patch for reviewability]
>Fixes: 995df527f399 ("PCI: Add pcibios_sriov_enable() and pcibios_sriov_disable()")
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>

>---
> drivers/pci/iov.c |   11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index 24428d5..bd1c4fa 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -246,7 +246,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	struct pci_sriov *iov = dev->sriov;
> 	int bars = 0;
> 	int bus;
>-	int retval;
>
> 	if (!nr_virtfn)
> 		return 0;
>@@ -315,10 +314,10 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	if (nr_virtfn < initial)
> 		initial = nr_virtfn;
>
>-	if ((retval = pcibios_sriov_enable(dev, initial))) {
>-		dev_err(&dev->dev, "failure %d from pcibios_sriov_enable()\n",
>-			retval);
>-		return retval;
>+	rc = pcibios_sriov_enable(dev, initial);
>+	if (rc) {
>+		dev_err(&dev->dev, "failure %d from pcibios_sriov_enable()\n", rc);
>+		goto err_pcibios;
> 	}
>
> 	for (i = 0; i < initial; i++) {
>@@ -336,6 +335,8 @@ failed:
> 	while (i--)
> 		virtfn_remove(dev, i, 0);
>
>+	pcibios_sriov_disable(dev);
>+err_pcibios:
> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> 	pci_cfg_access_lock(dev);
> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require
  2015-10-29 22:23 ` [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require Bjorn Helgaas
@ 2015-10-30  5:22   ` Wei Yang
  2015-10-30 16:03     ` Alexander Duyck
  0 siblings, 1 reply; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:36PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>VF bus numbers depend on the First VF Offset and VF Stride, and per
>sections 3.3.9 and 3.3.10 of the SR-IOV spec r1.1, these depend on the
>NumVF value.
>
>Wait until after we set NumVFs to compute and validate the bus number of
>the last VF.
>
>[bhelgaas: changelog, add spec reference, split to separate patch for
>reviewability]
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>---
> drivers/pci/iov.c |   18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index bd1c4fa..9d29712 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -274,13 +274,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 		return -ENOMEM;
> 	}
>
>-	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>-	if (bus > dev->bus->busn_res.end) {
>-		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>-			nr_virtfn, bus, &dev->bus->busn_res);
>-		return -ENOMEM;
>-	}
>-
> 	if (pci_enable_resources(dev, bars)) {
> 		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
> 		return -ENOMEM;
>@@ -304,6 +297,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
> 	}
>
> 	pci_iov_set_numvfs(dev, nr_virtfn);

How about move it up?

>+
>+	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>+	if (bus > dev->bus->busn_res.end) {
>+		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>+			nr_virtfn, bus, &dev->bus->busn_res);
>+		rc = -ENOMEM;
>+		goto err_bus;
>+	}
>+
> 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
> 	pci_cfg_access_lock(dev);
> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>@@ -342,7 +344,7 @@ err_pcibios:
> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
> 	ssleep(1);
> 	pci_cfg_access_unlock(dev);
>-
>+err_bus:
> 	if (iov->link != dev->devfn)
> 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
>

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration
  2015-10-30  3:48   ` Wei Yang
@ 2015-10-30  5:25     ` Wei Yang
  2015-10-30 15:40     ` Alexander Duyck
  1 sibling, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-10-30  5:25 UTC (permalink / raw)
  To: Wei Yang
  Cc: Bjorn Helgaas, Alexander Duyck, linux-pci, Ethan Zhao, linux-kernel

On Fri, Oct 30, 2015 at 11:48:21AM +0800, Wei Yang wrote:
>On Thu, Oct 29, 2015 at 05:22:54PM -0500, Bjorn Helgaas wrote:
>>ines: 115
>>
>>From: Alexander Duyck <aduyck@mirantis.com>
>>
>>The enumeration path should leave NumVFs set to zero.  But after
>>4449f079722c ("PCI: Calculate maximum number of buses required for VFs"),
>>we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
>>This NumVFs change is visible via lspci and sysfs until a driver enables
>>SR-IOV.
>>
>>Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
>>computing the maximum number of buses.  Validate offset and stride in
>>the loop, so we can test it at every possible NumVFs setting.  Rename
>>virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
>>side effect of updating iov->max_VF_buses.
>>
>>[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
>>reverse sense of error path]
>>Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs")
>>Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com>
>>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>---
>> drivers/pci/iov.c |   41 ++++++++++++++++++++++-------------------
>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>
>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>index 0cdb2d1..1b1acc2 100644
>>--- a/drivers/pci/iov.c
>>+++ b/drivers/pci/iov.c
>>@@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
>>  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
>>  * determine how many additional bus numbers will be consumed by VFs.
>>  *
>>- * Iterate over all valid NumVFs and calculate the maximum number of bus
>>- * numbers that could ever be required.
>>+ * Iterate over all valid NumVFs, validate offset and stride, and calculate
>>+ * the maximum number of bus numbers that could ever be required.
>>  */
>>-static inline u8 virtfn_max_buses(struct pci_dev *dev)
>>+static int compute_max_vf_buses(struct pci_dev *dev)
>> {
>> 	struct pci_sriov *iov = dev->sriov;
>>-	int nr_virtfn;
>>-	u8 max = 0;
>>-	int busnr;
>>+	int nr_virtfn, busnr, rc = 0;
>>
>>-	for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) {
>>+	for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {
>
>Hmm... I agree to iterate the nr_virtfn from total_VFs to 0, which keep the
>original logic, before my patch.
>
>
>> 		pci_iov_set_numvfs(dev, nr_virtfn);
>>+		if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
>                                     ^^^

After reading the SPEC, I think this code is correct. The original one removed
below may not comply with the SPEC.

>
>Should be this?
>                if (!iov->offset || (iov->total_VFs > 1 && !iov->stride))
>
>
>>+			rc = -EIO;
>>+			goto out;
>>+		}
>>+
>> 		busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>-		if (busnr > max)
>>-			max = busnr;
>>+		if (busnr > iov->max_VF_buses)
>>+			iov->max_VF_buses = busnr;
>> 	}
>>
>>-	return max;
>>+out:
>>+	pci_iov_set_numvfs(dev, 0);
>>+	return rc;
>> }
>>
>> static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
>>@@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
>> 	int rc;
>> 	int nres;
>> 	u32 pgsz;
>>-	u16 ctrl, total, offset, stride;
>>+	u16 ctrl, total;
>> 	struct pci_sriov *iov;
>> 	struct resource *res;
>> 	struct pci_dev *pdev;
>>@@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>
>> found:
>> 	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
>>-	pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0);
>>-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
>>-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
>>-	if (!offset || (total > 1 && !stride))
>>-		return -EIO;
>>
>
>Original code will return when it found this condition, so that we don't need
>to bother to do those initialization and then fall back.
>
>So I suggest to keep it here.
>
>> 	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
>> 	if (!total)
>>@@ -456,8 +456,6 @@ found:
>> 	iov->nres = nres;
>> 	iov->ctrl = ctrl;
>> 	iov->total_VFs = total;
>>-	iov->offset = offset;
>>-	iov->stride = stride;
>> 	iov->pgsz = pgsz;
>> 	iov->self = dev;
>> 	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
>>@@ -474,10 +472,15 @@ found:
>>
>> 	dev->sriov = iov;
>> 	dev->is_physfn = 1;
>>-	iov->max_VF_buses = virtfn_max_buses(dev);
>>+	rc = compute_max_vf_buses(dev);
>>+	if (rc)
>>+		goto fail_max_buses;
>>
>> 	return 0;
>>
>>+fail_max_buses:
>>+	dev->sriov = NULL;
>
>The dev->sriov is allocated with kzalloc(), seems we forget to release it?
>
>>+	dev->is_physfn = 0;
>> failed:
>> 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
>> 		res = &dev->resource[i + PCI_IOV_R
>
>-- 
>Richard Yang
>Help you, Help me

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-10-30  5:14   ` Wei Yang
@ 2015-10-30  6:00     ` ethan zhao
  2015-10-30 15:57       ` Alexander Duyck
  0 siblings, 1 reply; 28+ messages in thread
From: ethan zhao @ 2015-10-30  6:00 UTC (permalink / raw)
  To: Wei Yang, Bjorn Helgaas; +Cc: Alexander Duyck, linux-pci, linux-kernel

Wei,

On 2015/10/30 13:14, Wei Yang wrote:
> On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>> From: Alexander Duyck <aduyck@mirantis.com>
>>
>> Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after clearing
>> VF Enable before reading any field in the SR-IOV Extended Capability.
>>
>> Wait 1 second before calling pci_iov_set_numvfs(), which reads
>> PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets PCI_SRIOV_NUM_VF.
>>
>> [bhelgaas: split to separate patch for reviewability, add spec reference]
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> ---
>> drivers/pci/iov.c |    2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index fada98d..24428d5 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -339,13 +339,13 @@ failed:
>> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>> 	pci_cfg_access_lock(dev);
>> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>> -	pci_iov_set_numvfs(dev, 0);
>> 	ssleep(1);
>> 	pci_cfg_access_unlock(dev);
>>
>> 	if (iov->link != dev->devfn)
>> 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>
>> +	pci_iov_set_numvfs(dev, 0);
> One small question, any specific reason put it here instead of just after
> sleep()?
  Agree,  pci_iov_set_numvfs(dev, 0) should be put before 
pci_cfg_access_unlock(dev) to avoid race,  because "NumVFs may only be 
written while VF Enable is Clear"

  Thanks,
  Ethan
>> 	return rc;
>> }
>>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration
  2015-10-30  3:48   ` Wei Yang
  2015-10-30  5:25     ` Wei Yang
@ 2015-10-30 15:40     ` Alexander Duyck
  2015-11-02  7:28       ` Wei Yang
  1 sibling, 1 reply; 28+ messages in thread
From: Alexander Duyck @ 2015-10-30 15:40 UTC (permalink / raw)
  To: Wei Yang, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, linux-kernel

On 10/29/2015 08:48 PM, Wei Yang wrote:
> On Thu, Oct 29, 2015 at 05:22:54PM -0500, Bjorn Helgaas wrote:
>> ines: 115
>>
>> From: Alexander Duyck <aduyck@mirantis.com>
>>
>> The enumeration path should leave NumVFs set to zero.  But after
>> 4449f079722c ("PCI: Calculate maximum number of buses required for VFs"),
>> we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
>> This NumVFs change is visible via lspci and sysfs until a driver enables
>> SR-IOV.
>>
>> Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
>> computing the maximum number of buses.  Validate offset and stride in
>> the loop, so we can test it at every possible NumVFs setting.  Rename
>> virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
>> side effect of updating iov->max_VF_buses.
>>
>> [bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
>> reverse sense of error path]
>> Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs")
>> Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com>
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> ---
>> drivers/pci/iov.c |   41 ++++++++++++++++++++++-------------------
>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index 0cdb2d1..1b1acc2 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
>>   * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
>>   * determine how many additional bus numbers will be consumed by VFs.
>>   *
>> - * Iterate over all valid NumVFs and calculate the maximum number of bus
>> - * numbers that could ever be required.
>> + * Iterate over all valid NumVFs, validate offset and stride, and calculate
>> + * the maximum number of bus numbers that could ever be required.
>>   */
>> -static inline u8 virtfn_max_buses(struct pci_dev *dev)
>> +static int compute_max_vf_buses(struct pci_dev *dev)
>> {
>> 	struct pci_sriov *iov = dev->sriov;
>> -	int nr_virtfn;
>> -	u8 max = 0;
>> -	int busnr;
>> +	int nr_virtfn, busnr, rc = 0;
>>
>> -	for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) {
>> +	for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {
> Hmm... I agree to iterate the nr_virtfn from total_VFs to 0, which keep the
> original logic, before my patch.
>
>
>> 		pci_iov_set_numvfs(dev, nr_virtfn);
>> +		if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
>                                       ^^^
>
> Should be this?
>                  if (!iov->offset || (iov->total_VFs > 1 && !iov->stride))

The problem is the spec says stride and offset can change depending on 
the value of NumVFs.  We end up testing all values from TotalVFs to 1.  
The spec states that stride is unused if NumVFs is 1, not TotalVFs.

>
>> +			rc = -EIO;
>> +			goto out;
>> +		}
>> +
>> 		busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>> -		if (busnr > max)
>> -			max = busnr;
>> +		if (busnr > iov->max_VF_buses)
>> +			iov->max_VF_buses = busnr;
>> 	}
>>
>> -	return max;
>> +out:
>> +	pci_iov_set_numvfs(dev, 0);
>> +	return rc;
>> }
>>
>> static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
>> @@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
>> 	int rc;
>> 	int nres;
>> 	u32 pgsz;
>> -	u16 ctrl, total, offset, stride;
>> +	u16 ctrl, total;
>> 	struct pci_sriov *iov;
>> 	struct resource *res;
>> 	struct pci_dev *pdev;
>> @@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>
>> found:
>> 	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
>> -	pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0);
>> -	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
>> -	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
>> -	if (!offset || (total > 1 && !stride))
>> -		return -EIO;
>>
> Original code will return when it found this condition, so that we don't need
> to bother to do those initialization and then fall back.
>
> So I suggest to keep it here.

The problem is this code isn't valid as per the spec.  The offset and 
stride are considered unused when numvfs is 0.  That is why this has to 
be dropped.

>> 	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
>> 	if (!total)
>> @@ -456,8 +456,6 @@ found:
>> 	iov->nres = nres;
>> 	iov->ctrl = ctrl;
>> 	iov->total_VFs = total;
>> -	iov->offset = offset;
>> -	iov->stride = stride;
>> 	iov->pgsz = pgsz;
>> 	iov->self = dev;
>> 	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
>> @@ -474,10 +472,15 @@ found:
>>
>> 	dev->sriov = iov;
>> 	dev->is_physfn = 1;
>> -	iov->max_VF_buses = virtfn_max_buses(dev);
>> +	rc = compute_max_vf_buses(dev);
>> +	if (rc)
>> +		goto fail_max_buses;
>>
>> 	return 0;
>>
>> +fail_max_buses:
>> +	dev->sriov = NULL;
> The dev->sriov is allocated with kzalloc(), seems we forget to release it?

No, if you check at the end of the function we release it via a kfree(iov).

>> +	dev->is_physfn = 0;
>> failed:
>> 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
>> 		res = &dev->resource[i + PCI_IOV_R


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-10-30  6:00     ` ethan zhao
@ 2015-10-30 15:57       ` Alexander Duyck
  2015-11-02  8:33         ` Wei Yang
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Duyck @ 2015-10-30 15:57 UTC (permalink / raw)
  To: ethan zhao, Wei Yang, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel

On 10/29/2015 11:00 PM, ethan zhao wrote:
> Wei,
>
> On 2015/10/30 13:14, Wei Yang wrote:
>> On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>>> From: Alexander Duyck <aduyck@mirantis.com>
>>>
>>> Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after 
>>> clearing
>>> VF Enable before reading any field in the SR-IOV Extended Capability.
>>>
>>> Wait 1 second before calling pci_iov_set_numvfs(), which reads
>>> PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets 
>>> PCI_SRIOV_NUM_VF.
>>>
>>> [bhelgaas: split to separate patch for reviewability, add spec 
>>> reference]
>>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>> ---
>>> drivers/pci/iov.c |    2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>> index fada98d..24428d5 100644
>>> --- a/drivers/pci/iov.c
>>> +++ b/drivers/pci/iov.c
>>> @@ -339,13 +339,13 @@ failed:
>>>     iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>>>     pci_cfg_access_lock(dev);
>>>     pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>> -    pci_iov_set_numvfs(dev, 0);
>>>     ssleep(1);
>>>     pci_cfg_access_unlock(dev);
>>>
>>>     if (iov->link != dev->devfn)
>>>         sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>>
>>> +    pci_iov_set_numvfs(dev, 0);
>> One small question, any specific reason put it here instead of just 
>> after
>> sleep()?
>  Agree,  pci_iov_set_numvfs(dev, 0) should be put before 
> pci_cfg_access_unlock(dev) to avoid race,  because "NumVFs may only be 
> written while VF Enable is Clear"

We are already guaranteeing that aren't we?  I'm assuming there is 
already code in place here somewhere that prevents us from both enabling 
and disabling SR-IOV from more than one thread.  Otherwise how could we 
hope to have any sort of consistent state?

I'm fine with us being more explicit about it if we want to be, but if 
we are going to do it we should probably update all 3 spots where we 
update NumVFs after init instead of just this one.  Perhaps it should be 
a separate patch.

- Alex

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require
  2015-10-30  5:22   ` Wei Yang
@ 2015-10-30 16:03     ` Alexander Duyck
  2015-11-02  8:27       ` Wei Yang
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Duyck @ 2015-10-30 16:03 UTC (permalink / raw)
  To: Wei Yang, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, linux-kernel

On 10/29/2015 10:22 PM, Wei Yang wrote:
> On Thu, Oct 29, 2015 at 05:23:36PM -0500, Bjorn Helgaas wrote:
>> From: Alexander Duyck <aduyck@mirantis.com>
>>
>> VF bus numbers depend on the First VF Offset and VF Stride, and per
>> sections 3.3.9 and 3.3.10 of the SR-IOV spec r1.1, these depend on the
>> NumVF value.
>>
>> Wait until after we set NumVFs to compute and validate the bus number of
>> the last VF.
>>
>> [bhelgaas: changelog, add spec reference, split to separate patch for
>> reviewability]
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> ---
>> drivers/pci/iov.c |   18 ++++++++++--------
>> 1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index bd1c4fa..9d29712 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -274,13 +274,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>> 		return -ENOMEM;
>> 	}
>>
>> -	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>> -	if (bus > dev->bus->busn_res.end) {
>> -		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>> -			nr_virtfn, bus, &dev->bus->busn_res);
>> -		return -ENOMEM;
>> -	}
>> -
>> 	if (pci_enable_resources(dev, bars)) {
>> 		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
>> 		return -ENOMEM;
>> @@ -304,6 +297,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>> 	}
>>
>> 	pci_iov_set_numvfs(dev, nr_virtfn);
> How about move it up?

The idea with moving the write down is to keep the pollution of the 
SR-IOV capability to a minimum.  Basically we have addressed all of the 
possible software issues at this point so all that remains is possible 
hardware complications.  In addition by moving this code down we only 
have to modify this code instead of adding "rc=X; goto foo;" in places 
where "return X;" was used.

Also this is an exception case.  There really isn't much point in 
optimizing for something that should never really happen.

>> +
>> +	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>> +	if (bus > dev->bus->busn_res.end) {
>> +		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>> +			nr_virtfn, bus, &dev->bus->busn_res);
>> +		rc = -ENOMEM;
>> +		goto err_bus;
>> +	}
>> +
>> 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
>> 	pci_cfg_access_lock(dev);
>> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>> @@ -342,7 +344,7 @@ err_pcibios:
>> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>> 	ssleep(1);
>> 	pci_cfg_access_unlock(dev);
>> -
>> +err_bus:
>> 	if (iov->link != dev->devfn)
>> 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration
  2015-10-30 15:40     ` Alexander Duyck
@ 2015-11-02  7:28       ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-11-02  7:28 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Wei Yang, Bjorn Helgaas, Alexander Duyck, linux-pci, Ethan Zhao,
	linux-kernel

On Fri, Oct 30, 2015 at 08:40:52AM -0700, Alexander Duyck wrote:
>On 10/29/2015 08:48 PM, Wei Yang wrote:
>>On Thu, Oct 29, 2015 at 05:22:54PM -0500, Bjorn Helgaas wrote:
>>>ines: 115
>>>
>>>From: Alexander Duyck <aduyck@mirantis.com>
>>>
>>>The enumeration path should leave NumVFs set to zero.  But after
>>>4449f079722c ("PCI: Calculate maximum number of buses required for VFs"),
>>>we call virtfn_max_buses() in the enumeration path, which changes NumVFs.
>>>This NumVFs change is visible via lspci and sysfs until a driver enables
>>>SR-IOV.
>>>
>>>Iterate from TotalVFs down to zero so NumVFs is zero when we're finished
>>>computing the maximum number of buses.  Validate offset and stride in
>>>the loop, so we can test it at every possible NumVFs setting.  Rename
>>>virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a
>>>side effect of updating iov->max_VF_buses.
>>>
>>>[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop,
>>>reverse sense of error path]
>>>Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs")
>>>Based-on-patch-by: Ethan Zhao <ethan.zhao@oracle.com>
>>>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>---
>>>drivers/pci/iov.c |   41 ++++++++++++++++++++++-------------------
>>>1 file changed, 22 insertions(+), 19 deletions(-)
>>>
>>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>index 0cdb2d1..1b1acc2 100644
>>>--- a/drivers/pci/iov.c
>>>+++ b/drivers/pci/iov.c
>>>@@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
>>>  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
>>>  * determine how many additional bus numbers will be consumed by VFs.
>>>  *
>>>- * Iterate over all valid NumVFs and calculate the maximum number of bus
>>>- * numbers that could ever be required.
>>>+ * Iterate over all valid NumVFs, validate offset and stride, and calculate
>>>+ * the maximum number of bus numbers that could ever be required.
>>>  */
>>>-static inline u8 virtfn_max_buses(struct pci_dev *dev)
>>>+static int compute_max_vf_buses(struct pci_dev *dev)
>>>{
>>>	struct pci_sriov *iov = dev->sriov;
>>>-	int nr_virtfn;
>>>-	u8 max = 0;
>>>-	int busnr;
>>>+	int nr_virtfn, busnr, rc = 0;
>>>
>>>-	for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) {
>>>+	for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {
>>Hmm... I agree to iterate the nr_virtfn from total_VFs to 0, which keep the
>>original logic, before my patch.
>>
>>
>>>		pci_iov_set_numvfs(dev, nr_virtfn);
>>>+		if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
>>                                      ^^^
>>
>>Should be this?
>>                 if (!iov->offset || (iov->total_VFs > 1 && !iov->stride))
>
>The problem is the spec says stride and offset can change depending on the
>value of NumVFs.  We end up testing all values from TotalVFs to 1.  The spec
>states that stride is unused if NumVFs is 1, not TotalVFs.
>

Yes, I checked the SPEC again, and found this change is correct.

>>
>>>+			rc = -EIO;
>>>+			goto out;
>>>+		}
>>>+
>>>		busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>>-		if (busnr > max)
>>>-			max = busnr;
>>>+		if (busnr > iov->max_VF_buses)
>>>+			iov->max_VF_buses = busnr;
>>>	}
>>>
>>>-	return max;
>>>+out:
>>>+	pci_iov_set_numvfs(dev, 0);
>>>+	return rc;
>>>}
>>>
>>>static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
>>>@@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>>	int rc;
>>>	int nres;
>>>	u32 pgsz;
>>>-	u16 ctrl, total, offset, stride;
>>>+	u16 ctrl, total;
>>>	struct pci_sriov *iov;
>>>	struct resource *res;
>>>	struct pci_dev *pdev;
>>>@@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>>
>>>found:
>>>	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);
>>>-	pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0);
>>>-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset);
>>>-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride);
>>>-	if (!offset || (total > 1 && !stride))
>>>-		return -EIO;
>>>
>>Original code will return when it found this condition, so that we don't need
>>to bother to do those initialization and then fall back.
>>
>>So I suggest to keep it here.
>
>The problem is this code isn't valid as per the spec.  The offset and stride
>are considered unused when numvfs is 0.  That is why this has to be dropped.
>
>>>	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
>>>	if (!total)
>>>@@ -456,8 +456,6 @@ found:
>>>	iov->nres = nres;
>>>	iov->ctrl = ctrl;
>>>	iov->total_VFs = total;
>>>-	iov->offset = offset;
>>>-	iov->stride = stride;
>>>	iov->pgsz = pgsz;
>>>	iov->self = dev;
>>>	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
>>>@@ -474,10 +472,15 @@ found:
>>>
>>>	dev->sriov = iov;
>>>	dev->is_physfn = 1;
>>>-	iov->max_VF_buses = virtfn_max_buses(dev);
>>>+	rc = compute_max_vf_buses(dev);
>>>+	if (rc)
>>>+		goto fail_max_buses;
>>>
>>>	return 0;
>>>
>>>+fail_max_buses:
>>>+	dev->sriov = NULL;
>>The dev->sriov is allocated with kzalloc(), seems we forget to release it?
>
>No, if you check at the end of the function we release it via a kfree(iov).
>

Right.

>>>+	dev->is_physfn = 0;
>>>failed:
>>>	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
>>>		res = &dev->resource[i + PCI_IOV_R

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require
  2015-10-30 16:03     ` Alexander Duyck
@ 2015-11-02  8:27       ` Wei Yang
  2015-11-02 15:39         ` Alexander Duyck
  0 siblings, 1 reply; 28+ messages in thread
From: Wei Yang @ 2015-11-02  8:27 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Wei Yang, Bjorn Helgaas, Alexander Duyck, linux-pci, Ethan Zhao,
	linux-kernel

On Fri, Oct 30, 2015 at 09:03:54AM -0700, Alexander Duyck wrote:
>On 10/29/2015 10:22 PM, Wei Yang wrote:
>>On Thu, Oct 29, 2015 at 05:23:36PM -0500, Bjorn Helgaas wrote:
>>>From: Alexander Duyck <aduyck@mirantis.com>
>>>
>>>VF bus numbers depend on the First VF Offset and VF Stride, and per
>>>sections 3.3.9 and 3.3.10 of the SR-IOV spec r1.1, these depend on the
>>>NumVF value.
>>>
>>>Wait until after we set NumVFs to compute and validate the bus number of
>>>the last VF.
>>>
>>>[bhelgaas: changelog, add spec reference, split to separate patch for
>>>reviewability]
>>>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>---
>>>drivers/pci/iov.c |   18 ++++++++++--------
>>>1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>index bd1c4fa..9d29712 100644
>>>--- a/drivers/pci/iov.c
>>>+++ b/drivers/pci/iov.c
>>>@@ -274,13 +274,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>>		return -ENOMEM;
>>>	}
>>>
>>>-	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>>-	if (bus > dev->bus->busn_res.end) {
>>>-		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>>>-			nr_virtfn, bus, &dev->bus->busn_res);
>>>-		return -ENOMEM;
>>>-	}
>>>-
>>>	if (pci_enable_resources(dev, bars)) {
>>>		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
>>>		return -ENOMEM;
>>>@@ -304,6 +297,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>>	}
>>>
>>>	pci_iov_set_numvfs(dev, nr_virtfn);
>>How about move it up?
>
>The idea with moving the write down is to keep the pollution of the SR-IOV
>capability to a minimum.  Basically we have addressed all of the possible
>software issues at this point so all that remains is possible hardware
>complications.  In addition by moving this code down we only have to modify
>this code instead of adding "rc=X; goto foo;" in places where "return X;" was
>used.
>

I think your logic is clear, while it is not easy to classify the software
issue and hardware complications. For example, at the beginning of
sriov_enable(), the hardware value initial VFs number is checked.

And in my mind, this is reasonable to check the hardware issue before software
issue.

For your comment, adding "rc=X; goto foo;", I don't see this would happen.
The code in my mind would like this:


+	pci_iov_set_numvfs(dev, nr_virtfn);
 	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
 	if (bus > dev->bus->busn_res.end) {
 		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
 			nr_virtfn, bus, &dev->bus->busn_res);
 		return -ENOMEM;
 	}
 
	if (pci_enable_resources(dev, bars)) {
		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
		return -ENOMEM;

Do I missed something?

>Also this is an exception case.  There really isn't much point in optimizing
>for something that should never really happen.
>
>>>+
>>>+	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>>+	if (bus > dev->bus->busn_res.end) {
>>>+		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>>>+			nr_virtfn, bus, &dev->bus->busn_res);
>>>+		rc = -ENOMEM;
>>>+		goto err_bus;
>>>+	}
>>>+
>>>	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
>>>	pci_cfg_access_lock(dev);
>>>	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>>@@ -342,7 +344,7 @@ err_pcibios:
>>>	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>>	ssleep(1);
>>>	pci_cfg_access_unlock(dev);
>>>-
>>>+err_bus:
>>>	if (iov->link != dev->devfn)
>>>		sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>>
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-10-30 15:57       ` Alexander Duyck
@ 2015-11-02  8:33         ` Wei Yang
  2015-11-02 15:46           ` Alexander Duyck
  0 siblings, 1 reply; 28+ messages in thread
From: Wei Yang @ 2015-11-02  8:33 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: ethan zhao, Wei Yang, Bjorn Helgaas, Alexander Duyck, linux-pci,
	linux-kernel

On Fri, Oct 30, 2015 at 08:57:17AM -0700, Alexander Duyck wrote:
>On 10/29/2015 11:00 PM, ethan zhao wrote:
>>Wei,
>>
>>On 2015/10/30 13:14, Wei Yang wrote:
>>>On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>>>>From: Alexander Duyck <aduyck@mirantis.com>
>>>>
>>>>Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after
>>>>clearing
>>>>VF Enable before reading any field in the SR-IOV Extended Capability.
>>>>
>>>>Wait 1 second before calling pci_iov_set_numvfs(), which reads
>>>>PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets
>>>>PCI_SRIOV_NUM_VF.
>>>>
>>>>[bhelgaas: split to separate patch for reviewability, add spec
>>>>reference]
>>>>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>>---
>>>>drivers/pci/iov.c |    2 +-
>>>>1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>>index fada98d..24428d5 100644
>>>>--- a/drivers/pci/iov.c
>>>>+++ b/drivers/pci/iov.c
>>>>@@ -339,13 +339,13 @@ failed:
>>>>    iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>>>>    pci_cfg_access_lock(dev);
>>>>    pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>>>-    pci_iov_set_numvfs(dev, 0);
>>>>    ssleep(1);
>>>>    pci_cfg_access_unlock(dev);
>>>>
>>>>    if (iov->link != dev->devfn)
>>>>        sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>>>
>>>>+    pci_iov_set_numvfs(dev, 0);
>>>One small question, any specific reason put it here instead of just after
>>>sleep()?
>> Agree,  pci_iov_set_numvfs(dev, 0) should be put before
>>pci_cfg_access_unlock(dev) to avoid race,  because "NumVFs may only be
>>written while VF Enable is Clear"
>
>We are already guaranteeing that aren't we?  I'm assuming there is already
>code in place here somewhere that prevents us from both enabling and
>disabling SR-IOV from more than one thread.  Otherwise how could we hope to
>have any sort of consistent state?
>
>I'm fine with us being more explicit about it if we want to be, but if we are
>going to do it we should probably update all 3 spots where we update NumVFs
>after init instead of just this one.  Perhaps it should be a separate patch.
>

Yep, I think the statement is met, "NumVFs may only be written while VF Enable
is Clear".

While in your commit log, the purpose of this patch is to wait 1 second before
write NumVFs. So I am interesting to know why you move this out of the
pci_cfg_access_lock. Because it looks better? have better performance?

Actually, this is a question instead of a challenge :-)

>- Alex

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require
  2015-11-02  8:27       ` Wei Yang
@ 2015-11-02 15:39         ` Alexander Duyck
  2015-11-03  2:18           ` Wei Yang
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Duyck @ 2015-11-02 15:39 UTC (permalink / raw)
  To: Wei Yang
  Cc: Bjorn Helgaas, Alexander Duyck, linux-pci, Ethan Zhao, linux-kernel

On 11/02/2015 12:27 AM, Wei Yang wrote:
> On Fri, Oct 30, 2015 at 09:03:54AM -0700, Alexander Duyck wrote:
>> On 10/29/2015 10:22 PM, Wei Yang wrote:
>>> On Thu, Oct 29, 2015 at 05:23:36PM -0500, Bjorn Helgaas wrote:
>>>> From: Alexander Duyck <aduyck@mirantis.com>
>>>>
>>>> VF bus numbers depend on the First VF Offset and VF Stride, and per
>>>> sections 3.3.9 and 3.3.10 of the SR-IOV spec r1.1, these depend on the
>>>> NumVF value.
>>>>
>>>> Wait until after we set NumVFs to compute and validate the bus number of
>>>> the last VF.
>>>>
>>>> [bhelgaas: changelog, add spec reference, split to separate patch for
>>>> reviewability]
>>>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>> ---
>>>> drivers/pci/iov.c |   18 ++++++++++--------
>>>> 1 file changed, 10 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>> index bd1c4fa..9d29712 100644
>>>> --- a/drivers/pci/iov.c
>>>> +++ b/drivers/pci/iov.c
>>>> @@ -274,13 +274,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>>> 		return -ENOMEM;
>>>> 	}
>>>>
>>>> -	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>>> -	if (bus > dev->bus->busn_res.end) {
>>>> -		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>>>> -			nr_virtfn, bus, &dev->bus->busn_res);
>>>> -		return -ENOMEM;
>>>> -	}
>>>> -
>>>> 	if (pci_enable_resources(dev, bars)) {
>>>> 		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
>>>> 		return -ENOMEM;
>>>> @@ -304,6 +297,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>>> 	}
>>>>
>>>> 	pci_iov_set_numvfs(dev, nr_virtfn);
>>> How about move it up?
>>
>> The idea with moving the write down is to keep the pollution of the SR-IOV
>> capability to a minimum.  Basically we have addressed all of the possible
>> software issues at this point so all that remains is possible hardware
>> complications.  In addition by moving this code down we only have to modify
>> this code instead of adding "rc=X; goto foo;" in places where "return X;" was
>> used.
>>
>
> I think your logic is clear, while it is not easy to classify the software
> issue and hardware complications. For example, at the beginning of
> sriov_enable(), the hardware value initial VFs number is checked.
>
> And in my mind, this is reasonable to check the hardware issue before software
> issue.
>
> For your comment, adding "rc=X; goto foo;", I don't see this would happen.
> The code in my mind would like this:
>
>
> +	pci_iov_set_numvfs(dev, nr_virtfn);
>   	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>   	if (bus > dev->bus->busn_res.end) {
>   		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>   			nr_virtfn, bus, &dev->bus->busn_res);
>   		return -ENOMEM;
>   	}
>
> 	if (pci_enable_resources(dev, bars)) {
> 		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
> 		return -ENOMEM;
>
> Do I missed something?

The problem is that pci_iov_set_numvfs has side effects visible to the 
user since they can read NumVFs lspci and via sysfs.  As such we want to 
keep the two in sync, and if you get the bus error here that is not the 
case.

That is why you must call pci_iov_set_numvfs(dev, 0) if this fails.

- Alex


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-11-02  8:33         ` Wei Yang
@ 2015-11-02 15:46           ` Alexander Duyck
  2015-11-03  2:01             ` Wei Yang
  0 siblings, 1 reply; 28+ messages in thread
From: Alexander Duyck @ 2015-11-02 15:46 UTC (permalink / raw)
  To: Wei Yang
  Cc: ethan zhao, Bjorn Helgaas, Alexander Duyck, linux-pci, linux-kernel

On 11/02/2015 12:33 AM, Wei Yang wrote:
> On Fri, Oct 30, 2015 at 08:57:17AM -0700, Alexander Duyck wrote:
>> On 10/29/2015 11:00 PM, ethan zhao wrote:
>>> Wei,
>>>
>>> On 2015/10/30 13:14, Wei Yang wrote:
>>>> On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>>>>> From: Alexander Duyck <aduyck@mirantis.com>
>>>>>
>>>>> Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after
>>>>> clearing
>>>>> VF Enable before reading any field in the SR-IOV Extended Capability.
>>>>>
>>>>> Wait 1 second before calling pci_iov_set_numvfs(), which reads
>>>>> PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets
>>>>> PCI_SRIOV_NUM_VF.
>>>>>
>>>>> [bhelgaas: split to separate patch for reviewability, add spec
>>>>> reference]
>>>>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>>> ---
>>>>> drivers/pci/iov.c |    2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>>> index fada98d..24428d5 100644
>>>>> --- a/drivers/pci/iov.c
>>>>> +++ b/drivers/pci/iov.c
>>>>> @@ -339,13 +339,13 @@ failed:
>>>>>     iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>>>>>     pci_cfg_access_lock(dev);
>>>>>     pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>>>> -    pci_iov_set_numvfs(dev, 0);
>>>>>     ssleep(1);
>>>>>     pci_cfg_access_unlock(dev);
>>>>>
>>>>>     if (iov->link != dev->devfn)
>>>>>         sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>>>>
>>>>> +    pci_iov_set_numvfs(dev, 0);
>>>> One small question, any specific reason put it here instead of just after
>>>> sleep()?
>>> Agree,  pci_iov_set_numvfs(dev, 0) should be put before
>>> pci_cfg_access_unlock(dev) to avoid race,  because "NumVFs may only be
>>> written while VF Enable is Clear"
>> We are already guaranteeing that aren't we?  I'm assuming there is already
>> code in place here somewhere that prevents us from both enabling and
>> disabling SR-IOV from more than one thread.  Otherwise how could we hope to
>> have any sort of consistent state?
>>
>> I'm fine with us being more explicit about it if we want to be, but if we are
>> going to do it we should probably update all 3 spots where we update NumVFs
>> after init instead of just this one.  Perhaps it should be a separate patch.
>>
> Yep, I think the statement is met, "NumVFs may only be written while VF Enable
> is Clear".
>
> While in your commit log, the purpose of this patch is to wait 1 second before
> write NumVFs. So I am interesting to know why you move this out of the
> pci_cfg_access_lock. Because it looks better? have better performance?
>
> Actually, this is a question instead of a challenge :-)

It is because the first call to pci_iov_set_numvfs is done outside of 
the pci_cfg_access_lock.  This way when I add the clean-up for the bus 
numbering failure in patch 7 I don't have to modify as much code either 
since the write is already pulled out.

An added bonus is the code is now much closer to what we have in 
sriov_disable which has seen much more use than the exception handling 
case for sriov_enable, so it has been more thoroughly tested.

- Alex


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-11-02 15:46           ` Alexander Duyck
@ 2015-11-03  2:01             ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-11-03  2:01 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Wei Yang, ethan zhao, Bjorn Helgaas, Alexander Duyck, linux-pci,
	linux-kernel

On Mon, Nov 02, 2015 at 07:46:24AM -0800, Alexander Duyck wrote:
>On 11/02/2015 12:33 AM, Wei Yang wrote:
>>On Fri, Oct 30, 2015 at 08:57:17AM -0700, Alexander Duyck wrote:
>>>On 10/29/2015 11:00 PM, ethan zhao wrote:
>>>>Wei,
>>>>
>>>>On 2015/10/30 13:14, Wei Yang wrote:
>>>>>On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>>>>>>From: Alexander Duyck <aduyck@mirantis.com>
>>>>>>
>>>>>>Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after
>>>>>>clearing
>>>>>>VF Enable before reading any field in the SR-IOV Extended Capability.
>>>>>>
>>>>>>Wait 1 second before calling pci_iov_set_numvfs(), which reads
>>>>>>PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets
>>>>>>PCI_SRIOV_NUM_VF.
>>>>>>
>>>>>>[bhelgaas: split to separate patch for reviewability, add spec
>>>>>>reference]
>>>>>>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>>>>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>>>>---
>>>>>>drivers/pci/iov.c |    2 +-
>>>>>>1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>>>>index fada98d..24428d5 100644
>>>>>>--- a/drivers/pci/iov.c
>>>>>>+++ b/drivers/pci/iov.c
>>>>>>@@ -339,13 +339,13 @@ failed:
>>>>>>    iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>>>>>>    pci_cfg_access_lock(dev);
>>>>>>    pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>>>>>-    pci_iov_set_numvfs(dev, 0);
>>>>>>    ssleep(1);
>>>>>>    pci_cfg_access_unlock(dev);
>>>>>>
>>>>>>    if (iov->link != dev->devfn)
>>>>>>        sysfs_remove_link(&dev->dev.kobj, "dep_link");
>>>>>>
>>>>>>+    pci_iov_set_numvfs(dev, 0);
>>>>>One small question, any specific reason put it here instead of just after
>>>>>sleep()?
>>>>Agree,  pci_iov_set_numvfs(dev, 0) should be put before
>>>>pci_cfg_access_unlock(dev) to avoid race,  because "NumVFs may only be
>>>>written while VF Enable is Clear"
>>>We are already guaranteeing that aren't we?  I'm assuming there is already
>>>code in place here somewhere that prevents us from both enabling and
>>>disabling SR-IOV from more than one thread.  Otherwise how could we hope to
>>>have any sort of consistent state?
>>>
>>>I'm fine with us being more explicit about it if we want to be, but if we are
>>>going to do it we should probably update all 3 spots where we update NumVFs
>>>after init instead of just this one.  Perhaps it should be a separate patch.
>>>
>>Yep, I think the statement is met, "NumVFs may only be written while VF Enable
>>is Clear".
>>
>>While in your commit log, the purpose of this patch is to wait 1 second before
>>write NumVFs. So I am interesting to know why you move this out of the
>>pci_cfg_access_lock. Because it looks better? have better performance?
>>
>>Actually, this is a question instead of a challenge :-)
>
>It is because the first call to pci_iov_set_numvfs is done outside of the
>pci_cfg_access_lock.  This way when I add the clean-up for the bus numbering
>failure in patch 7 I don't have to modify as much code either since the write
>is already pulled out.
>
>An added bonus is the code is now much closer to what we have in
>sriov_disable which has seen much more use than the exception handling case
>for sriov_enable, so it has been more thoroughly tested.

Thanks~ I am more comfortable with this change~

>
>- Alex

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs
  2015-10-29 22:23 ` [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs Bjorn Helgaas
  2015-10-30  5:14   ` Wei Yang
@ 2015-11-03  2:04   ` Wei Yang
  1 sibling, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-11-03  2:04 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, Ethan Zhao, Wei Yang, linux-kernel

On Thu, Oct 29, 2015 at 05:23:22PM -0500, Bjorn Helgaas wrote:
>From: Alexander Duyck <aduyck@mirantis.com>
>
>Per sec 3.3.3.1 of the SR-IOV spec, r1.1, we must allow 1.0s after clearing
>VF Enable before reading any field in the SR-IOV Extended Capability.
>
>Wait 1 second before calling pci_iov_set_numvfs(), which reads
>PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE after it sets PCI_SRIOV_NUM_VF.
>
>[bhelgaas: split to separate patch for reviewability, add spec reference]
>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Reviewed-by: Wei Yang <weiyang@linux.vnet.ibm.com>

>---
> drivers/pci/iov.c |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>index fada98d..24428d5 100644
>--- a/drivers/pci/iov.c
>+++ b/drivers/pci/iov.c
>@@ -339,13 +339,13 @@ failed:
> 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> 	pci_cfg_access_lock(dev);
> 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>-	pci_iov_set_numvfs(dev, 0);
> 	ssleep(1);
> 	pci_cfg_access_unlock(dev);
>
> 	if (iov->link != dev->devfn)
> 		sysfs_remove_link(&dev->dev.kobj, "dep_link");
>
>+	pci_iov_set_numvfs(dev, 0);
> 	return rc;
> }
>

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require
  2015-11-02 15:39         ` Alexander Duyck
@ 2015-11-03  2:18           ` Wei Yang
  0 siblings, 0 replies; 28+ messages in thread
From: Wei Yang @ 2015-11-03  2:18 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Wei Yang, Bjorn Helgaas, Alexander Duyck, linux-pci, Ethan Zhao,
	linux-kernel

On Mon, Nov 02, 2015 at 07:39:48AM -0800, Alexander Duyck wrote:
>On 11/02/2015 12:27 AM, Wei Yang wrote:
>>On Fri, Oct 30, 2015 at 09:03:54AM -0700, Alexander Duyck wrote:
>>>On 10/29/2015 10:22 PM, Wei Yang wrote:
>>>>On Thu, Oct 29, 2015 at 05:23:36PM -0500, Bjorn Helgaas wrote:
>>>>>From: Alexander Duyck <aduyck@mirantis.com>
>>>>>
>>>>>VF bus numbers depend on the First VF Offset and VF Stride, and per
>>>>>sections 3.3.9 and 3.3.10 of the SR-IOV spec r1.1, these depend on the
>>>>>NumVF value.
>>>>>
>>>>>Wait until after we set NumVFs to compute and validate the bus number of
>>>>>the last VF.
>>>>>
>>>>>[bhelgaas: changelog, add spec reference, split to separate patch for
>>>>>reviewability]
>>>>>Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>>>>Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>>>---
>>>>>drivers/pci/iov.c |   18 ++++++++++--------
>>>>>1 file changed, 10 insertions(+), 8 deletions(-)
>>>>>
>>>>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>>>>>index bd1c4fa..9d29712 100644
>>>>>--- a/drivers/pci/iov.c
>>>>>+++ b/drivers/pci/iov.c
>>>>>@@ -274,13 +274,6 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>>>>		return -ENOMEM;
>>>>>	}
>>>>>
>>>>>-	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>>>>-	if (bus > dev->bus->busn_res.end) {
>>>>>-		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>>>>>-			nr_virtfn, bus, &dev->bus->busn_res);
>>>>>-		return -ENOMEM;
>>>>>-	}
>>>>>-
>>>>>	if (pci_enable_resources(dev, bars)) {
>>>>>		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
>>>>>		return -ENOMEM;
>>>>>@@ -304,6 +297,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>>>>	}
>>>>>
>>>>>	pci_iov_set_numvfs(dev, nr_virtfn);
>>>>How about move it up?
>>>
>>>The idea with moving the write down is to keep the pollution of the SR-IOV
>>>capability to a minimum.  Basically we have addressed all of the possible
>>>software issues at this point so all that remains is possible hardware
>>>complications.  In addition by moving this code down we only have to modify
>>>this code instead of adding "rc=X; goto foo;" in places where "return X;" was
>>>used.
>>>
>>
>>I think your logic is clear, while it is not easy to classify the software
>>issue and hardware complications. For example, at the beginning of
>>sriov_enable(), the hardware value initial VFs number is checked.
>>
>>And in my mind, this is reasonable to check the hardware issue before software
>>issue.
>>
>>For your comment, adding "rc=X; goto foo;", I don't see this would happen.
>>The code in my mind would like this:
>>
>>
>>+	pci_iov_set_numvfs(dev, nr_virtfn);
>>  	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
>>  	if (bus > dev->bus->busn_res.end) {
>>  		dev_err(&dev->dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
>>  			nr_virtfn, bus, &dev->bus->busn_res);
>>  		return -ENOMEM;
>>  	}
>>
>>	if (pci_enable_resources(dev, bars)) {
>>		dev_err(&dev->dev, "SR-IOV: IOV BARS not allocated\n");
>>		return -ENOMEM;
>>
>>Do I missed something?
>
>The problem is that pci_iov_set_numvfs has side effects visible to the user
>since they can read NumVFs lspci and via sysfs.  As such we want to keep the
>two in sync, and if you get the bus error here that is not the case.
>
>That is why you must call pci_iov_set_numvfs(dev, 0) if this fails.
>

That's the reason. Thanks for letting me know :-)

>- Alex

-- 
Richard Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2015-11-03  2:18 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 22:22 [PATCH v2 0/7] SR-IOV fixes and cleanup Bjorn Helgaas
2015-10-29 22:22 ` [PATCH v2 1/7] PCI: Set SR-IOV NumVFs to zero after enumeration Bjorn Helgaas
2015-10-30  3:48   ` Wei Yang
2015-10-30  5:25     ` Wei Yang
2015-10-30 15:40     ` Alexander Duyck
2015-11-02  7:28       ` Wei Yang
2015-10-29 22:23 ` [PATCH v2 2/7] PCI: Remove redundant validation of SR-IOV offset/stride registers Bjorn Helgaas
2015-10-30  5:08   ` Wei Yang
2015-10-29 22:23 ` [PATCH v2 3/7] PCI: Remove VFs in reverse order if virtfn_add() fails Bjorn Helgaas
2015-10-30  5:09   ` Wei Yang
2015-10-29 22:23 ` [PATCH v2 4/7] PCI: Reorder pcibios_sriov_disable() Bjorn Helgaas
2015-10-30  5:09   ` Wei Yang
2015-10-29 22:23 ` [PATCH v2 5/7] PCI: Wait 1 second between disabling VFs and clearing NumVFs Bjorn Helgaas
2015-10-30  5:14   ` Wei Yang
2015-10-30  6:00     ` ethan zhao
2015-10-30 15:57       ` Alexander Duyck
2015-11-02  8:33         ` Wei Yang
2015-11-02 15:46           ` Alexander Duyck
2015-11-03  2:01             ` Wei Yang
2015-11-03  2:04   ` Wei Yang
2015-10-29 22:23 ` [PATCH v2 6/7] PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures Bjorn Helgaas
2015-10-30  5:18   ` Wei Yang
2015-10-29 22:23 ` [PATCH v2 7/7] PCI: Set NumVFs before computing how many buses VFs require Bjorn Helgaas
2015-10-30  5:22   ` Wei Yang
2015-10-30 16:03     ` Alexander Duyck
2015-11-02  8:27       ` Wei Yang
2015-11-02 15:39         ` Alexander Duyck
2015-11-03  2:18           ` Wei Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).