All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] PCI: reset driver SR-IOV state after remove
@ 2018-06-18 22:01 Jakub Kicinski
  2018-06-26 16:40 ` Jakub Kicinski
  2018-06-29 20:09 ` [PATCH] PCI: reset driver SR-IOV state after remove Bjorn Helgaas
  0 siblings, 2 replies; 13+ messages in thread
From: Jakub Kicinski @ 2018-06-18 22:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, alexander.duyck, oss-drivers, Christoph Hellwig,
	Don Dutile, Jakub Kicinski

Bjorn points out that currently core and most of the drivers don't
clean up dev->sriov->driver_max_VFs settings on .remove().  This
means that if a different driver is bound afterwards it will
inherit the old setting:

  - load PF driver 1
  - driver 1 calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
  - unload PF driver 1
  - load PF driver 2
  # driver 2 does not know to call pci_sriov_set_totalvfs()

Some drivers (e.g. nfp) used to do the clean up by calling
pci_sriov_set_totalvfs(dev, 0), since 0 was equivalent to no driver
limit set.  After commit 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers
to limit total_VFs to 0") 0 no longer has this special meaning.

The need to reset driver_max_VFs comes from the fact that old FW
builds may not expose its VF limits to the drivers, and depend on
the ability to reject the configuration change when driver notifies
the FW as part of struct pci_driver->sriov_configure() callback.
Therefore if there is no information on VF count limits we should
use the PCI capability max, and not the last value set by
pci_sriov_set_totalvfs().

Reset driver_max_VFs back to total_VFs after device remove.  If
drivers want to reload FW/reconfigure the device while driver is
bound we may add an explicit pci_sriov_reset_totalvfs(), but for
now no driver is doing that.

Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/pci/iov.c        | 16 ++++++++++++++++
 drivers/pci/pci-driver.c |  1 +
 drivers/pci/pci.h        |  4 ++++
 3 files changed, 21 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index d0d73dbbd5ca..cbbe6d8fab0c 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -574,6 +574,22 @@ void pci_iov_release(struct pci_dev *dev)
 		sriov_release(dev);
 }
 
+/**
+ * pci_iov_device_removed - clean up SR-IOV state after PF driver is detached
+ * @dev: the PCI device
+ */
+void pci_iov_device_removed(struct pci_dev *dev)
+{
+	struct pci_sriov *iov = dev->sriov;
+
+	if (!dev->is_physfn)
+		return;
+	iov->driver_max_VFs = iov->total_VFs;
+	if (iov->num_VFs)
+		dev_warn(&dev->dev,
+			 "driver left SR-IOV enabled after remove\n");
+}
+
 /**
  * pci_iov_update_resource - update a VF BAR
  * @dev: the PCI device
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index c125d53033c6..80a281cf5d21 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -445,6 +445,7 @@ static int pci_device_remove(struct device *dev)
 		}
 		pcibios_free_irq(pci_dev);
 		pci_dev->driver = NULL;
+		pci_iov_device_removed(pci_dev);
 	}
 
 	/* Undo the runtime PM settings in local_pci_probe() */
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a07f3f..fc8bd4fdfb95 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -311,6 +311,7 @@ static inline void pci_restore_ats_state(struct pci_dev *dev)
 #ifdef CONFIG_PCI_IOV
 int pci_iov_init(struct pci_dev *dev);
 void pci_iov_release(struct pci_dev *dev);
+void pci_iov_device_removed(struct pci_dev *dev);
 void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
@@ -323,6 +324,9 @@ static inline int pci_iov_init(struct pci_dev *dev)
 }
 static inline void pci_iov_release(struct pci_dev *dev)
 
+{
+}
+static inline void pci_iov_device_removed(struct pci_dev *dev)
 {
 }
 static inline void pci_restore_iov_state(struct pci_dev *dev)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread
* Re: [PATCH] PCI: allow drivers to limit the number of VFs to 0
@ 2018-05-25 21:45 Bjorn Helgaas
  2018-05-26  3:00 ` [PATCH] PCI: reset driver SR-IOV state after remove Jakub Kicinski
  0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2018-05-25 21:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Bjorn Helgaas, linux-pci, netdev, Sathya Perla, Felix Manlunas,
	alexander.duyck, john.fastabend, Jacob Keller, Donald Dutile,
	oss-drivers, Christoph Hellwig

On Fri, May 25, 2018 at 02:05:21PM -0700, Jakub Kicinski wrote:
> On Fri, 25 May 2018 09:02:23 -0500, Bjorn Helgaas wrote:
> > On Thu, May 24, 2018 at 06:20:15PM -0700, Jakub Kicinski wrote:
> > > On Thu, 24 May 2018 18:57:48 -0500, Bjorn Helgaas wrote:  
> > > > On Mon, Apr 02, 2018 at 03:46:52PM -0700, Jakub Kicinski wrote:  
> > > > > Some user space depends on enabling sriov_totalvfs number of VFs
> > > > > to not fail, e.g.:
> > > > > 
> > > > > $ cat .../sriov_totalvfs > .../sriov_numvfs
> > > > > 
> > > > > For devices which VF support depends on loaded FW we have the
> > > > > pci_sriov_{g,s}et_totalvfs() API.  However, this API uses 0 as
> > > > > a special "unset" value, meaning drivers can't limit sriov_totalvfs
> > > > > to 0.  Remove the special values completely and simply initialize
> > > > > driver_max_VFs to total_VFs.  Then always use driver_max_VFs.
> > > > > Add a helper for drivers to reset the VF limit back to total.    
> > > > 
> > > > I still can't really make sense out of the changelog.
> > > >
> > > > I think part of the reason it's confusing is because there are two
> > > > things going on:
> > > > 
> > > >   1) You want this:
> > > >   
> > > >        pci_sriov_set_totalvfs(dev, 0);
> > > >        x = pci_sriov_get_totalvfs(dev) 
> > > > 
> > > >      to return 0 instead of total_VFs.  That seems to connect with
> > > >      your subject line.  It means "sriov_totalvfs" in sysfs could be
> > > >      0, but I don't know how that is useful (I'm sure it is; just
> > > >      educate me :))  
> > > 
> > > Let me just quote the bug report that got filed on our internal bug
> > > tracker :)
> > > 
> > >   When testing Juju Openstack with Ubuntu 18.04, enabling SR-IOV causes
> > >   errors because Juju gets the sriov_totalvfs for SR-IOV-capable device
> > >   then tries to set that as the sriov_numvfs parameter.
> > > 
> > >   For SR-IOV incapable FW, the sriov_totalvfs parameter should be 0, 
> > >   but it's set to max.  When FW is switched to flower*, the correct 
> > >   sriov_totalvfs value is presented.
> > > 
> > > * flower is a project name  
> > 
> > From the point of view of the PCI core (which knows nothing about
> > device firmware and relies on the architected config space described
> > by the PCIe spec), this sounds like an erratum: with some firmware
> > installed, the device is not capable of SR-IOV, but still advertises
> > an SR-IOV capability with "TotalVFs > 0".
> > 
> > Regardless of whether that's an erratum, we do allow PF drivers to use
> > pci_sriov_set_totalvfs() to limit the number of VFs that may be
> > enabled by writing to the PF's "sriov_numvfs" sysfs file.
> 
> Think more of an FPGA which can be reprogrammed at runtime to have
> different capabilities than an erratum.  Some FWs simply have no use
> for VFs and save resources (and validation time) by not supporting it.

This is a bit of a gray area.  Reloading firmware or reprogramming an
FPGA has the potential to create a new and different device than we
had before, but the PCI core doesn't know that.  The typical sequence
is:

  - PCI core enumerates device
  - driver binds to device (we call .probe())
  - driver loads new firmware to device
  - driver resets device with pci_reset_function() or similar
  - pci_reset_function() saves config space
  - pci_reset_function() resets device
  - device uses new firmware when it comes out of reset
  - pci_reset_function() restores config space

Loading the new firmware might change what the device looks like in
config space -- it could change the number or size of BARs, the
capabilities advertised, etc.  We currently sweep that under the rug
and blindly restore the old config space.

It looks like your driver does the reset differently, so maybe it
keeps the original config space setup.

But all that said, I agree that we should allow a PF driver to prevent
VF enablement, whether because the firmware doesn't support it or the
PF driver just wants to prevent use of VFs for whatever reason (maybe
we don't have enough MMIO resources, we don't need the VFs, etc.)

> Okay, perfect.  That makes sense.  The patch below certainly fixes the
> first issue for us.  Thank you!
> 
> As far as the second issue goes - agreed, having the core reset the
> number of VFs to total_VFs definitely makes sense.  It doesn't cater to
> the case where FW is reloaded without reprobing, but we don't do this
> today anyway.
> 
> Should I try to come up with a patch to reset total_VFs after detach?

Yes, please.

Bjorn

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-06-29 20:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-18 22:01 [PATCH] PCI: reset driver SR-IOV state after remove Jakub Kicinski
2018-06-26 16:40 ` Jakub Kicinski
2018-06-28 13:56   ` Bjorn Helgaas
2018-06-28 16:17     ` Jakub Kicinski
2018-06-28 18:07       ` Bjorn Helgaas
2018-06-28 18:14         ` Jakub Kicinski
2018-06-28 22:10           ` Bjorn Helgaas
2018-06-28 22:26             ` Jakub Kicinski
2018-06-28 18:30         ` [PATCH] nfp: align setting totalvfs to changes in PCI core Jakub Kicinski
2018-06-29 20:09           ` Bjorn Helgaas
2018-06-29 20:09 ` [PATCH] PCI: reset driver SR-IOV state after remove Bjorn Helgaas
2018-06-29 20:20   ` Jakub Kicinski
  -- strict thread matches above, loose matches on Subject: below --
2018-05-25 21:45 [PATCH] PCI: allow drivers to limit the number of VFs to 0 Bjorn Helgaas
2018-05-26  3:00 ` [PATCH] PCI: reset driver SR-IOV state after remove Jakub Kicinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.