linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Bjorn Helgaas <bhelgaas@google.com>, Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org,
	netdev@vger.kernel.org, Don Dutile <ddutile@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"David S . Miller" <davem@davemloft.net>
Subject: [PATCH mlx5-next v6 1/4] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs
Date: Tue,  9 Feb 2021 15:34:42 +0200	[thread overview]
Message-ID: <20210209133445.700225-2-leon@kernel.org> (raw)
In-Reply-To: <20210209133445.700225-1-leon@kernel.org>

From: Leon Romanovsky <leonro@nvidia.com>

Extend PCI sysfs interface with a new callback that allows configuration
of the number of MSI-X vectors for specific SR-IOV VF. This is needed
to optimize the performance of VFs devices by allocating the number of
vectors based on the administrator knowledge of the intended use of the VF.

This function is applicable for SR-IOV VF because such devices allocate
their MSI-X table before they will run on the VMs and HW can't guess the
right number of vectors, so some devices allocate them statically and equally.

1) The newly added /sys/bus/pci/devices/.../sriov_vf_msix_count
file will be seen for the VFs and it is writable as long as a driver is not
bound to the VF.

The values accepted are:
 * > 0 - this will be number reported by the Table Size in the VF's MSI-X Message
         Control register
 * < 0 - not valid
 * = 0 - will reset to the device default value

2) In order to make management easy, provide new read-only sysfs file that
returns a total number of possible to configure MSI-X vectors.

cat /sys/bus/pci/devices/.../sriov_vf_total_msix
  = 0 - feature is not supported
  > 0 - total number of MSI-X vectors available for distribution among the VFs

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 Documentation/ABI/testing/sysfs-bus-pci |  28 +++++
 drivers/pci/iov.c                       | 153 ++++++++++++++++++++++++
 include/linux/pci.h                     |  12 ++
 3 files changed, 193 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 25c9c39770c6..7dadc3610959 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -375,3 +375,31 @@ Description:
 		The value comes from the PCI kernel device state and can be one
 		of: "unknown", "error", "D0", D1", "D2", "D3hot", "D3cold".
 		The file is read only.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_total_msix
+Date:		January 2021
+Contact:	Leon Romanovsky <leonro@nvidia.com>
+Description:
+		This file is associated with the SR-IOV PFs.
+		It contains the total number of MSI-X vectors available for
+		assignment to all VFs associated with this PF. It may be zero
+		if the device doesn't support this functionality.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_msix_count
+Date:		January 2021
+Contact:	Leon Romanovsky <leonro@nvidia.com>
+Description:
+		This file is associated with the SR-IOV VFs.
+		It allows configuration of the number of MSI-X vectors for
+		the VF. This is needed to optimize performance of newly bound
+		devices by allocating the number of vectors based on the
+		administrator knowledge of targeted VM.
+
+		The values accepted are:
+		 * > 0 - this will be number reported by the VF's MSI-X
+			 capability
+		 * < 0 - not valid
+		 * = 0 - will reset to the device default value
+
+		The file is writable if the PF is bound to a driver that
+		implements ->sriov_set_msix_vec_count().
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4afd4ee4f7f0..c0554aa6b90a 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -31,6 +31,7 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
 	return (dev->devfn + dev->sriov->offset +
 		dev->sriov->stride * vf_id) & 0xff;
 }
+EXPORT_SYMBOL_GPL(pci_iov_virtfn_devfn);

 /*
  * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
@@ -157,6 +158,158 @@ int pci_iov_sysfs_link(struct pci_dev *dev,
 	return rc;
 }

+#ifdef CONFIG_PCI_MSI
+static ssize_t sriov_vf_msix_count_store(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t count)
+{
+	struct pci_dev *vf_dev = to_pci_dev(dev);
+	struct pci_dev *pdev = pci_physfn(vf_dev);
+	int val, ret;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val < 0)
+		return -EINVAL;
+
+	device_lock(&pdev->dev);
+	if (!pdev->driver || !pdev->driver->sriov_set_msix_vec_count) {
+		ret = -EOPNOTSUPP;
+		goto err_pdev;
+	}
+
+	device_lock(&vf_dev->dev);
+	if (vf_dev->driver) {
+		/*
+		 * Driver already probed this VF and configured itself
+		 * based on previously configured (or default) MSI-X vector
+		 * count. It is too late to change this field for this
+		 * specific VF.
+		 */
+		ret = -EBUSY;
+		goto err_dev;
+	}
+
+	ret = pdev->driver->sriov_set_msix_vec_count(vf_dev, val);
+
+err_dev:
+	device_unlock(&vf_dev->dev);
+err_pdev:
+	device_unlock(&pdev->dev);
+	return ret ? : count;
+}
+static DEVICE_ATTR_WO(sriov_vf_msix_count);
+
+static ssize_t sriov_vf_total_msix_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u32 vf_total_msix;
+
+	device_lock(dev);
+	if (!pdev->driver || !pdev->driver->sriov_get_vf_total_msix) {
+		device_unlock(dev);
+		return -EOPNOTSUPP;
+	}
+	vf_total_msix = pdev->driver->sriov_get_vf_total_msix(pdev);
+	device_unlock(dev);
+
+	return sysfs_emit(buf, "%u\n", vf_total_msix);
+}
+static DEVICE_ATTR_RO(sriov_vf_total_msix);
+#endif
+
+static const struct attribute *sriov_pf_dev_attrs[] = {
+#ifdef CONFIG_PCI_MSI
+	&dev_attr_sriov_vf_total_msix.attr,
+#endif
+	NULL,
+};
+
+static const struct attribute *sriov_vf_dev_attrs[] = {
+#ifdef CONFIG_PCI_MSI
+	&dev_attr_sriov_vf_msix_count.attr,
+#endif
+	NULL,
+};
+
+/*
+ * The PF can change the specific properties of associated VFs. Such
+ * functionality is usually known after PF probed and PCI sysfs files
+ * were already created.
+ *
+ * The function below is driven by such PF. It adds sysfs files to already
+ * existing PF/VF sysfs device hierarchies.
+ */
+int pci_enable_vf_overlay(struct pci_dev *dev)
+{
+	struct pci_dev *virtfn;
+	int id, ret;
+
+	if (!dev->is_physfn || !dev->sriov->num_VFs)
+		return 0;
+
+	ret = sysfs_create_files(&dev->dev.kobj, sriov_pf_dev_attrs);
+	if (ret)
+		return ret;
+
+	for (id = 0; id < dev->sriov->num_VFs; id++) {
+		virtfn = pci_get_domain_bus_and_slot(
+			pci_domain_nr(dev->bus), pci_iov_virtfn_bus(dev, id),
+			pci_iov_virtfn_devfn(dev, id));
+
+		if (!virtfn)
+			continue;
+
+		ret = sysfs_create_files(&virtfn->dev.kobj,
+					 sriov_vf_dev_attrs);
+		if (ret)
+			goto out;
+	}
+	return 0;
+
+out:
+	while (id--) {
+		virtfn = pci_get_domain_bus_and_slot(
+			pci_domain_nr(dev->bus), pci_iov_virtfn_bus(dev, id),
+			pci_iov_virtfn_devfn(dev, id));
+
+		if (!virtfn)
+			continue;
+
+		sysfs_remove_files(&virtfn->dev.kobj, sriov_vf_dev_attrs);
+	}
+	sysfs_remove_files(&dev->dev.kobj, sriov_pf_dev_attrs);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pci_enable_vf_overlay);
+
+void pci_disable_vf_overlay(struct pci_dev *dev)
+{
+	struct pci_dev *virtfn;
+	int id;
+
+	if (!dev->is_physfn || !dev->sriov->num_VFs)
+		return;
+
+	id = dev->sriov->num_VFs;
+	while (id--) {
+		virtfn = pci_get_domain_bus_and_slot(
+			pci_domain_nr(dev->bus), pci_iov_virtfn_bus(dev, id),
+			pci_iov_virtfn_devfn(dev, id));
+
+		if (!virtfn)
+			continue;
+
+		sysfs_remove_files(&virtfn->dev.kobj, sriov_vf_dev_attrs);
+	}
+	sysfs_remove_files(&dev->dev.kobj, sriov_pf_dev_attrs);
+}
+EXPORT_SYMBOL_GPL(pci_disable_vf_overlay);
+
 int pci_iov_add_virtfn(struct pci_dev *dev, int id)
 {
 	int i;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b32126d26997..732611937574 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -856,6 +856,11 @@ struct module;
  *		e.g. drivers/net/e100.c.
  * @sriov_configure: Optional driver callback to allow configuration of
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
+ * @sriov_set_msix_vec_count: Driver callback to change number of MSI-X vectors
+ *              to configure via sysfs "sriov_vf_msix_count" entry. This will
+ *              change MSI-X Table Size in their Message Control registers.
+ * @sriov_get_vf_total_msix: Total number of MSI-X veectors to distribute
+ *              to the VFs
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
  * @groups:	Sysfs attribute groups.
  * @driver:	Driver model structure.
@@ -871,6 +876,8 @@ struct pci_driver {
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
+	int  (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */
+	u32  (*sriov_get_vf_total_msix)(struct pci_dev *pf);
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
 	struct device_driver	driver;
@@ -2059,6 +2066,9 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
 int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
 int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);

+int pci_enable_vf_overlay(struct pci_dev *dev);
+void pci_disable_vf_overlay(struct pci_dev *dev);
+
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);

@@ -2100,6 +2110,8 @@ static inline int pci_iov_add_virtfn(struct pci_dev *dev, int id)
 }
 static inline void pci_iov_remove_virtfn(struct pci_dev *dev,
 					 int id) { }
+static inline int pci_enable_vf_overlay(struct pci_dev *dev) { return 0; }
+static inline void pci_disable_vf_overlay(struct pci_dev *dev) { }
 static inline void pci_disable_sriov(struct pci_dev *dev) { }
 static inline int pci_num_vf(struct pci_dev *dev) { return 0; }
 static inline int pci_vfs_assigned(struct pci_dev *dev)
--
2.29.2


  reply	other threads:[~2021-02-09 13:35 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-09 13:34 [PATCH mlx5-next v6 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-02-09 13:34 ` Leon Romanovsky [this message]
2021-02-15 21:01   ` [PATCH mlx5-next v6 1/4] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs Bjorn Helgaas
2021-02-16  7:33     ` Leon Romanovsky
2021-02-16 16:12       ` Bjorn Helgaas
2021-02-16 19:58         ` Leon Romanovsky
2021-02-17 18:02           ` Bjorn Helgaas
2021-02-17 19:25             ` Jason Gunthorpe
2021-02-17 20:28               ` Bjorn Helgaas
2021-02-17 23:52                 ` Jason Gunthorpe
2021-02-18 10:15             ` Leon Romanovsky
2021-02-18 22:39               ` Bjorn Helgaas
2021-02-19  7:52                 ` Leon Romanovsky
2021-02-19  8:20                   ` Greg Kroah-Hartman
2021-02-19 16:58                     ` Leon Romanovsky
2021-02-20 19:06                     ` Bjorn Helgaas
2021-02-21  6:59                       ` Leon Romanovsky
2021-02-23 21:07                         ` Bjorn Helgaas
2021-02-24  8:09                           ` Greg Kroah-Hartman
2021-02-24 21:37                             ` Don Dutile
2021-02-24  9:53                           ` Leon Romanovsky
2021-02-24 15:07                             ` Bjorn Helgaas
2021-02-21 13:00                       ` Greg Kroah-Hartman
2021-02-21 13:55                         ` Leon Romanovsky
2021-02-21 15:01                           ` Greg Kroah-Hartman
2021-02-21 15:30                             ` Leon Romanovsky
2021-02-16 20:37         ` Jason Gunthorpe
2021-02-09 13:34 ` [PATCH mlx5-next v6 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-02-09 13:34 ` [PATCH mlx5-next v6 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-02-09 13:34 ` [PATCH mlx5-next v6 4/4] net/mlx5: Allow to the users to configure number of MSI-X vectors Leon Romanovsky
2021-02-09 21:06 ` [PATCH mlx5-next v6 0/4] Dynamically assign MSI-X vectors count Alexander Duyck
2021-02-14  5:24   ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210209133445.700225-2-leon@kernel.org \
    --to=leon@kernel.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=davem@davemloft.net \
    --cc=ddutile@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).