linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count
@ 2021-03-14 12:42 Leon Romanovsky
  2021-03-14 12:42 ` [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-14 12:42 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

---------------------------------------------------------------------------------
Changelog
v8:
 * Added "physical/virtual function" words near PF and VF acronyms.
v7: https://lore.kernel.org/linux-pci/20210301075524.441609-1-leon@kernel.org
 * Rebase on top v5.12-rc1
 * More english fixes
 * Returned to static sysfs creation model as was implemented in v0/v1.
v6: https://lore.kernel.org/linux-pci/20210209133445.700225-1-leon@kernel.org
 * Patch 1:
   * English fixes
   * Moved pci_vf_set_msix_vec_count() from msi.c to iov.c
   * Embedded pci_vf_set_msix_vec_count() into sriov_vf_msix_count_store
   * Deleted sriov_vf_msix_count_show
   * Deleted vfs_overlay folder
   * Renamed functions *_vfs_overlay_* to be *_vf_overlay_*
   * Deleted is_supported and attribute_group because it confused people more than
     it gave advantage.
   * Changed vf_total_msix to be callback
 * Patch 3:
   * Fixed english as suggested by Bjorn
   * Added more explanations to the commit message
 * Patch 4:
   * Protected enable/disable with capability check
v5: https://lore.kernel.org/linux-pci/20210126085730.1165673-1-leon@kernel.org
 * Patch 1:
  * Added forgotten "inline" keyword when declaring empty functions.
v4: https://lore.kernel.org/linux-pci/20210124131119.558563-1-leon@kernel.org
 * Used sysfs_emit() instead of sprintf() in new sysfs entries.
 * Changed EXPORT_SYMBOL to be EXPORT_SYMBOL_GPL for pci_iov_virtfn_devfn().
 * Rewrote sysfs registration code to be driven by PF that wants to enable VF
   overlay instead of creating to all SR-IOV devices.
 * Grouped all such functionality under new "vfs_overlay" folder.
 * Combined two PCI patches into one.
v3: https://lore.kernel.org/linux-pci/20210117081548.1278992-1-leon@kernel.org
 * Renamed pci_set_msix_vec_count to be pci_vf_set_msix_vec_count.
 * Added VF msix_cap check to hide sysfs entry if device doesn't support msix.
 * Changed "-" to be ":" in the mlx5 patch to silence CI warnings about missing
   kdoc description.
 * Split differently error print in mlx5 driver to avoid checkpatch warning.
v2: https://lore.kernel.org/linux-pci/20210114103140.866141-1-leon@kernel.org
 * Patch 1:
  * Renamed vf_msix_vec sysfs knob to be sriov_vf_msix_count
  * Added PF and VF device locks during set MSI-X call to protect from parallel
    driver bind/unbind operations.
  * Removed extra checks when reading sriov_vf_msix, because users will
    be able to distinguish between supported/not supported by looking on
    sriov_vf_total_msix count.
  * Changed all occurrences of "numb" to be "count"
  * Changed returned error from EOPNOTSUPP to be EBUSY if user tries to set
    MSI-X count after driver already bound to the VF.
  * Added extra comment in pci_set_msix_vec_count() to emphasize that driver
    should not be bound.
 * Patch 2:
  * Changed vf_total_msix from int to be u32 and updated function signatures
    accordingly.
  * Improved patch title
v1: https://lore.kernel.org/linux-pci/20210110150727.1965295-1-leon@kernel.org
 * Improved wording and commit messages of first PCI patch
 * Added extra PCI patch to provide total number of MSI-X vectors
 * Prohibited read of vf_msix_vec sysfs file if driver doesn't support write
 * Removed extra function definition in pci.h
v0: https://lore.kernel.org/linux-pci/20210103082440.34994-1-leon@kernel.org

--------------------------------------------------------------------
Hi,

The number of MSI-X vectors is PCI property visible through lspci, that
field is read-only and configured by the device.

The static assignment of an amount of MSI-X vectors doesn't allow utilize
the newly created VF because it is not known to the device the future load
and configuration where that VF will be used.

The VFs are created on the hypervisor and forwarded to the VMs that have
different properties (for example number of CPUs).

To overcome the inefficiency in the spread of such MSI-X vectors, we
allow the kernel to instruct the device with the needed number of such
vectors, before VF is initialized and bounded to the driver.

Before this series:
[root@server ~]# lspci -vs 0000:08:00.2
08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
....
        Capabilities: [9c] MSI-X: Enable- Count=12 Masked-

Configuration script:
1. Start fresh
echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
modprobe -q -r mlx5_ib mlx5_core
2. Ensure that driver doesn't run and it is safe to change MSI-X
echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_drivers_autoprobe
3. Load driver for the PF
modprobe mlx5_core
4. Configure one of the VFs with new number
echo 2 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
echo 21 > /sys/bus/pci/devices/0000\:08\:00.2/sriov_vf_msix_count

After this series:
[root@server ~]# lspci -vs 0000:08:00.2
08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
....
        Capabilities: [9c] MSI-X: Enable- Count=21 Masked-

Thanks

Leon Romanovsky (4):
  PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
  net/mlx5: Add dynamic MSI-X capabilities bits
  net/mlx5: Dynamically assign MSI-X vectors count
  net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks

 Documentation/ABI/testing/sysfs-bus-pci       |  29 +++++
 .../net/ethernet/mellanox/mlx5/core/main.c    |   6 ++
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  12 +++
 .../net/ethernet/mellanox/mlx5/core/pci_irq.c |  73 +++++++++++++
 .../net/ethernet/mellanox/mlx5/core/sriov.c   |  48 ++++++++-
 drivers/pci/iov.c                             | 102 ++++++++++++++++--
 drivers/pci/pci-sysfs.c                       |   3 +-
 drivers/pci/pci.h                             |   3 +-
 include/linux/mlx5/mlx5_ifc.h                 |  11 +-
 include/linux/pci.h                           |   8 ++
 10 files changed, 284 insertions(+), 11 deletions(-)

--
2.30.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
@ 2021-03-14 12:42 ` Leon Romanovsky
  2021-04-03  0:24   ` Bjorn Helgaas
  2021-03-14 12:42 ` [PATCH mlx5-next v8 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-14 12:42 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

A typical cloud provider SR-IOV use case is to create many VFs for use by
guest VMs. The VFs may not be assigned to a VM until a customer requests a
VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors
proportional to the number of CPUs in the VM, but there is no standard way
to change the number of MSI-X vectors supported by a VF.

Some Mellanox ConnectX devices support dynamic assignment of MSI-X vectors
to SR-IOV VFs. This can be done by the PF driver after VFs are enabled,
and it can be done without affecting VFs that are already in use. The
hardware supports a limited pool of MSI-X vectors that can be assigned to
the PF or to individual VFs.  This is device-specific behavior that
requires support in the PF driver.

Add a read-only "sriov_vf_total_msix" sysfs file for the PF and a writable
"sriov_vf_msix_count" file for each VF. Management software may use these
to learn how many MSI-X vectors are available and to dynamically assign
them to VFs before the VFs are passed through to a VM.

If the PF driver implements the ->sriov_get_vf_total_msix() callback,
"sriov_vf_total_msix" contains the total number of MSI-X vectors available
for distribution among VFs.

If no driver is bound to the VF, writing "N" to "sriov_vf_msix_count" uses
the PF driver ->sriov_set_msix_vec_count() callback to assign "N" MSI-X
vectors to the VF.  When a VF driver subsequently reads the MSI-X Message
Control register, it will see the new Table Size "N".

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 Documentation/ABI/testing/sysfs-bus-pci |  29 +++++++
 drivers/pci/iov.c                       | 102 ++++++++++++++++++++++--
 drivers/pci/pci-sysfs.c                 |   3 +-
 drivers/pci/pci.h                       |   3 +-
 include/linux/pci.h                     |   8 ++
 5 files changed, 137 insertions(+), 8 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 25c9c39770c6..606eec8ae4eb 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -375,3 +375,32 @@ Description:
 		The value comes from the PCI kernel device state and can be one
 		of: "unknown", "error", "D0", D1", "D2", "D3hot", "D3cold".
 		The file is read only.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_total_msix
+Date:		January 2021
+Contact:	Leon Romanovsky <leonro@nvidia.com>
+Description:
+		This file is associated with a SR-IOV physical function (PF).
+		It contains the total number of MSI-X vectors available for
+		assignment to all virtual functions (VFs) associated with PF.
+		The value will be zero if the device doesn't support this
+		functionality. For supported devices, the value will be
+		constant and won't be changed after MSI-X vectors assignment.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_msix_count
+Date:		January 2021
+Contact:	Leon Romanovsky <leonro@nvidia.com>
+Description:
+		This file is associated with a SR-IOV virtual function (VF).
+		It allows configuration of the number of MSI-X vectors for
+		the VF. This allows devices that have a global pool of MSI-X
+		vectors to optimally divide them between VFs based on VF usage.
+
+		The values accepted are:
+		 * > 0 - this number will be reported as the Table Size in the
+			 VF's MSI-X capability
+		 * < 0 - not valid
+		 * = 0 - will reset to the device default value
+
+		The file is writable if the PF is bound to a driver that
+		implements ->sriov_set_msix_vec_count().
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4afd4ee4f7f0..9bf6f52ad4d8 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -31,6 +31,7 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
 	return (dev->devfn + dev->sriov->offset +
 		dev->sriov->stride * vf_id) & 0xff;
 }
+EXPORT_SYMBOL_GPL(pci_iov_virtfn_devfn);

 /*
  * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
@@ -157,6 +158,92 @@ int pci_iov_sysfs_link(struct pci_dev *dev,
 	return rc;
 }

+#ifdef CONFIG_PCI_MSI
+static ssize_t sriov_vf_msix_count_store(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t count)
+{
+	struct pci_dev *vf_dev = to_pci_dev(dev);
+	struct pci_dev *pdev = pci_physfn(vf_dev);
+	int val, ret;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val < 0)
+		return -EINVAL;
+
+	device_lock(&pdev->dev);
+	if (!pdev->driver || !pdev->driver->sriov_set_msix_vec_count) {
+		ret = -EOPNOTSUPP;
+		goto err_pdev;
+	}
+
+	device_lock(&vf_dev->dev);
+	if (vf_dev->driver) {
+		/*
+		 * A driver is already attached to this VF and has configured
+		 * itself based on the current MSI-X vector count. Changing
+		 * the vector size could mess up the driver, so block it.
+		 */
+		ret = -EBUSY;
+		goto err_dev;
+	}
+
+	ret = pdev->driver->sriov_set_msix_vec_count(vf_dev, val);
+
+err_dev:
+	device_unlock(&vf_dev->dev);
+err_pdev:
+	device_unlock(&pdev->dev);
+	return ret ? : count;
+}
+static DEVICE_ATTR_WO(sriov_vf_msix_count);
+
+static ssize_t sriov_vf_total_msix_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u32 vf_total_msix = 0;
+
+	device_lock(dev);
+	if (!pdev->driver || !pdev->driver->sriov_get_vf_total_msix)
+		goto unlock;
+
+	vf_total_msix = pdev->driver->sriov_get_vf_total_msix(pdev);
+unlock:
+	device_unlock(dev);
+	return sysfs_emit(buf, "%u\n", vf_total_msix);
+}
+static DEVICE_ATTR_RO(sriov_vf_total_msix);
+#endif
+
+static struct attribute *sriov_vf_dev_attrs[] = {
+#ifdef CONFIG_PCI_MSI
+	&dev_attr_sriov_vf_msix_count.attr,
+#endif
+	NULL,
+};
+
+static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
+					  struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (!pdev->is_virtfn)
+		return 0;
+
+	return a->mode;
+}
+
+const struct attribute_group sriov_vf_dev_attr_group = {
+	.attrs = sriov_vf_dev_attrs,
+	.is_visible = sriov_vf_attrs_are_visible,
+};
+
 int pci_iov_add_virtfn(struct pci_dev *dev, int id)
 {
 	int i;
@@ -400,18 +487,21 @@ static DEVICE_ATTR_RO(sriov_stride);
 static DEVICE_ATTR_RO(sriov_vf_device);
 static DEVICE_ATTR_RW(sriov_drivers_autoprobe);

-static struct attribute *sriov_dev_attrs[] = {
+static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_totalvfs.attr,
 	&dev_attr_sriov_numvfs.attr,
 	&dev_attr_sriov_offset.attr,
 	&dev_attr_sriov_stride.attr,
 	&dev_attr_sriov_vf_device.attr,
 	&dev_attr_sriov_drivers_autoprobe.attr,
+#ifdef CONFIG_PCI_MSI
+	&dev_attr_sriov_vf_total_msix.attr,
+#endif
 	NULL,
 };

-static umode_t sriov_attrs_are_visible(struct kobject *kobj,
-				       struct attribute *a, int n)
+static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
+					  struct attribute *a, int n)
 {
 	struct device *dev = kobj_to_dev(kobj);

@@ -421,9 +511,9 @@ static umode_t sriov_attrs_are_visible(struct kobject *kobj,
 	return a->mode;
 }

-const struct attribute_group sriov_dev_attr_group = {
-	.attrs = sriov_dev_attrs,
-	.is_visible = sriov_attrs_are_visible,
+const struct attribute_group sriov_pf_dev_attr_group = {
+	.attrs = sriov_pf_dev_attrs,
+	.is_visible = sriov_pf_attrs_are_visible,
 };

 int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index f8afd54ca3e1..a6b8fbbba6d2 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1567,7 +1567,8 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
 #ifdef CONFIG_PCI_IOV
-	&sriov_dev_attr_group,
+	&sriov_pf_dev_attr_group,
+	&sriov_vf_dev_attr_group,
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index ef7c4661314f..afb87b917f07 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -501,7 +501,8 @@ void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
 int pci_iov_bus_range(struct pci_bus *bus);
-extern const struct attribute_group sriov_dev_attr_group;
+extern const struct attribute_group sriov_pf_dev_attr_group;
+extern const struct attribute_group sriov_vf_dev_attr_group;
 #else
 static inline int pci_iov_init(struct pci_dev *dev)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97b77..9b575a676888 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -856,6 +856,12 @@ struct module;
  *		e.g. drivers/net/e100.c.
  * @sriov_configure: Optional driver callback to allow configuration of
  *		number of VFs to enable via sysfs "sriov_numvfs" file.
+ * @sriov_set_msix_vec_count: PF Driver callback to change number of MSI-X
+ *              vectors on a VF. Triggered via sysfs "sriov_vf_msix_count".
+ *              This will change MSI-X Table Size in the VF Message Control
+ *              registers.
+ * @sriov_get_vf_total_msix: PF driver callback to get the total number of
+ *              MSI-X vectors available for distribution to the VFs.
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
  * @groups:	Sysfs attribute groups.
  * @driver:	Driver model structure.
@@ -871,6 +877,8 @@ struct pci_driver {
 	int  (*resume)(struct pci_dev *dev);	/* Device woken up */
 	void (*shutdown)(struct pci_dev *dev);
 	int  (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */
+	int  (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */
+	u32  (*sriov_get_vf_total_msix)(struct pci_dev *pf);
 	const struct pci_error_handlers *err_handler;
 	const struct attribute_group **groups;
 	struct device_driver	driver;
--
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH mlx5-next v8 2/4] net/mlx5: Add dynamic MSI-X capabilities bits
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
  2021-03-14 12:42 ` [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
@ 2021-03-14 12:42 ` Leon Romanovsky
  2021-03-14 12:42 ` [PATCH mlx5-next v8 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-14 12:42 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

These new fields declare the number of MSI-X vectors that is possible to
allocate on the VF through PF configuration.

Value must be in range defined by min_dynamic_vf_msix_table_size and
max_dynamic_vf_msix_table_size.

The driver should continue to query its MSI-X table through PCI
configuration header.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 include/linux/mlx5/mlx5_ifc.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index df5d91c8b2d4..c0ce1c2e1e57 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1680,7 +1680,16 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8	   reserved_at_6e0[0x10];
 	u8	   sf_base_id[0x10];

-	u8	   reserved_at_700[0x80];
+	u8	   reserved_at_700[0x8];
+	u8	   num_total_dynamic_vf_msix[0x18];
+	u8	   reserved_at_720[0x14];
+	u8	   dynamic_msix_table_size[0xc];
+	u8	   reserved_at_740[0xc];
+	u8	   min_dynamic_vf_msix_table_size[0x4];
+	u8	   reserved_at_750[0x4];
+	u8	   max_dynamic_vf_msix_table_size[0xc];
+
+	u8	   reserved_at_760[0x20];
 	u8	   vhca_tunnel_commands[0x40];
 	u8	   reserved_at_7c0[0x40];
 };
--
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH mlx5-next v8 3/4] net/mlx5: Dynamically assign MSI-X vectors count
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
  2021-03-14 12:42 ` [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
  2021-03-14 12:42 ` [PATCH mlx5-next v8 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
@ 2021-03-14 12:42 ` Leon Romanovsky
  2021-03-14 12:42 ` [PATCH mlx5-next v8 4/4] net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks Leon Romanovsky
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-14 12:42 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

The number of MSI-X vectors is a PCI property visible through lspci. The
field is read-only and configured by the device. The mlx5 devices work in
a static or dynamic assignment mode.

Static assignment means that all newly created VFs have a preset number of
MSI-X vectors determined by device configuration parameters. This can
result in some VFs having too many or too few MSI-X vectors. Till now this
has been the only means of fine-tuning the MSI-X vector count and it was
acceptable for small numbers of VFs.

With dynamic assignment the inefficiency of having a fixed number of MSI-X
vectors can be avoided with each VF having exactly the required
vectors. Userspace will provide this information while provisioning the VF
for use, based on the intended use. For instance if being used with a VM,
the MSI-X vector count might be matched to the CPU count of the VM.

For compatibility mlx5 continues to start up with MSI-X vector assignment,
but the kernel can now access a larger dynamic vector pool and assign more
vectors to created VFs.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/main.c    |  4 +
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  5 ++
 .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 73 +++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/sriov.c   | 13 +++-
 4 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c568896cfb23..0489712865b7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -571,6 +571,10 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)

 	mlx5_vhca_state_cap_handle(dev, set_hca_cap);

+	if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix))
+		MLX5_SET(cmd_hca_cap, set_hca_cap, num_total_dynamic_vf_msix,
+			 MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix));
+
 	return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE);
 }

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index efe403c7e354..f0aed664dd35 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -174,6 +174,11 @@ int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
 		       struct notifier_block *nb);
 int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx,
 		       struct notifier_block *nb);
+
+int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn,
+			    int msix_vec_count);
+int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
+
 struct cpumask *
 mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx);
 struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index a61e09aff152..19e3e978267e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -61,6 +61,79 @@ static struct mlx5_irq *mlx5_irq_get(struct mlx5_core_dev *dev, int vecidx)
 	return &irq_table->irq[vecidx];
 }

+/**
+ * mlx5_get_default_msix_vec_count - Get the default number of MSI-X vectors
+ *                                   to be ssigned to each VF.
+ * @dev: PF to work on
+ * @num_vfs: Number of enabled VFs
+ */
+int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs)
+{
+	int num_vf_msix, min_msix, max_msix;
+
+	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
+	if (!num_vf_msix)
+		return 0;
+
+	min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
+	max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);
+
+	/* Limit maximum number of MSI-X vectors so the default configuration
+	 * has some available in the pool. This will allow the user to increase
+	 * the number of vectors in a VF without having to first size-down other
+	 * VFs.
+	 */
+	return max(min(num_vf_msix / num_vfs, max_msix / 2), min_msix);
+}
+
+/**
+ * mlx5_set_msix_vec_count - Set dynamically allocated MSI-X on the VF
+ * @dev: PF to work on
+ * @function_id: Internal PCI VF function IDd
+ * @msix_vec_count: Number of MSI-X vectors to set
+ */
+int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int function_id,
+			    int msix_vec_count)
+{
+	int sz = MLX5_ST_SZ_BYTES(set_hca_cap_in);
+	int num_vf_msix, min_msix, max_msix;
+	void *hca_cap, *cap;
+	int ret;
+
+	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
+	if (!num_vf_msix)
+		return 0;
+
+	if (!MLX5_CAP_GEN(dev, vport_group_manager) || !mlx5_core_is_pf(dev))
+		return -EOPNOTSUPP;
+
+	min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
+	max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);
+
+	if (msix_vec_count < min_msix)
+		return -EINVAL;
+
+	if (msix_vec_count > max_msix)
+		return -EOVERFLOW;
+
+	hca_cap = kzalloc(sz, GFP_KERNEL);
+	if (!hca_cap)
+		return -ENOMEM;
+
+	cap = MLX5_ADDR_OF(set_hca_cap_in, hca_cap, capability);
+	MLX5_SET(cmd_hca_cap, cap, dynamic_msix_table_size, msix_vec_count);
+
+	MLX5_SET(set_hca_cap_in, hca_cap, opcode, MLX5_CMD_OP_SET_HCA_CAP);
+	MLX5_SET(set_hca_cap_in, hca_cap, other_function, 1);
+	MLX5_SET(set_hca_cap_in, hca_cap, function_id, function_id);
+
+	MLX5_SET(set_hca_cap_in, hca_cap, op_mod,
+		 MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1);
+	ret = mlx5_cmd_exec_in(dev, set_hca_cap, hca_cap);
+	kfree(hca_cap);
+	return ret;
+}
+
 int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
 		       struct notifier_block *nb)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index 3094d20297a9..f0ec86a1c8a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -71,8 +71,7 @@ static int sriov_restore_guids(struct mlx5_core_dev *dev, int vf)
 static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
 {
 	struct mlx5_core_sriov *sriov = &dev->priv.sriov;
-	int err;
-	int vf;
+	int err, vf, num_msix_count;

 	if (!MLX5_ESWITCH_MANAGER(dev))
 		goto enable_vfs_hca;
@@ -85,12 +84,22 @@ static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
 	}

 enable_vfs_hca:
+	num_msix_count = mlx5_get_default_msix_vec_count(dev, num_vfs);
 	for (vf = 0; vf < num_vfs; vf++) {
 		err = mlx5_core_enable_hca(dev, vf + 1);
 		if (err) {
 			mlx5_core_warn(dev, "failed to enable VF %d (%d)\n", vf, err);
 			continue;
 		}
+
+		err = mlx5_set_msix_vec_count(dev, vf + 1, num_msix_count);
+		if (err) {
+			mlx5_core_warn(dev,
+				       "failed to set MSI-X vector counts VF %d, err %d\n",
+				       vf, err);
+			continue;
+		}
+
 		sriov->vfs_ctx[vf].enabled = 1;
 		if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_IB) {
 			err = sriov_restore_guids(dev, vf);
--
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH mlx5-next v8 4/4] net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
                   ` (2 preceding siblings ...)
  2021-03-14 12:42 ` [PATCH mlx5-next v8 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
@ 2021-03-14 12:42 ` Leon Romanovsky
  2021-03-20  9:13 ` [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-14 12:42 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

The mlx5 implementation executes a firmware command on the PF to change
the configuration of the selected VF.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/main.c    |  2 ++
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  7 ++++
 .../net/ethernet/mellanox/mlx5/core/sriov.c   | 35 +++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 0489712865b7..edca6bc87639 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1700,6 +1700,8 @@ static struct pci_driver mlx5_core_driver = {
 	.shutdown	= shutdown,
 	.err_handler	= &mlx5_err_handler,
 	.sriov_configure   = mlx5_core_sriov_configure,
+	.sriov_get_vf_total_msix = mlx5_sriov_get_vf_total_msix,
+	.sriov_set_msix_vec_count = mlx5_core_sriov_set_msix_vec_count,
 };

 static void mlx5_core_verify_params(void)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index f0aed664dd35..99007f2d0424 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -140,6 +140,7 @@ void mlx5_sriov_cleanup(struct mlx5_core_dev *dev);
 int mlx5_sriov_attach(struct mlx5_core_dev *dev);
 void mlx5_sriov_detach(struct mlx5_core_dev *dev);
 int mlx5_core_sriov_configure(struct pci_dev *dev, int num_vfs);
+int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count);
 int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id);
 int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id);
 int mlx5_create_scheduling_element_cmd(struct mlx5_core_dev *dev, u8 hierarchy,
@@ -278,4 +279,10 @@ int mlx5_load_one(struct mlx5_core_dev *dev, bool boot);
 int mlx5_vport_get_other_func_cap(struct mlx5_core_dev *dev, u16 function_id, void *out);

 void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work);
+static inline u32 mlx5_sriov_get_vf_total_msix(struct pci_dev *pdev)
+{
+	struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+
+	return MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
+}
 #endif /* __MLX5_CORE_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index f0ec86a1c8a6..2338989d4403 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -187,6 +187,41 @@ int mlx5_core_sriov_configure(struct pci_dev *pdev, int num_vfs)
 	return err ? err : num_vfs;
 }

+int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count)
+{
+	struct pci_dev *pf = pci_physfn(vf);
+	struct mlx5_core_sriov *sriov;
+	struct mlx5_core_dev *dev;
+	int num_vf_msix, id;
+
+	dev = pci_get_drvdata(pf);
+	num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
+	if (!num_vf_msix)
+		return -EOPNOTSUPP;
+
+	if (!msix_vec_count)
+		msix_vec_count =
+			mlx5_get_default_msix_vec_count(dev, pci_num_vf(pf));
+
+	sriov = &dev->priv.sriov;
+
+	/* Reversed translation of PCI VF function number to the internal
+	 * function_id, which exists in the name of virtfn symlink.
+	 */
+	for (id = 0; id < pci_num_vf(pf); id++) {
+		if (!sriov->vfs_ctx[id].enabled)
+			continue;
+
+		if (vf->devfn == pci_iov_virtfn_devfn(pf, id))
+			break;
+	}
+
+	if (id == pci_num_vf(pf) || !sriov->vfs_ctx[id].enabled)
+		return -EINVAL;
+
+	return mlx5_set_msix_vec_count(dev, id + 1, msix_vec_count);
+}
+
 int mlx5_sriov_attach(struct mlx5_core_dev *dev)
 {
 	if (!mlx5_core_is_pf(dev) || !pci_num_vf(dev->pdev))
--
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
                   ` (3 preceding siblings ...)
  2021-03-14 12:42 ` [PATCH mlx5-next v8 4/4] net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks Leon Romanovsky
@ 2021-03-20  9:13 ` Leon Romanovsky
  2021-03-25  7:57   ` Leon Romanovsky
  2021-04-03  0:25 ` Bjorn Helgaas
  2021-04-04  7:33 ` Leon Romanovsky
  6 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-20  9:13 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

Gentle reminder

Thanks

On Sun, Mar 14, 2021 at 02:42:52PM +0200, Leon Romanovsky wrote:
> ---------------------------------------------------------------------------------
> Changelog
> v8:
>  * Added "physical/virtual function" words near PF and VF acronyms.
> v7: https://lore.kernel.org/linux-pci/20210301075524.441609-1-leon@kernel.org
>  * Rebase on top v5.12-rc1
>  * More english fixes
>  * Returned to static sysfs creation model as was implemented in v0/v1.
> v6: https://lore.kernel.org/linux-pci/20210209133445.700225-1-leon@kernel.org
>  * Patch 1:
>    * English fixes
>    * Moved pci_vf_set_msix_vec_count() from msi.c to iov.c
>    * Embedded pci_vf_set_msix_vec_count() into sriov_vf_msix_count_store
>    * Deleted sriov_vf_msix_count_show
>    * Deleted vfs_overlay folder
>    * Renamed functions *_vfs_overlay_* to be *_vf_overlay_*
>    * Deleted is_supported and attribute_group because it confused people more than
>      it gave advantage.
>    * Changed vf_total_msix to be callback
>  * Patch 3:
>    * Fixed english as suggested by Bjorn
>    * Added more explanations to the commit message
>  * Patch 4:
>    * Protected enable/disable with capability check
> v5: https://lore.kernel.org/linux-pci/20210126085730.1165673-1-leon@kernel.org
>  * Patch 1:
>   * Added forgotten "inline" keyword when declaring empty functions.
> v4: https://lore.kernel.org/linux-pci/20210124131119.558563-1-leon@kernel.org
>  * Used sysfs_emit() instead of sprintf() in new sysfs entries.
>  * Changed EXPORT_SYMBOL to be EXPORT_SYMBOL_GPL for pci_iov_virtfn_devfn().
>  * Rewrote sysfs registration code to be driven by PF that wants to enable VF
>    overlay instead of creating to all SR-IOV devices.
>  * Grouped all such functionality under new "vfs_overlay" folder.
>  * Combined two PCI patches into one.
> v3: https://lore.kernel.org/linux-pci/20210117081548.1278992-1-leon@kernel.org
>  * Renamed pci_set_msix_vec_count to be pci_vf_set_msix_vec_count.
>  * Added VF msix_cap check to hide sysfs entry if device doesn't support msix.
>  * Changed "-" to be ":" in the mlx5 patch to silence CI warnings about missing
>    kdoc description.
>  * Split differently error print in mlx5 driver to avoid checkpatch warning.
> v2: https://lore.kernel.org/linux-pci/20210114103140.866141-1-leon@kernel.org
>  * Patch 1:
>   * Renamed vf_msix_vec sysfs knob to be sriov_vf_msix_count
>   * Added PF and VF device locks during set MSI-X call to protect from parallel
>     driver bind/unbind operations.
>   * Removed extra checks when reading sriov_vf_msix, because users will
>     be able to distinguish between supported/not supported by looking on
>     sriov_vf_total_msix count.
>   * Changed all occurrences of "numb" to be "count"
>   * Changed returned error from EOPNOTSUPP to be EBUSY if user tries to set
>     MSI-X count after driver already bound to the VF.
>   * Added extra comment in pci_set_msix_vec_count() to emphasize that driver
>     should not be bound.
>  * Patch 2:
>   * Changed vf_total_msix from int to be u32 and updated function signatures
>     accordingly.
>   * Improved patch title
> v1: https://lore.kernel.org/linux-pci/20210110150727.1965295-1-leon@kernel.org
>  * Improved wording and commit messages of first PCI patch
>  * Added extra PCI patch to provide total number of MSI-X vectors
>  * Prohibited read of vf_msix_vec sysfs file if driver doesn't support write
>  * Removed extra function definition in pci.h
> v0: https://lore.kernel.org/linux-pci/20210103082440.34994-1-leon@kernel.org
> 
> --------------------------------------------------------------------
> Hi,
> 
> The number of MSI-X vectors is PCI property visible through lspci, that
> field is read-only and configured by the device.
> 
> The static assignment of an amount of MSI-X vectors doesn't allow utilize
> the newly created VF because it is not known to the device the future load
> and configuration where that VF will be used.
> 
> The VFs are created on the hypervisor and forwarded to the VMs that have
> different properties (for example number of CPUs).
> 
> To overcome the inefficiency in the spread of such MSI-X vectors, we
> allow the kernel to instruct the device with the needed number of such
> vectors, before VF is initialized and bounded to the driver.
> 
> Before this series:
> [root@server ~]# lspci -vs 0000:08:00.2
> 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> ....
>         Capabilities: [9c] MSI-X: Enable- Count=12 Masked-
> 
> Configuration script:
> 1. Start fresh
> echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
> modprobe -q -r mlx5_ib mlx5_core
> 2. Ensure that driver doesn't run and it is safe to change MSI-X
> echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_drivers_autoprobe
> 3. Load driver for the PF
> modprobe mlx5_core
> 4. Configure one of the VFs with new number
> echo 2 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
> echo 21 > /sys/bus/pci/devices/0000\:08\:00.2/sriov_vf_msix_count
> 
> After this series:
> [root@server ~]# lspci -vs 0000:08:00.2
> 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> ....
>         Capabilities: [9c] MSI-X: Enable- Count=21 Masked-
> 
> Thanks
> 
> Leon Romanovsky (4):
>   PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
>   net/mlx5: Add dynamic MSI-X capabilities bits
>   net/mlx5: Dynamically assign MSI-X vectors count
>   net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
> 
>  Documentation/ABI/testing/sysfs-bus-pci       |  29 +++++
>  .../net/ethernet/mellanox/mlx5/core/main.c    |   6 ++
>  .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  12 +++
>  .../net/ethernet/mellanox/mlx5/core/pci_irq.c |  73 +++++++++++++
>  .../net/ethernet/mellanox/mlx5/core/sriov.c   |  48 ++++++++-
>  drivers/pci/iov.c                             | 102 ++++++++++++++++--
>  drivers/pci/pci-sysfs.c                       |   3 +-
>  drivers/pci/pci.h                             |   3 +-
>  include/linux/mlx5/mlx5_ifc.h                 |  11 +-
>  include/linux/pci.h                           |   8 ++
>  10 files changed, 284 insertions(+), 11 deletions(-)
> 
> --
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count
  2021-03-20  9:13 ` [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
@ 2021-03-25  7:57   ` Leon Romanovsky
  0 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-03-25  7:57 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed, Jakub Kicinski, David S . Miller
  Cc: Jason Gunthorpe, Alexander Duyck, linux-pci, linux-rdma, netdev,
	Don Dutile, Alex Williamson, Greg Kroah-Hartman

Bjorn,

Can you please help us progress with the series?

It needs to go from mlx5-next to net-next and maybe to rdma-next too
and we need enough time to keep the series for automatic testing in
all those branches.

The code was posted on -rc1 [1] while we are in -rc4 already.

[1] https://lore.kernel.org/linux-pci/20210301075524.441609-1-leon@kernel.org/

Thanks

On Sat, Mar 20, 2021 at 11:13:44AM +0200, Leon Romanovsky wrote:
> Gentle reminder
> 
> Thanks
> 
> On Sun, Mar 14, 2021 at 02:42:52PM +0200, Leon Romanovsky wrote:
> > ---------------------------------------------------------------------------------
> > Changelog
> > v8:
> >  * Added "physical/virtual function" words near PF and VF acronyms.
> > v7: https://lore.kernel.org/linux-pci/20210301075524.441609-1-leon@kernel.org
> >  * Rebase on top v5.12-rc1
> >  * More english fixes
> >  * Returned to static sysfs creation model as was implemented in v0/v1.
> > v6: https://lore.kernel.org/linux-pci/20210209133445.700225-1-leon@kernel.org
> >  * Patch 1:
> >    * English fixes
> >    * Moved pci_vf_set_msix_vec_count() from msi.c to iov.c
> >    * Embedded pci_vf_set_msix_vec_count() into sriov_vf_msix_count_store
> >    * Deleted sriov_vf_msix_count_show
> >    * Deleted vfs_overlay folder
> >    * Renamed functions *_vfs_overlay_* to be *_vf_overlay_*
> >    * Deleted is_supported and attribute_group because it confused people more than
> >      it gave advantage.
> >    * Changed vf_total_msix to be callback
> >  * Patch 3:
> >    * Fixed english as suggested by Bjorn
> >    * Added more explanations to the commit message
> >  * Patch 4:
> >    * Protected enable/disable with capability check
> > v5: https://lore.kernel.org/linux-pci/20210126085730.1165673-1-leon@kernel.org
> >  * Patch 1:
> >   * Added forgotten "inline" keyword when declaring empty functions.
> > v4: https://lore.kernel.org/linux-pci/20210124131119.558563-1-leon@kernel.org
> >  * Used sysfs_emit() instead of sprintf() in new sysfs entries.
> >  * Changed EXPORT_SYMBOL to be EXPORT_SYMBOL_GPL for pci_iov_virtfn_devfn().
> >  * Rewrote sysfs registration code to be driven by PF that wants to enable VF
> >    overlay instead of creating to all SR-IOV devices.
> >  * Grouped all such functionality under new "vfs_overlay" folder.
> >  * Combined two PCI patches into one.
> > v3: https://lore.kernel.org/linux-pci/20210117081548.1278992-1-leon@kernel.org
> >  * Renamed pci_set_msix_vec_count to be pci_vf_set_msix_vec_count.
> >  * Added VF msix_cap check to hide sysfs entry if device doesn't support msix.
> >  * Changed "-" to be ":" in the mlx5 patch to silence CI warnings about missing
> >    kdoc description.
> >  * Split differently error print in mlx5 driver to avoid checkpatch warning.
> > v2: https://lore.kernel.org/linux-pci/20210114103140.866141-1-leon@kernel.org
> >  * Patch 1:
> >   * Renamed vf_msix_vec sysfs knob to be sriov_vf_msix_count
> >   * Added PF and VF device locks during set MSI-X call to protect from parallel
> >     driver bind/unbind operations.
> >   * Removed extra checks when reading sriov_vf_msix, because users will
> >     be able to distinguish between supported/not supported by looking on
> >     sriov_vf_total_msix count.
> >   * Changed all occurrences of "numb" to be "count"
> >   * Changed returned error from EOPNOTSUPP to be EBUSY if user tries to set
> >     MSI-X count after driver already bound to the VF.
> >   * Added extra comment in pci_set_msix_vec_count() to emphasize that driver
> >     should not be bound.
> >  * Patch 2:
> >   * Changed vf_total_msix from int to be u32 and updated function signatures
> >     accordingly.
> >   * Improved patch title
> > v1: https://lore.kernel.org/linux-pci/20210110150727.1965295-1-leon@kernel.org
> >  * Improved wording and commit messages of first PCI patch
> >  * Added extra PCI patch to provide total number of MSI-X vectors
> >  * Prohibited read of vf_msix_vec sysfs file if driver doesn't support write
> >  * Removed extra function definition in pci.h
> > v0: https://lore.kernel.org/linux-pci/20210103082440.34994-1-leon@kernel.org
> > 
> > --------------------------------------------------------------------
> > Hi,
> > 
> > The number of MSI-X vectors is PCI property visible through lspci, that
> > field is read-only and configured by the device.
> > 
> > The static assignment of an amount of MSI-X vectors doesn't allow utilize
> > the newly created VF because it is not known to the device the future load
> > and configuration where that VF will be used.
> > 
> > The VFs are created on the hypervisor and forwarded to the VMs that have
> > different properties (for example number of CPUs).
> > 
> > To overcome the inefficiency in the spread of such MSI-X vectors, we
> > allow the kernel to instruct the device with the needed number of such
> > vectors, before VF is initialized and bounded to the driver.
> > 
> > Before this series:
> > [root@server ~]# lspci -vs 0000:08:00.2
> > 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> > ....
> >         Capabilities: [9c] MSI-X: Enable- Count=12 Masked-
> > 
> > Configuration script:
> > 1. Start fresh
> > echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
> > modprobe -q -r mlx5_ib mlx5_core
> > 2. Ensure that driver doesn't run and it is safe to change MSI-X
> > echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_drivers_autoprobe
> > 3. Load driver for the PF
> > modprobe mlx5_core
> > 4. Configure one of the VFs with new number
> > echo 2 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
> > echo 21 > /sys/bus/pci/devices/0000\:08\:00.2/sriov_vf_msix_count
> > 
> > After this series:
> > [root@server ~]# lspci -vs 0000:08:00.2
> > 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> > ....
> >         Capabilities: [9c] MSI-X: Enable- Count=21 Masked-
> > 
> > Thanks
> > 
> > Leon Romanovsky (4):
> >   PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
> >   net/mlx5: Add dynamic MSI-X capabilities bits
> >   net/mlx5: Dynamically assign MSI-X vectors count
> >   net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
> > 
> >  Documentation/ABI/testing/sysfs-bus-pci       |  29 +++++
> >  .../net/ethernet/mellanox/mlx5/core/main.c    |   6 ++
> >  .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  12 +++
> >  .../net/ethernet/mellanox/mlx5/core/pci_irq.c |  73 +++++++++++++
> >  .../net/ethernet/mellanox/mlx5/core/sriov.c   |  48 ++++++++-
> >  drivers/pci/iov.c                             | 102 ++++++++++++++++--
> >  drivers/pci/pci-sysfs.c                       |   3 +-
> >  drivers/pci/pci.h                             |   3 +-
> >  include/linux/mlx5/mlx5_ifc.h                 |  11 +-
> >  include/linux/pci.h                           |   8 ++
> >  10 files changed, 284 insertions(+), 11 deletions(-)
> > 
> > --
> > 2.30.2
> > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
  2021-03-14 12:42 ` [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
@ 2021-04-03  0:24   ` Bjorn Helgaas
  2021-04-04  7:02     ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2021-04-03  0:24 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Bjorn Helgaas, Saeed Mahameed, Jason Gunthorpe, Alexander Duyck,
	Jakub Kicinski, linux-pci, linux-rdma, netdev, Don Dutile,
	Alex Williamson, David S . Miller, Greg Kroah-Hartman

Possible subject, since this adds *two* files, not just "a file":

  PCI/IOV: Add sysfs MSI-X vector assignment interface

On Sun, Mar 14, 2021 at 02:42:53PM +0200, Leon Romanovsky wrote:
> A typical cloud provider SR-IOV use case is to create many VFs for use by
> guest VMs. The VFs may not be assigned to a VM until a customer requests a
> VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors
> proportional to the number of CPUs in the VM, but there is no standard way
> to change the number of MSI-X vectors supported by a VF.
> ...

> +#ifdef CONFIG_PCI_MSI
> +static ssize_t sriov_vf_msix_count_store(struct device *dev,
> +					 struct device_attribute *attr,
> +					 const char *buf, size_t count)
> +{
> +	struct pci_dev *vf_dev = to_pci_dev(dev);
> +	struct pci_dev *pdev = pci_physfn(vf_dev);
> +	int val, ret;
> +
> +	ret = kstrtoint(buf, 0, &val);
> +	if (ret)
> +		return ret;
> +
> +	if (val < 0)
> +		return -EINVAL;
> +
> +	device_lock(&pdev->dev);
> +	if (!pdev->driver || !pdev->driver->sriov_set_msix_vec_count) {
> +		ret = -EOPNOTSUPP;
> +		goto err_pdev;
> +	}
> +
> +	device_lock(&vf_dev->dev);
> +	if (vf_dev->driver) {
> +		/*
> +		 * A driver is already attached to this VF and has configured
> +		 * itself based on the current MSI-X vector count. Changing
> +		 * the vector size could mess up the driver, so block it.
> +		 */
> +		ret = -EBUSY;
> +		goto err_dev;
> +	}
> +
> +	ret = pdev->driver->sriov_set_msix_vec_count(vf_dev, val);
> +
> +err_dev:
> +	device_unlock(&vf_dev->dev);
> +err_pdev:
> +	device_unlock(&pdev->dev);
> +	return ret ? : count;
> +}
> +static DEVICE_ATTR_WO(sriov_vf_msix_count);
> +
> +static ssize_t sriov_vf_total_msix_show(struct device *dev,
> +					struct device_attribute *attr,
> +					char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	u32 vf_total_msix = 0;
> +
> +	device_lock(dev);
> +	if (!pdev->driver || !pdev->driver->sriov_get_vf_total_msix)
> +		goto unlock;
> +
> +	vf_total_msix = pdev->driver->sriov_get_vf_total_msix(pdev);
> +unlock:
> +	device_unlock(dev);
> +	return sysfs_emit(buf, "%u\n", vf_total_msix);
> +}
> +static DEVICE_ATTR_RO(sriov_vf_total_msix);

Can you reverse the order of sriov_vf_total_msix_show() and
sriov_vf_msix_count_store()?  Currently we have:

  VF stuff (msix_count_store)
  PF stuff (total_msix)
  more VF stuff related to the above (vf_dev_attrs, are_visible)

so the total_msix bit is mixed in the middle.

> +#endif
> +
> +static struct attribute *sriov_vf_dev_attrs[] = {
> +#ifdef CONFIG_PCI_MSI
> +	&dev_attr_sriov_vf_msix_count.attr,
> +#endif
> +	NULL,
> +};
> +
> +static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
> +					  struct attribute *a, int n)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	if (!pdev->is_virtfn)
> +		return 0;
> +
> +	return a->mode;
> +}
> +
> +const struct attribute_group sriov_vf_dev_attr_group = {
> +	.attrs = sriov_vf_dev_attrs,
> +	.is_visible = sriov_vf_attrs_are_visible,
> +};
> +
>  int pci_iov_add_virtfn(struct pci_dev *dev, int id)
>  {
>  	int i;
> @@ -400,18 +487,21 @@ static DEVICE_ATTR_RO(sriov_stride);
>  static DEVICE_ATTR_RO(sriov_vf_device);
>  static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
> 
> -static struct attribute *sriov_dev_attrs[] = {
> +static struct attribute *sriov_pf_dev_attrs[] = {

This and the related sriov_pf_attrs_are_visible change below are nice.
Would you mind splitting them to a preliminary patch, since they
really aren't related to the concept of *this* patch?

>  	&dev_attr_sriov_totalvfs.attr,
>  	&dev_attr_sriov_numvfs.attr,
>  	&dev_attr_sriov_offset.attr,
>  	&dev_attr_sriov_stride.attr,
>  	&dev_attr_sriov_vf_device.attr,
>  	&dev_attr_sriov_drivers_autoprobe.attr,
> +#ifdef CONFIG_PCI_MSI
> +	&dev_attr_sriov_vf_total_msix.attr,
> +#endif
>  	NULL,
>  };
> 
> -static umode_t sriov_attrs_are_visible(struct kobject *kobj,
> -				       struct attribute *a, int n)
> +static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
> +					  struct attribute *a, int n)
>  {
>  	struct device *dev = kobj_to_dev(kobj);
> 
> @@ -421,9 +511,9 @@ static umode_t sriov_attrs_are_visible(struct kobject *kobj,
>  	return a->mode;
>  }
> 
> -const struct attribute_group sriov_dev_attr_group = {
> -	.attrs = sriov_dev_attrs,
> -	.is_visible = sriov_attrs_are_visible,
> +const struct attribute_group sriov_pf_dev_attr_group = {
> +	.attrs = sriov_pf_dev_attrs,
> +	.is_visible = sriov_pf_attrs_are_visible,
>  };

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
                   ` (4 preceding siblings ...)
  2021-03-20  9:13 ` [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
@ 2021-04-03  0:25 ` Bjorn Helgaas
  2021-04-04  7:33 ` Leon Romanovsky
  6 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2021-04-03  0:25 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Bjorn Helgaas, Saeed Mahameed, Jason Gunthorpe, Alexander Duyck,
	Jakub Kicinski, linux-pci, linux-rdma, netdev, Don Dutile,
	Alex Williamson, David S . Miller, Greg Kroah-Hartman

On Sun, Mar 14, 2021 at 02:42:52PM +0200, Leon Romanovsky wrote:
> ---------------------------------------------------------------------------------
> Changelog
> v8:
>  * Added "physical/virtual function" words near PF and VF acronyms.
> v7: https://lore.kernel.org/linux-pci/20210301075524.441609-1-leon@kernel.org
>  * Rebase on top v5.12-rc1
>  * More english fixes
>  * Returned to static sysfs creation model as was implemented in v0/v1.
> v6: https://lore.kernel.org/linux-pci/20210209133445.700225-1-leon@kernel.org
>  * Patch 1:
>    * English fixes
>    * Moved pci_vf_set_msix_vec_count() from msi.c to iov.c
>    * Embedded pci_vf_set_msix_vec_count() into sriov_vf_msix_count_store
>    * Deleted sriov_vf_msix_count_show
>    * Deleted vfs_overlay folder
>    * Renamed functions *_vfs_overlay_* to be *_vf_overlay_*
>    * Deleted is_supported and attribute_group because it confused people more than
>      it gave advantage.
>    * Changed vf_total_msix to be callback
>  * Patch 3:
>    * Fixed english as suggested by Bjorn
>    * Added more explanations to the commit message
>  * Patch 4:
>    * Protected enable/disable with capability check
> v5: https://lore.kernel.org/linux-pci/20210126085730.1165673-1-leon@kernel.org
>  * Patch 1:
>   * Added forgotten "inline" keyword when declaring empty functions.
> v4: https://lore.kernel.org/linux-pci/20210124131119.558563-1-leon@kernel.org
>  * Used sysfs_emit() instead of sprintf() in new sysfs entries.
>  * Changed EXPORT_SYMBOL to be EXPORT_SYMBOL_GPL for pci_iov_virtfn_devfn().
>  * Rewrote sysfs registration code to be driven by PF that wants to enable VF
>    overlay instead of creating to all SR-IOV devices.
>  * Grouped all such functionality under new "vfs_overlay" folder.
>  * Combined two PCI patches into one.
> v3: https://lore.kernel.org/linux-pci/20210117081548.1278992-1-leon@kernel.org
>  * Renamed pci_set_msix_vec_count to be pci_vf_set_msix_vec_count.
>  * Added VF msix_cap check to hide sysfs entry if device doesn't support msix.
>  * Changed "-" to be ":" in the mlx5 patch to silence CI warnings about missing
>    kdoc description.
>  * Split differently error print in mlx5 driver to avoid checkpatch warning.
> v2: https://lore.kernel.org/linux-pci/20210114103140.866141-1-leon@kernel.org
>  * Patch 1:
>   * Renamed vf_msix_vec sysfs knob to be sriov_vf_msix_count
>   * Added PF and VF device locks during set MSI-X call to protect from parallel
>     driver bind/unbind operations.
>   * Removed extra checks when reading sriov_vf_msix, because users will
>     be able to distinguish between supported/not supported by looking on
>     sriov_vf_total_msix count.
>   * Changed all occurrences of "numb" to be "count"
>   * Changed returned error from EOPNOTSUPP to be EBUSY if user tries to set
>     MSI-X count after driver already bound to the VF.
>   * Added extra comment in pci_set_msix_vec_count() to emphasize that driver
>     should not be bound.
>  * Patch 2:
>   * Changed vf_total_msix from int to be u32 and updated function signatures
>     accordingly.
>   * Improved patch title
> v1: https://lore.kernel.org/linux-pci/20210110150727.1965295-1-leon@kernel.org
>  * Improved wording and commit messages of first PCI patch
>  * Added extra PCI patch to provide total number of MSI-X vectors
>  * Prohibited read of vf_msix_vec sysfs file if driver doesn't support write
>  * Removed extra function definition in pci.h
> v0: https://lore.kernel.org/linux-pci/20210103082440.34994-1-leon@kernel.org
> 
> --------------------------------------------------------------------
> Hi,
> 
> The number of MSI-X vectors is PCI property visible through lspci, that
> field is read-only and configured by the device.
> 
> The static assignment of an amount of MSI-X vectors doesn't allow utilize
> the newly created VF because it is not known to the device the future load
> and configuration where that VF will be used.
> 
> The VFs are created on the hypervisor and forwarded to the VMs that have
> different properties (for example number of CPUs).
> 
> To overcome the inefficiency in the spread of such MSI-X vectors, we
> allow the kernel to instruct the device with the needed number of such
> vectors, before VF is initialized and bounded to the driver.
> 
> Before this series:
> [root@server ~]# lspci -vs 0000:08:00.2
> 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> ....
>         Capabilities: [9c] MSI-X: Enable- Count=12 Masked-
> 
> Configuration script:
> 1. Start fresh
> echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
> modprobe -q -r mlx5_ib mlx5_core
> 2. Ensure that driver doesn't run and it is safe to change MSI-X
> echo 0 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_drivers_autoprobe
> 3. Load driver for the PF
> modprobe mlx5_core
> 4. Configure one of the VFs with new number
> echo 2 > /sys/bus/pci/devices/0000\:08\:00.0/sriov_numvfs
> echo 21 > /sys/bus/pci/devices/0000\:08\:00.2/sriov_vf_msix_count
> 
> After this series:
> [root@server ~]# lspci -vs 0000:08:00.2
> 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
> ....
>         Capabilities: [9c] MSI-X: Enable- Count=21 Masked-
> 
> Thanks
> 
> Leon Romanovsky (4):
>   PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
>   net/mlx5: Add dynamic MSI-X capabilities bits
>   net/mlx5: Dynamically assign MSI-X vectors count
>   net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
> 
>  Documentation/ABI/testing/sysfs-bus-pci       |  29 +++++
>  .../net/ethernet/mellanox/mlx5/core/main.c    |   6 ++
>  .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  12 +++
>  .../net/ethernet/mellanox/mlx5/core/pci_irq.c |  73 +++++++++++++
>  .../net/ethernet/mellanox/mlx5/core/sriov.c   |  48 ++++++++-
>  drivers/pci/iov.c                             | 102 ++++++++++++++++--
>  drivers/pci/pci-sysfs.c                       |   3 +-
>  drivers/pci/pci.h                             |   3 +-
>  include/linux/mlx5/mlx5_ifc.h                 |  11 +-
>  include/linux/pci.h                           |   8 ++
>  10 files changed, 284 insertions(+), 11 deletions(-)

Looks good to me, thanks for persevering with this.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Minor comments on 1/4, not critical.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
  2021-04-03  0:24   ` Bjorn Helgaas
@ 2021-04-04  7:02     ` Leon Romanovsky
  0 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-04-04  7:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, Saeed Mahameed, Jason Gunthorpe, Alexander Duyck,
	Jakub Kicinski, linux-pci, linux-rdma, netdev, Don Dutile,
	Alex Williamson, David S . Miller, Greg Kroah-Hartman

On Fri, Apr 02, 2021 at 07:24:26PM -0500, Bjorn Helgaas wrote:
> Possible subject, since this adds *two* files, not just "a file":
> 
>   PCI/IOV: Add sysfs MSI-X vector assignment interface

Sure

> 
> On Sun, Mar 14, 2021 at 02:42:53PM +0200, Leon Romanovsky wrote:
> > A typical cloud provider SR-IOV use case is to create many VFs for use by
> > guest VMs. The VFs may not be assigned to a VM until a customer requests a
> > VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors
> > proportional to the number of CPUs in the VM, but there is no standard way
> > to change the number of MSI-X vectors supported by a VF.
> > ...
> 
> > +#ifdef CONFIG_PCI_MSI
> > +static ssize_t sriov_vf_msix_count_store(struct device *dev,
> > +					 struct device_attribute *attr,
> > +					 const char *buf, size_t count)
> > +{
> > +	struct pci_dev *vf_dev = to_pci_dev(dev);
> > +	struct pci_dev *pdev = pci_physfn(vf_dev);
> > +	int val, ret;
> > +
> > +	ret = kstrtoint(buf, 0, &val);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (val < 0)
> > +		return -EINVAL;
> > +
> > +	device_lock(&pdev->dev);
> > +	if (!pdev->driver || !pdev->driver->sriov_set_msix_vec_count) {
> > +		ret = -EOPNOTSUPP;
> > +		goto err_pdev;
> > +	}
> > +
> > +	device_lock(&vf_dev->dev);
> > +	if (vf_dev->driver) {
> > +		/*
> > +		 * A driver is already attached to this VF and has configured
> > +		 * itself based on the current MSI-X vector count. Changing
> > +		 * the vector size could mess up the driver, so block it.
> > +		 */
> > +		ret = -EBUSY;
> > +		goto err_dev;
> > +	}
> > +
> > +	ret = pdev->driver->sriov_set_msix_vec_count(vf_dev, val);
> > +
> > +err_dev:
> > +	device_unlock(&vf_dev->dev);
> > +err_pdev:
> > +	device_unlock(&pdev->dev);
> > +	return ret ? : count;
> > +}
> > +static DEVICE_ATTR_WO(sriov_vf_msix_count);
> > +
> > +static ssize_t sriov_vf_total_msix_show(struct device *dev,
> > +					struct device_attribute *attr,
> > +					char *buf)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	u32 vf_total_msix = 0;
> > +
> > +	device_lock(dev);
> > +	if (!pdev->driver || !pdev->driver->sriov_get_vf_total_msix)
> > +		goto unlock;
> > +
> > +	vf_total_msix = pdev->driver->sriov_get_vf_total_msix(pdev);
> > +unlock:
> > +	device_unlock(dev);
> > +	return sysfs_emit(buf, "%u\n", vf_total_msix);
> > +}
> > +static DEVICE_ATTR_RO(sriov_vf_total_msix);
> 
> Can you reverse the order of sriov_vf_total_msix_show() and
> sriov_vf_msix_count_store()?  Currently we have:
> 
>   VF stuff (msix_count_store)
>   PF stuff (total_msix)
>   more VF stuff related to the above (vf_dev_attrs, are_visible)
> 
> so the total_msix bit is mixed in the middle.

No problem, I'll do.

> 
> > +#endif
> > +
> > +static struct attribute *sriov_vf_dev_attrs[] = {
> > +#ifdef CONFIG_PCI_MSI
> > +	&dev_attr_sriov_vf_msix_count.attr,
> > +#endif
> > +	NULL,
> > +};
> > +
> > +static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
> > +					  struct attribute *a, int n)
> > +{
> > +	struct device *dev = kobj_to_dev(kobj);
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > +	if (!pdev->is_virtfn)
> > +		return 0;
> > +
> > +	return a->mode;
> > +}
> > +
> > +const struct attribute_group sriov_vf_dev_attr_group = {
> > +	.attrs = sriov_vf_dev_attrs,
> > +	.is_visible = sriov_vf_attrs_are_visible,
> > +};
> > +
> >  int pci_iov_add_virtfn(struct pci_dev *dev, int id)
> >  {
> >  	int i;
> > @@ -400,18 +487,21 @@ static DEVICE_ATTR_RO(sriov_stride);
> >  static DEVICE_ATTR_RO(sriov_vf_device);
> >  static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
> > 
> > -static struct attribute *sriov_dev_attrs[] = {
> > +static struct attribute *sriov_pf_dev_attrs[] = {
> 
> This and the related sriov_pf_attrs_are_visible change below are nice.
> Would you mind splitting them to a preliminary patch, since they
> really aren't related to the concept of *this* patch?

I don't think so, that prepatch will have only two lines of renames
from sriov_dev_attrs to be sriov_pf_dev_attrs. It is not worth the
hassle.

Thanks

> 
> >  	&dev_attr_sriov_totalvfs.attr,
> >  	&dev_attr_sriov_numvfs.attr,
> >  	&dev_attr_sriov_offset.attr,
> >  	&dev_attr_sriov_stride.attr,
> >  	&dev_attr_sriov_vf_device.attr,
> >  	&dev_attr_sriov_drivers_autoprobe.attr,
> > +#ifdef CONFIG_PCI_MSI
> > +	&dev_attr_sriov_vf_total_msix.attr,
> > +#endif
> >  	NULL,
> >  };
> > 
> > -static umode_t sriov_attrs_are_visible(struct kobject *kobj,
> > -				       struct attribute *a, int n)
> > +static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
> > +					  struct attribute *a, int n)
> >  {
> >  	struct device *dev = kobj_to_dev(kobj);
> > 
> > @@ -421,9 +511,9 @@ static umode_t sriov_attrs_are_visible(struct kobject *kobj,
> >  	return a->mode;
> >  }
> > 
> > -const struct attribute_group sriov_dev_attr_group = {
> > -	.attrs = sriov_dev_attrs,
> > -	.is_visible = sriov_attrs_are_visible,
> > +const struct attribute_group sriov_pf_dev_attr_group = {
> > +	.attrs = sriov_pf_dev_attrs,
> > +	.is_visible = sriov_pf_attrs_are_visible,
> >  };

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count
  2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
                   ` (5 preceding siblings ...)
  2021-04-03  0:25 ` Bjorn Helgaas
@ 2021-04-04  7:33 ` Leon Romanovsky
  6 siblings, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-04-04  7:33 UTC (permalink / raw)
  To: Bjorn Helgaas, Saeed Mahameed
  Cc: Jason Gunthorpe, Alexander Duyck, Jakub Kicinski, linux-pci,
	linux-rdma, netdev, Don Dutile, Alex Williamson,
	David S . Miller, Greg Kroah-Hartman

On Sun, Mar 14, 2021 at 02:42:52PM +0200, Leon Romanovsky wrote:
> ---------------------------------------------------------------------------------
> Changelog
> v8:
>  * Added "physical/virtual function" words near PF and VF acronyms.
> v7: https://lore.kernel.org/linux-pci/20210301075524.441609-1-leon@kernel.org
>  * Rebase on top v5.12-rc1
>  * More english fixes
>  * Returned to static sysfs creation model as was implemented in v0/v1.

<...>

> Leon Romanovsky (4):
>   PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs
>   net/mlx5: Add dynamic MSI-X capabilities bits
>   net/mlx5: Dynamically assign MSI-X vectors count
>   net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks

applied to mlx5-next with changes asked by Bjorn.

e71b75f73763 net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
604774add516 net/mlx5: Dynamically assign MSI-X vectors count
0b989c1e3705 net/mlx5: Add dynamic MSI-X capabilities bits
c3d5c2d96d69 PCI/IOV: Add sysfs MSI-X vector assignment interface

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-04-04  7:33 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-14 12:42 [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-14 12:42 ` [PATCH mlx5-next v8 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
2021-04-03  0:24   ` Bjorn Helgaas
2021-04-04  7:02     ` Leon Romanovsky
2021-03-14 12:42 ` [PATCH mlx5-next v8 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-03-14 12:42 ` [PATCH mlx5-next v8 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-14 12:42 ` [PATCH mlx5-next v8 4/4] net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks Leon Romanovsky
2021-03-20  9:13 ` [PATCH mlx5-next v8 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-25  7:57   ` Leon Romanovsky
2021-04-03  0:25 ` Bjorn Helgaas
2021-04-04  7:33 ` Leon Romanovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).