* [pull request][net-next 00/14] mlx5 updates 2023-07-24
@ 2023-07-24 22:44 Saeed Mahameed
  2023-07-24 22:44 ` [net-next 01/14] net/mlx5: Expose port.c/mlx5_query_module_num() function Saeed Mahameed
                   ` (13 more replies)
  0 siblings, 14 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan

From: Saeed Mahameed <saeedm@nvidia.com>

This series adds miscellaneous updates.
For more information, please see the tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.


The following changes since commit dc644b540a2d2874112706591234be3d3fbf9ef7:

  tcx: Fix splat in ingress_destroy upon tcx_entry_free (2023-07-24 11:42:35 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2023-07-24

for you to fetch changes up to 67d648e27188088a28cb7ea67208b9e174af1bf3:

  net/mlx5: Remove pointless devlink_rate checks (2023-07-24 15:34:06 -0700)

----------------------------------------------------------------
mlx5-updates-2023-07-24

1) Replace the current thermal implementation with the hwmon API.

2) Generalize the devcom implementation to be independent of the number
   of ports or the device's GUID.

3) Save memory on command interface statistics.

4) General code cleanups.

----------------------------------------------------------------
Adham Faris (2):
      net/mlx5: Expose port.c/mlx5_query_module_num() function
      net/mlx5: Expose NIC temperature via hardware monitoring kernel API

Jiri Pirko (2):
      net/mlx5: Don't check vport->enabled in port ops
      net/mlx5: Remove pointless devlink_rate checks

Parav Pandit (2):
      net/mlx5e: Remove duplicate code for user flow
      net/mlx5e: Make flow classification filters static

Roi Dayan (4):
      net/mlx5: Use shared code for checking lag is supported
      net/mlx5: Devcom, Infrastructure changes
      net/mlx5e: E-Switch, Register devcom device with switch id key
      net/mlx5e: E-Switch, Allow devcom initialization on more vports

Shay Drory (4):
      net/mlx5: Re-organize mlx5_cmd struct
      net/mlx5: Remove redundant cmdif revision check
      net/mlx5: split mlx5_cmd_init() to probe and reload routines
      net/mlx5: Allocate command stats with xarray

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c      | 223 +++++-----
 drivers/net/ethernet/mellanox/mlx5/core/debugfs.c  |  34 +-
 drivers/net/ethernet/mellanox/mlx5/core/dev.c      |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   3 -
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   6 +-
 .../ethernet/mellanox/mlx5/core/en_fs_ethtool.c    |   4 -
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  21 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  45 +--
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.c   |  22 +-
 .../ethernet/mellanox/mlx5/core/esw/bridge_mcast.c |  17 +-
 .../ethernet/mellanox/mlx5/core/esw/devlink_port.c |  12 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |   7 +-
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |  79 ++--
 drivers/net/ethernet/mellanox/mlx5/core/hwmon.c    | 428 ++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/hwmon.h    |  24 ++
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c  |  12 +-
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h  |  12 +-
 .../net/ethernet/mellanox/mlx5/core/lib/devcom.c   | 448 +++++++++++----------
 .../net/ethernet/mellanox/mlx5/core/lib/devcom.h   |  74 ++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  35 +-
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/thermal.c  | 114 ------
 drivers/net/ethernet/mellanox/mlx5/core/thermal.h  |  20 -
 include/linux/mlx5/driver.h                        |  30 +-
 include/linux/mlx5/mlx5_ifc.h                      |  14 +-
 27 files changed, 1039 insertions(+), 658 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/thermal.c
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/thermal.h


* [net-next 01/14] net/mlx5: Expose port.c/mlx5_query_module_num() function
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API Saeed Mahameed
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris, Gal Pressman

From: Adham Faris <afaris@nvidia.com>

Make mlx5_query_module_num(), defined in port.c, non-static so it can
be used by other files.

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/port.c      | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
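
[ Not part of the patch: a minimal illustrative sketch of how a caller
  elsewhere in the driver can use the now-exported helper; the hwmon
  code added in the next patch calls it exactly this way.
  example_get_module() is a hypothetical wrapper, only
  mlx5_query_module_num() is real. ]

/* Illustrative only: look up the module (cage) index for this device
 * through the helper exported by this patch.
 */
static int example_get_module(struct mlx5_core_dev *mdev)
{
	int module_num;
	int err;

	err = mlx5_query_module_num(mdev, &module_num);
	if (err)
		return err;

	return module_num;
}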

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index c4be257c043d..6cebc8417282 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -176,6 +176,7 @@ static inline int mlx5_flexible_inlen(struct mlx5_core_dev *dev, size_t fixed,
 
 int mlx5_query_hca_caps(struct mlx5_core_dev *dev);
 int mlx5_query_board_id(struct mlx5_core_dev *dev);
+int mlx5_query_module_num(struct mlx5_core_dev *dev, int *module_num);
 int mlx5_cmd_init(struct mlx5_core_dev *dev);
 void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
 void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 0daeb4b72cca..be70d1f23a5d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -271,7 +271,7 @@ void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, u16 *oper_mtu,
 }
 EXPORT_SYMBOL_GPL(mlx5_query_port_oper_mtu);
 
-static int mlx5_query_module_num(struct mlx5_core_dev *dev, int *module_num)
+int mlx5_query_module_num(struct mlx5_core_dev *dev, int *module_num)
 {
 	u32 in[MLX5_ST_SZ_DW(pmlp_reg)] = {0};
 	u32 out[MLX5_ST_SZ_DW(pmlp_reg)];
-- 
2.41.0



* [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
  2023-07-24 22:44 ` [net-next 01/14] net/mlx5: Expose port.c/mlx5_query_module_num() function Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-26  3:31   ` Jakub Kicinski
  2023-07-24 22:44 ` [net-next 03/14] net/mlx5: Use shared code for checking lag is supported Saeed Mahameed
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Adham Faris, Gal Pressman

From: Adham Faris <afaris@nvidia.com>

Expose the NIC temperature by implementing the hwmon kernel API, which
makes the current thermal zone kernel API redundant.

For each one of the supported and exposed thermal diode sensors, expose
the following attributes:
1) Input temperature.
2) Highest temperature.
3) Temperature label.
4) Temperature critical max value:
   refers to the high threshold of the Warning Event. It is exposed as
   the `tempY_crit` hwmon attribute (RO attribute). For example, for
   ConnectX5 HCAs this value is 105 Celsius, 10 degrees lower than the
   HW shutdown temperature.
5) Temperature reset history: resets highest temperature.

For example, a dual-port ConnectX5 NIC with a single IC thermal diode
sensor will have 2 hwmon directories (one for each PCI function)
under "/sys/class/hwmon/hwmon[X,Y]".

Listing one of the directories above (hwmonX/Y) generates the
corresponding output below:

$ grep -H -d skip . /sys/class/hwmon/hwmon0/*

Output
=======================================================================
/sys/class/hwmon/hwmon0/name:0000:08:00.0
/sys/class/hwmon/hwmon0/temp1_crit:105000
/sys/class/hwmon/hwmon0/temp1_highest:68000
/sys/class/hwmon/hwmon0/temp1_input:68000
/sys/class/hwmon/hwmon0/temp1_label:sensor0
grep: /sys/class/hwmon/hwmon0/temp1_reset_history: Permission denied

Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/hwmon.c   | 428 ++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/hwmon.h   |  24 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |   8 +-
 .../net/ethernet/mellanox/mlx5/core/thermal.c | 114 -----
 .../net/ethernet/mellanox/mlx5/core/thermal.h |  20 -
 include/linux/mlx5/driver.h                   |   3 +-
 include/linux/mlx5/mlx5_ifc.h                 |  14 +-
 8 files changed, 472 insertions(+), 141 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/thermal.c
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/thermal.h
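
[ Not part of the patch: a quick worked example of the unit conversion,
  assuming a hypothetical raw MTMP reading of 544. MTMP reports
  temperature in 0.125 degree Celsius steps while hwmon expects
  millidegrees, so the driver multiplies by 125; 544 * 125 = 68000,
  which matches the temp1_input/temp1_highest values shown above. ]

/* Conversion used by hwmon.c below; e.g. raw 544 -> 68000 mC (68.0 C). */
#define mtmp_temp_to_mdeg(temp) ((temp) * 125)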

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 35f00700a4d6..fddb88c000ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -78,7 +78,7 @@ mlx5_core-$(CONFIG_MLX5_ESWITCH)   += esw/acl/helper.o \
 mlx5_core-$(CONFIG_MLX5_BRIDGE)    += esw/bridge.o esw/bridge_mcast.o esw/bridge_debugfs.o \
 				      en/rep/bridge.o
 
-mlx5_core-$(CONFIG_THERMAL)        += thermal.o
+mlx5_core-$(CONFIG_HWMON)          += hwmon.o
 mlx5_core-$(CONFIG_MLX5_MPFS)      += lib/mpfs.o
 mlx5_core-$(CONFIG_VXLAN)          += lib/vxlan.o
 mlx5_core-$(CONFIG_PTP_1588_CLOCK) += lib/clock.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
new file mode 100644
index 000000000000..7f27bb62a1d5
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
@@ -0,0 +1,428 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+
+#include <linux/hwmon.h>
+#include <linux/bitmap.h>
+#include <linux/mlx5/device.h>
+#include <linux/mlx5/mlx5_ifc.h>
+#include <linux/mlx5/port.h>
+#include "mlx5_core.h"
+#include "hwmon.h"
+
+#define CHANNELS_TYPE_NUM 2 /* chip channel and temp channel */
+#define CHIP_CONFIG_NUM 1
+
+/* module 0 is mapped to sensor_index 64 in MTMP register */
+#define to_mtmp_module_sensor_idx(idx) (64 + (idx))
+
+/* All temperatures retrieved in units of 0.125C. hwmon framework expect
+ * it in units of millidegrees C. Hence multiply values by 125.
+ */
+#define mtmp_temp_to_mdeg(temp) ((temp) * 125)
+
+struct temp_channel_desc {
+	u32 sensor_index;
+	char sensor_name[32];
+};
+
+/* chip_channel_config and channel_info arrays must be 0-terminated, hence + 1 */
+struct mlx5_hwmon {
+	struct mlx5_core_dev *mdev;
+	struct device *hwmon_dev;
+	struct hwmon_channel_info chip_info;
+	u32 chip_channel_config[CHIP_CONFIG_NUM + 1];
+	struct hwmon_channel_info temp_info;
+	u32 *temp_channel_config;
+	const struct hwmon_channel_info *channel_info[CHANNELS_TYPE_NUM + 1];
+	struct hwmon_chip_info chip;
+	const char *name;
+	struct temp_channel_desc *temp_channel_desc;
+	u32 asic_platform_scount;
+	u32 module_scount;
+};
+
+static int mlx5_hwmon_query_mtmp(struct mlx5_core_dev *mdev, u32 sensor_index, u32 *mtmp_out)
+{
+	u32 mtmp_in[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+
+	MLX5_SET(mtmp_reg, mtmp_in, sensor_index, sensor_index);
+
+	return mlx5_core_access_reg(mdev, mtmp_in,  sizeof(mtmp_in),
+				    mtmp_out, MLX5_ST_SZ_BYTES(mtmp_reg),
+				    MLX5_REG_MTMP, 0, 0);
+}
+
+static int mlx5_hwmon_reset_max_temp(struct mlx5_core_dev *mdev, int sensor_index)
+{
+	u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+	u32 mtmp_in[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+
+	MLX5_SET(mtmp_reg, mtmp_in, sensor_index, sensor_index);
+	MLX5_SET(mtmp_reg, mtmp_in, mtr, 1);
+
+	return mlx5_core_access_reg(mdev, mtmp_in,  sizeof(mtmp_in),
+				    mtmp_out, sizeof(mtmp_out),
+				    MLX5_REG_MTMP, 0, 0);
+}
+
+static int mlx5_hwmon_enable_max_temp(struct mlx5_core_dev *mdev, int sensor_index)
+{
+	u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+	u32 mtmp_in[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+	int err;
+
+	err = mlx5_hwmon_query_mtmp(mdev, sensor_index, mtmp_in);
+	if (err)
+		return err;
+
+	MLX5_SET(mtmp_reg, mtmp_in, mte, 1);
+	return mlx5_core_access_reg(mdev, mtmp_in,  sizeof(mtmp_in),
+				    mtmp_out, sizeof(mtmp_out),
+				    MLX5_REG_MTMP, 0, 1);
+}
+
+static int mlx5_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+			   int channel, long *val)
+{
+	struct mlx5_hwmon *hwmon = dev_get_drvdata(dev);
+	u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+	int err;
+
+	if (type != hwmon_temp)
+		return -EOPNOTSUPP;
+
+	err = mlx5_hwmon_query_mtmp(hwmon->mdev, hwmon->temp_channel_desc[channel].sensor_index,
+				    mtmp_out);
+	if (err)
+		return err;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		*val = mtmp_temp_to_mdeg(MLX5_GET(mtmp_reg, mtmp_out, temperature));
+		return 0;
+	case hwmon_temp_highest:
+		*val = mtmp_temp_to_mdeg(MLX5_GET(mtmp_reg, mtmp_out, max_temperature));
+		return 0;
+	case hwmon_temp_crit:
+		*val = mtmp_temp_to_mdeg(MLX5_GET(mtmp_reg, mtmp_out, temp_threshold_hi));
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int mlx5_hwmon_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+			    int channel, long val)
+{
+	struct mlx5_hwmon *hwmon = dev_get_drvdata(dev);
+
+	if (type != hwmon_temp || attr != hwmon_temp_reset_history)
+		return -EOPNOTSUPP;
+
+	return mlx5_hwmon_reset_max_temp(hwmon->mdev,
+				hwmon->temp_channel_desc[channel].sensor_index);
+}
+
+static umode_t mlx5_hwmon_is_visible(const void *data, enum hwmon_sensor_types type, u32 attr,
+				     int channel)
+{
+	if (type != hwmon_temp)
+		return 0;
+
+	switch (attr) {
+	case hwmon_temp_input:
+	case hwmon_temp_highest:
+	case hwmon_temp_crit:
+	case hwmon_temp_label:
+		return 0444;
+	case hwmon_temp_reset_history:
+		return 0200;
+	default:
+		return 0;
+	}
+}
+
+static int mlx5_hwmon_read_string(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+				  int channel, const char **str)
+{
+	struct mlx5_hwmon *hwmon = dev_get_drvdata(dev);
+
+	if (type != hwmon_temp || attr != hwmon_temp_label)
+		return -EOPNOTSUPP;
+
+	*str = (const char *)hwmon->temp_channel_desc[channel].sensor_name;
+	return 0;
+}
+
+static const struct hwmon_ops mlx5_hwmon_ops = {
+	.read = mlx5_hwmon_read,
+	.read_string = mlx5_hwmon_read_string,
+	.is_visible = mlx5_hwmon_is_visible,
+	.write = mlx5_hwmon_write,
+};
+
+static int mlx5_hwmon_init_channels_names(struct mlx5_hwmon *hwmon)
+{
+	u32 i;
+
+	for (i = 0; i < hwmon->asic_platform_scount + hwmon->module_scount; i++) {
+		u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)] = {};
+		char *sensor_name;
+		int err;
+
+		err = mlx5_hwmon_query_mtmp(hwmon->mdev, hwmon->temp_channel_desc[i].sensor_index,
+					    mtmp_out);
+		if (err)
+			return err;
+
+		sensor_name = MLX5_ADDR_OF(mtmp_reg, mtmp_out, sensor_name_hi);
+		if (!*sensor_name) {
+			snprintf(hwmon->temp_channel_desc[i].sensor_name,
+				 sizeof(hwmon->temp_channel_desc[i].sensor_name), "sensor%u",
+				 hwmon->temp_channel_desc[i].sensor_index);
+			continue;
+		}
+
+		memcpy(&hwmon->temp_channel_desc[i].sensor_name, sensor_name,
+		       MLX5_FLD_SZ_BYTES(mtmp_reg, sensor_name_hi) +
+		       MLX5_FLD_SZ_BYTES(mtmp_reg, sensor_name_lo));
+	}
+
+	return 0;
+}
+
+static int mlx5_hwmon_get_module_sensor_index(struct mlx5_core_dev *mdev, u32 *module_index)
+{
+	int module_num;
+	int err;
+
+	err = mlx5_query_module_num(mdev, &module_num);
+	if (err)
+		return err;
+
+	*module_index = to_mtmp_module_sensor_idx(module_num);
+
+	return 0;
+}
+
+static int mlx5_hwmon_init_sensors_indexes(struct mlx5_hwmon *hwmon, u64 sensor_map)
+{
+	DECLARE_BITMAP(smap, BITS_PER_TYPE(sensor_map));
+	unsigned long bit_pos;
+	int err = 0;
+	int i = 0;
+
+	bitmap_from_u64(smap, sensor_map);
+
+	for_each_set_bit(bit_pos, smap, BITS_PER_TYPE(sensor_map)) {
+		hwmon->temp_channel_desc[i].sensor_index = bit_pos;
+		i++;
+	}
+
+	if (hwmon->module_scount)
+		err = mlx5_hwmon_get_module_sensor_index(hwmon->mdev,
+							 &hwmon->temp_channel_desc[i].sensor_index);
+
+	return err;
+}
+
+static void mlx5_hwmon_channel_info_init(struct mlx5_hwmon *hwmon)
+{
+	int i;
+
+	hwmon->channel_info[0] = &hwmon->chip_info;
+	hwmon->channel_info[1] = &hwmon->temp_info;
+
+	hwmon->chip_channel_config[0] = HWMON_C_REGISTER_TZ;
+	hwmon->chip_info.config = (const u32 *)hwmon->chip_channel_config;
+	hwmon->chip_info.type = hwmon_chip;
+
+	for (i = 0; i < hwmon->asic_platform_scount + hwmon->module_scount; i++)
+		hwmon->temp_channel_config[i] = HWMON_T_INPUT | HWMON_T_HIGHEST | HWMON_T_CRIT |
+					     HWMON_T_RESET_HISTORY | HWMON_T_LABEL;
+
+	hwmon->temp_info.config = (const u32 *)hwmon->temp_channel_config;
+	hwmon->temp_info.type = hwmon_temp;
+}
+
+static int mlx5_hwmon_is_module_mon_cap(struct mlx5_core_dev *mdev, bool *mon_cap)
+{
+	u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)];
+	u32 module_index;
+	int err;
+
+	err = mlx5_hwmon_get_module_sensor_index(mdev, &module_index);
+	if (err)
+		return err;
+
+	err = mlx5_hwmon_query_mtmp(mdev, module_index, mtmp_out);
+	if (err)
+		return err;
+
+	if (MLX5_GET(mtmp_reg, mtmp_out, temperature))
+		*mon_cap = true;
+
+	return 0;
+}
+
+static int mlx5_hwmon_get_sensors_count(struct mlx5_core_dev *mdev, u32 *asic_platform_scount)
+{
+	u32 mtcap_out[MLX5_ST_SZ_DW(mtcap_reg)] = {};
+	u32 mtcap_in[MLX5_ST_SZ_DW(mtcap_reg)] = {};
+	int err;
+
+	err = mlx5_core_access_reg(mdev, mtcap_in,  sizeof(mtcap_in),
+				   mtcap_out, sizeof(mtcap_out),
+				   MLX5_REG_MTCAP, 0, 0);
+	if (err)
+		return err;
+
+	*asic_platform_scount = MLX5_GET(mtcap_reg, mtcap_out, sensor_count);
+
+	return 0;
+}
+
+static void mlx5_hwmon_free(struct mlx5_hwmon *hwmon)
+{
+	if (!hwmon)
+		return;
+
+	kfree(hwmon->temp_channel_config);
+	kfree(hwmon->temp_channel_desc);
+	kfree(hwmon->name);
+	kfree(hwmon);
+}
+
+static struct mlx5_hwmon *mlx5_hwmon_alloc(struct mlx5_core_dev *mdev)
+{
+	struct mlx5_hwmon *hwmon;
+	bool mon_cap = false;
+	u32 sensors_count;
+	int err;
+
+	hwmon = kzalloc(sizeof(*mdev->hwmon), GFP_KERNEL);
+	if (!hwmon)
+		return ERR_PTR(-ENOMEM);
+
+	hwmon->name = hwmon_sanitize_name(pci_name(mdev->pdev));
+	if (IS_ERR(hwmon->name)) {
+		err = PTR_ERR(hwmon->name);
+		goto err_free_hwmon;
+	}
+
+	err = mlx5_hwmon_get_sensors_count(mdev, &hwmon->asic_platform_scount);
+	if (err)
+		goto err_free_name;
+
+	/* check if module sensor has thermal mon cap. if yes, allocate channel desc for it */
+	err = mlx5_hwmon_is_module_mon_cap(mdev, &mon_cap);
+	if (err)
+		goto err_free_name;
+
+	hwmon->module_scount = mon_cap ? 1 : 0;
+	sensors_count = hwmon->asic_platform_scount + hwmon->module_scount;
+	hwmon->temp_channel_desc = kcalloc(sensors_count, sizeof(*hwmon->temp_channel_desc),
+					   GFP_KERNEL);
+	if (!hwmon->temp_channel_desc) {
+		err = -ENOMEM;
+		goto err_free_name;
+	}
+
+	/* sensors configuration values array, must be 0-terminated hence, + 1 */
+	hwmon->temp_channel_config = kcalloc(sensors_count + 1, sizeof(*hwmon->temp_channel_config),
+					     GFP_KERNEL);
+	if (!hwmon->temp_channel_config) {
+		err = -ENOMEM;
+		goto err_free_temp_channel_desc;
+	}
+
+	hwmon->mdev = mdev;
+
+	return hwmon;
+
+err_free_temp_channel_desc:
+	kfree(hwmon->temp_channel_desc);
+err_free_name:
+	kfree(hwmon->name);
+err_free_hwmon:
+	kfree(hwmon);
+	return ERR_PTR(err);
+}
+
+static int mlx5_hwmon_dev_init(struct mlx5_hwmon *hwmon)
+{
+	u32 mtcap_out[MLX5_ST_SZ_DW(mtcap_reg)] = {};
+	u32 mtcap_in[MLX5_ST_SZ_DW(mtcap_reg)] = {};
+	int err;
+	int i;
+
+	err =  mlx5_core_access_reg(hwmon->mdev, mtcap_in,  sizeof(mtcap_in),
+				    mtcap_out, sizeof(mtcap_out),
+				    MLX5_REG_MTCAP, 0, 0);
+	if (err)
+		return err;
+
+	mlx5_hwmon_channel_info_init(hwmon);
+	mlx5_hwmon_init_sensors_indexes(hwmon, MLX5_GET64(mtcap_reg, mtcap_out, sensor_map));
+	err = mlx5_hwmon_init_channels_names(hwmon);
+	if (err)
+		return err;
+
+	for (i = 0; i < hwmon->asic_platform_scount + hwmon->module_scount; i++) {
+		err = mlx5_hwmon_enable_max_temp(hwmon->mdev,
+						 hwmon->temp_channel_desc[i].sensor_index);
+		if (err)
+			return err;
+	}
+
+	hwmon->chip.ops = &mlx5_hwmon_ops;
+	hwmon->chip.info = (const struct hwmon_channel_info **)hwmon->channel_info;
+
+	return 0;
+}
+
+int mlx5_hwmon_dev_register(struct mlx5_core_dev *mdev)
+{
+	struct device *dev = mdev->device;
+	struct mlx5_hwmon *hwmon;
+	int err;
+
+	if (!MLX5_CAP_MCAM_REG(mdev, mtmp))
+		return 0;
+
+	hwmon = mlx5_hwmon_alloc(mdev);
+	if (IS_ERR(hwmon))
+		return PTR_ERR(hwmon);
+
+	err = mlx5_hwmon_dev_init(hwmon);
+	if (err)
+		goto err_free_hwmon;
+
+	hwmon->hwmon_dev = hwmon_device_register_with_info(dev, hwmon->name,
+							   hwmon,
+							   &hwmon->chip,
+							   NULL);
+	if (IS_ERR(hwmon->hwmon_dev)) {
+		err = PTR_ERR(hwmon->hwmon_dev);
+		goto err_free_hwmon;
+	}
+
+	mdev->hwmon = hwmon;
+	return 0;
+
+err_free_hwmon:
+	mlx5_hwmon_free(hwmon);
+	return err;
+}
+
+void mlx5_hwmon_dev_unregister(struct mlx5_core_dev *mdev)
+{
+	struct mlx5_hwmon *hwmon = mdev->hwmon;
+
+	if (!hwmon)
+		return;
+
+	hwmon_device_unregister(hwmon->hwmon_dev);
+	mlx5_hwmon_free(hwmon);
+	mdev->hwmon = NULL;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
new file mode 100644
index 000000000000..999654a9b9da
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+ * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+#ifndef __MLX5_HWMON_H__
+#define __MLX5_HWMON_H__
+
+#include <linux/mlx5/driver.h>
+
+#if IS_ENABLED(CONFIG_HWMON)
+
+int mlx5_hwmon_dev_register(struct mlx5_core_dev *mdev);
+void mlx5_hwmon_dev_unregister(struct mlx5_core_dev *mdev);
+
+#else
+static inline int mlx5_hwmon_dev_register(struct mlx5_core_dev *mdev)
+{
+	return 0;
+}
+
+static inline void mlx5_hwmon_dev_unregister(struct mlx5_core_dev *mdev) {}
+
+#endif
+
+#endif /* __MLX5_HWMON_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 88dbea6631d5..865d028b8abd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -49,7 +49,6 @@
 #include <linux/version.h>
 #include <net/devlink.h>
 #include "mlx5_core.h"
-#include "thermal.h"
 #include "lib/eq.h"
 #include "fs_core.h"
 #include "lib/mpfs.h"
@@ -73,6 +72,7 @@
 #include "sf/dev/dev.h"
 #include "sf/sf.h"
 #include "mlx5_irq.h"
+#include "hwmon.h"
 
 MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
 MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver");
@@ -1920,9 +1920,9 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (err)
 		dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code %d\n", err);
 
-	err = mlx5_thermal_init(dev);
+	err = mlx5_hwmon_dev_register(dev);
 	if (err)
-		dev_err(&pdev->dev, "mlx5_thermal_init failed with error code %d\n", err);
+		mlx5_core_err(dev, "mlx5_hwmon_dev_register failed with error code %d\n", err);
 
 	pci_save_state(pdev);
 	devlink_register(devlink);
@@ -1954,7 +1954,7 @@ static void remove_one(struct pci_dev *pdev)
 	mlx5_drain_health_wq(dev);
 	devlink_unregister(devlink);
 	mlx5_sriov_disable(pdev, false);
-	mlx5_thermal_uninit(dev);
+	mlx5_hwmon_dev_unregister(dev);
 	mlx5_crdump_disable(dev);
 	mlx5_uninit_one(dev);
 	mlx5_pci_close(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c
deleted file mode 100644
index 52199d39657e..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c
+++ /dev/null
@@ -1,114 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
-// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.
-
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/device.h>
-#include <linux/thermal.h>
-#include <linux/err.h>
-#include <linux/mlx5/driver.h>
-#include "mlx5_core.h"
-#include "thermal.h"
-
-#define MLX5_THERMAL_POLL_INT_MSEC	1000
-#define MLX5_THERMAL_NUM_TRIPS		0
-#define MLX5_THERMAL_ASIC_SENSOR_INDEX	0
-
-/* Bit string indicating the writeablility of trip points if any */
-#define MLX5_THERMAL_TRIP_MASK	(BIT(MLX5_THERMAL_NUM_TRIPS) - 1)
-
-struct mlx5_thermal {
-	struct mlx5_core_dev *mdev;
-	struct thermal_zone_device *tzdev;
-};
-
-static int mlx5_thermal_get_mtmp_temp(struct mlx5_core_dev *mdev, u32 id, int *p_temp)
-{
-	u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)] = {};
-	u32 mtmp_in[MLX5_ST_SZ_DW(mtmp_reg)] = {};
-	int err;
-
-	MLX5_SET(mtmp_reg, mtmp_in, sensor_index, id);
-
-	err = mlx5_core_access_reg(mdev, mtmp_in,  sizeof(mtmp_in),
-				   mtmp_out, sizeof(mtmp_out),
-				   MLX5_REG_MTMP, 0, 0);
-
-	if (err)
-		return err;
-
-	*p_temp = MLX5_GET(mtmp_reg, mtmp_out, temperature);
-
-	return 0;
-}
-
-static int mlx5_thermal_get_temp(struct thermal_zone_device *tzdev,
-				 int *p_temp)
-{
-	struct mlx5_thermal *thermal = thermal_zone_device_priv(tzdev);
-	struct mlx5_core_dev *mdev = thermal->mdev;
-	int err;
-
-	err = mlx5_thermal_get_mtmp_temp(mdev, MLX5_THERMAL_ASIC_SENSOR_INDEX, p_temp);
-
-	if (err)
-		return err;
-
-	/* The unit of temp returned is in 0.125 C. The thermal
-	 * framework expects the value in 0.001 C.
-	 */
-	*p_temp *= 125;
-
-	return 0;
-}
-
-static struct thermal_zone_device_ops mlx5_thermal_ops = {
-	.get_temp = mlx5_thermal_get_temp,
-};
-
-int mlx5_thermal_init(struct mlx5_core_dev *mdev)
-{
-	char data[THERMAL_NAME_LENGTH];
-	struct mlx5_thermal *thermal;
-	int err;
-
-	if (!mlx5_core_is_pf(mdev) && !mlx5_core_is_ecpf(mdev))
-		return 0;
-
-	err = snprintf(data, sizeof(data), "mlx5_%s", dev_name(mdev->device));
-	if (err < 0 || err >= sizeof(data)) {
-		mlx5_core_err(mdev, "Failed to setup thermal zone name, %d\n", err);
-		return -EINVAL;
-	}
-
-	thermal = kzalloc(sizeof(*thermal), GFP_KERNEL);
-	if (!thermal)
-		return -ENOMEM;
-
-	thermal->mdev = mdev;
-	thermal->tzdev = thermal_zone_device_register_with_trips(data,
-								 NULL,
-								 MLX5_THERMAL_NUM_TRIPS,
-								 MLX5_THERMAL_TRIP_MASK,
-								 thermal,
-								 &mlx5_thermal_ops,
-								 NULL, 0, MLX5_THERMAL_POLL_INT_MSEC);
-	if (IS_ERR(thermal->tzdev)) {
-		err = PTR_ERR(thermal->tzdev);
-		mlx5_core_err(mdev, "Failed to register thermal zone device (%s) %d\n", data, err);
-		kfree(thermal);
-		return err;
-	}
-
-	mdev->thermal = thermal;
-	return 0;
-}
-
-void mlx5_thermal_uninit(struct mlx5_core_dev *mdev)
-{
-	if (!mdev->thermal)
-		return;
-
-	thermal_zone_device_unregister(mdev->thermal->tzdev);
-	kfree(mdev->thermal);
-}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/thermal.h b/drivers/net/ethernet/mellanox/mlx5/core/thermal.h
deleted file mode 100644
index 7d752c122192..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/thermal.h
+++ /dev/null
@@ -1,20 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
- * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.
- */
-#ifndef __MLX5_THERMAL_DRIVER_H
-#define __MLX5_THERMAL_DRIVER_H
-
-#if IS_ENABLED(CONFIG_THERMAL)
-int mlx5_thermal_init(struct mlx5_core_dev *mdev);
-void mlx5_thermal_uninit(struct mlx5_core_dev *mdev);
-#else
-static inline int mlx5_thermal_init(struct mlx5_core_dev *mdev)
-{
-	mdev->thermal = NULL;
-	return 0;
-}
-
-static inline void mlx5_thermal_uninit(struct mlx5_core_dev *mdev) { }
-#endif
-
-#endif /* __MLX5_THERMAL_DRIVER_H */
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 25d0528f9219..7cb1520a27d6 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -134,6 +134,7 @@ enum {
 	MLX5_REG_PCAM		 = 0x507f,
 	MLX5_REG_NODE_DESC	 = 0x6001,
 	MLX5_REG_HOST_ENDIANNESS = 0x7004,
+	MLX5_REG_MTCAP		 = 0x9009,
 	MLX5_REG_MTMP		 = 0x900A,
 	MLX5_REG_MCIA		 = 0x9014,
 	MLX5_REG_MFRL		 = 0x9028,
@@ -804,7 +805,7 @@ struct mlx5_core_dev {
 	struct mlx5_rsc_dump    *rsc_dump;
 	u32                      vsc_addr;
 	struct mlx5_hv_vhca	*hv_vhca;
-	struct mlx5_thermal	*thermal;
+	struct mlx5_hwmon	*hwmon;
 };
 
 struct mlx5_db {
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 33344a71c3e3..bd7c0c3d3a4f 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -10193,7 +10193,9 @@ struct mlx5_ifc_mcam_access_reg_bits {
 	u8         mrtc[0x1];
 	u8         regs_44_to_32[0xd];
 
-	u8         regs_31_to_0[0x20];
+	u8         regs_31_to_10[0x16];
+	u8         mtmp[0x1];
+	u8         regs_8_to_0[0x9];
 };
 
 struct mlx5_ifc_mcam_access_reg_bits1 {
@@ -10946,6 +10948,15 @@ struct mlx5_ifc_mrtc_reg_bits {
 	u8         time_l[0x20];
 };
 
+struct mlx5_ifc_mtcap_reg_bits {
+	u8         reserved_at_0[0x19];
+	u8         sensor_count[0x7];
+
+	u8         reserved_at_20[0x20];
+
+	u8         sensor_map[0x40];
+};
+
 struct mlx5_ifc_mtmp_reg_bits {
 	u8         reserved_at_0[0x14];
 	u8         sensor_index[0xc];
@@ -11033,6 +11044,7 @@ union mlx5_ifc_ports_control_registers_document_bits {
 	struct mlx5_ifc_mfrl_reg_bits mfrl_reg;
 	struct mlx5_ifc_mtutc_reg_bits mtutc_reg;
 	struct mlx5_ifc_mrtc_reg_bits mrtc_reg;
+	struct mlx5_ifc_mtcap_reg_bits mtcap_reg;
 	struct mlx5_ifc_mtmp_reg_bits mtmp_reg;
 	u8         reserved_at_0[0x60e0];
 };
-- 
2.41.0



* [net-next 03/14] net/mlx5: Use shared code for checking lag is supported
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
  2023-07-24 22:44 ` [net-next 01/14] net/mlx5: Expose port.c/mlx5_query_module_num() function Saeed Mahameed
  2023-07-24 22:44 ` [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 04/14] net/mlx5: Devcom, Infrastructure changes Saeed Mahameed
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Roi Dayan

From: Roi Dayan <roid@nvidia.com>

Move the shared function that checks whether lag is supported to the
lag header file.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c     |  6 ++----
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 10 ----------
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h | 12 ++++++++++--
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index edb06fb9bbc5..7909f378dc93 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -36,6 +36,7 @@
 #include <linux/mlx5/vport.h>
 #include "mlx5_core.h"
 #include "devlink.h"
+#include "lag/lag.h"
 
 /* intf dev list mutex */
 static DEFINE_MUTEX(mlx5_intf_mutex);
@@ -587,10 +588,7 @@ static int next_phys_dev_lag(struct device *dev, const void *data)
 	if (!mdev)
 		return 0;
 
-	if (!MLX5_CAP_GEN(mdev, vport_group_manager) ||
-	    !MLX5_CAP_GEN(mdev, lag_master) ||
-	    (MLX5_CAP_GEN(mdev, num_lag_ports) > MLX5_MAX_PORTS ||
-	     MLX5_CAP_GEN(mdev, num_lag_ports) <= 1))
+	if (!mlx5_lag_is_supported(mdev))
 		return 0;
 
 	return _next_phys_dev(mdev, data);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index f0a074b2fcdf..900a18883f28 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1268,16 +1268,6 @@ void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev)
 	mlx5_ldev_put(ldev);
 }
 
-bool mlx5_lag_is_supported(struct mlx5_core_dev *dev)
-{
-	if (!MLX5_CAP_GEN(dev, vport_group_manager) ||
-	    !MLX5_CAP_GEN(dev, lag_master) ||
-	    MLX5_CAP_GEN(dev, num_lag_ports) < 2 ||
-	    MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_MAX_PORTS)
-		return false;
-	return true;
-}
-
 void mlx5_lag_add_mdev(struct mlx5_core_dev *dev)
 {
 	int err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
index a061b1873e27..481e92f39fe6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
@@ -74,8 +74,6 @@ struct mlx5_lag {
 	struct lag_mpesw	  lag_mpesw;
 };
 
-bool mlx5_lag_is_supported(struct mlx5_core_dev *dev);
-
 static inline struct mlx5_lag *
 mlx5_lag_dev(struct mlx5_core_dev *dev)
 {
@@ -115,4 +113,14 @@ void mlx5_lag_remove_devices(struct mlx5_lag *ldev);
 int mlx5_deactivate_lag(struct mlx5_lag *ldev);
 void mlx5_lag_add_devices(struct mlx5_lag *ldev);
 
+static inline bool mlx5_lag_is_supported(struct mlx5_core_dev *dev)
+{
+	if (!MLX5_CAP_GEN(dev, vport_group_manager) ||
+	    !MLX5_CAP_GEN(dev, lag_master) ||
+	    MLX5_CAP_GEN(dev, num_lag_ports) < 2 ||
+	    MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_MAX_PORTS)
+		return false;
+	return true;
+}
+
 #endif /* __MLX5_LAG_H__ */
-- 
2.41.0



* [net-next 04/14] net/mlx5: Devcom, Infrastructure changes
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 03/14] net/mlx5: Use shared code for checking lag is supported Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 05/14] net/mlx5e: E-Switch, Register devcom device with switch id key Saeed Mahameed
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Roi Dayan, Eli Cohen, Shay Drory

From: Roi Dayan <roid@nvidia.com>

Update the devcom infrastructure to be more generic, without
depending on the max supported ports definition or a device GUID,
and more encapsulated so callers don't need to pass the registered
devcom component id on every event call.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  21 +-
 .../net/ethernet/mellanox/mlx5/core/en_tc.c   |  36 +-
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  |  22 +-
 .../mellanox/mlx5/core/esw/bridge_mcast.c     |  17 +-
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |   3 +
 .../mellanox/mlx5/core/eswitch_offloads.c     |  47 +-
 .../net/ethernet/mellanox/mlx5/core/lag/lag.c |   2 +-
 .../ethernet/mellanox/mlx5/core/lib/devcom.c  | 448 ++++++++++--------
 .../ethernet/mellanox/mlx5/core/lib/devcom.h  |  74 ++-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  12 +-
 include/linux/mlx5/driver.h                   |   4 +-
 11 files changed, 359 insertions(+), 327 deletions(-)
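
[ Not part of the patch: a hedged before/after sketch of the caller
  pattern this patch moves to. The component id and integer index are
  replaced by a per-component handle (esw->devcom), registered with the
  system image GUID as key, plus an opaque cursor; the en_rep.c,
  en_tc.c and bridge hunks below follow this shape.
  do_something_with_peer() and the example_* wrappers are placeholders. ]

/* Before: global devcom handle, component id passed on every call. */
static void example_old_iteration(struct mlx5_eswitch *esw)
{
	struct mlx5_devcom *devcom = esw->dev->priv.devcom;
	struct mlx5_eswitch *peer_esw;
	int i;

	if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
		return;
	mlx5_devcom_for_each_peer_entry(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
					peer_esw, i)
		do_something_with_peer(peer_esw);
	mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
}

/* After: per-component handle and opaque cursor, no component id. */
static void example_new_iteration(struct mlx5_eswitch *esw)
{
	struct mlx5_devcom_comp_dev *pos;
	struct mlx5_eswitch *peer_esw;

	if (!mlx5_devcom_for_each_peer_begin(esw->devcom))
		return;
	mlx5_devcom_for_each_peer_entry(esw->devcom, peer_esw, pos)
		do_something_with_peer(peer_esw);
	mlx5_devcom_for_each_peer_end(esw->devcom);
}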

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 152b62138450..ca4f57f5064f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -399,15 +399,13 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw,
 }
 
 static int mlx5e_sqs2vport_add_peers_rules(struct mlx5_eswitch *esw, struct mlx5_eswitch_rep *rep,
-					   struct mlx5_devcom *devcom,
 					   struct mlx5e_rep_sq *rep_sq, int i)
 {
-	struct mlx5_eswitch *peer_esw = NULL;
 	struct mlx5_flow_handle *flow_rule;
-	int tmp;
+	struct mlx5_devcom_comp_dev *tmp;
+	struct mlx5_eswitch *peer_esw;
 
-	mlx5_devcom_for_each_peer_entry(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
-					peer_esw, tmp) {
+	mlx5_devcom_for_each_peer_entry(esw->devcom, peer_esw, tmp) {
 		u16 peer_rule_idx = MLX5_CAP_GEN(peer_esw->dev, vhca_id);
 		struct mlx5e_rep_sq_peer *sq_peer;
 		int err;
@@ -443,7 +441,6 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
 	struct mlx5_flow_handle *flow_rule;
 	struct mlx5e_rep_priv *rpriv;
 	struct mlx5e_rep_sq *rep_sq;
-	struct mlx5_devcom *devcom;
 	bool devcom_locked = false;
 	int err;
 	int i;
@@ -451,10 +448,10 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
 	if (esw->mode != MLX5_ESWITCH_OFFLOADS)
 		return 0;
 
-	devcom = esw->dev->priv.devcom;
 	rpriv = mlx5e_rep_to_rep_priv(rep);
-	if (mlx5_devcom_comp_is_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS) &&
-	    mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
+
+	if (mlx5_devcom_comp_is_ready(esw->devcom) &&
+	    mlx5_devcom_for_each_peer_begin(esw->devcom))
 		devcom_locked = true;
 
 	for (i = 0; i < sqns_num; i++) {
@@ -477,7 +474,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
 
 		xa_init(&rep_sq->sq_peer);
 		if (devcom_locked) {
-			err = mlx5e_sqs2vport_add_peers_rules(esw, rep, devcom, rep_sq, i);
+			err = mlx5e_sqs2vport_add_peers_rules(esw, rep, rep_sq, i);
 			if (err) {
 				mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule);
 				xa_destroy(&rep_sq->sq_peer);
@@ -490,7 +487,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
 	}
 
 	if (devcom_locked)
-		mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+		mlx5_devcom_for_each_peer_end(esw->devcom);
 
 	return 0;
 
@@ -498,7 +495,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw,
 	mlx5e_sqs2vport_stop(esw, rep);
 
 	if (devcom_locked)
-		mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+		mlx5_devcom_for_each_peer_end(esw->devcom);
 
 	return err;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 8d0a3f69693e..22bc88620653 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1668,11 +1668,10 @@ int mlx5e_tc_query_route_vport(struct net_device *out_dev, struct net_device *ro
 {
 	struct mlx5e_priv *out_priv, *route_priv;
 	struct mlx5_core_dev *route_mdev;
-	struct mlx5_devcom *devcom;
+	struct mlx5_devcom_comp_dev *pos;
 	struct mlx5_eswitch *esw;
 	u16 vhca_id;
 	int err;
-	int i;
 
 	out_priv = netdev_priv(out_dev);
 	esw = out_priv->mdev->priv.eswitch;
@@ -1688,10 +1687,8 @@ int mlx5e_tc_query_route_vport(struct net_device *out_dev, struct net_device *ro
 		return err;
 
 	rcu_read_lock();
-	devcom = out_priv->mdev->priv.devcom;
 	err = -ENODEV;
-	mlx5_devcom_for_each_peer_entry_rcu(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
-					    esw, i) {
+	mlx5_devcom_for_each_peer_entry_rcu(esw->devcom, esw, pos) {
 		err = mlx5_eswitch_vhca_id_to_vport(esw, vhca_id, vport);
 		if (!err)
 			break;
@@ -2031,15 +2028,15 @@ static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
 			      struct mlx5e_tc_flow *flow)
 {
 	if (mlx5e_is_eswitch_flow(flow)) {
-		struct mlx5_devcom *devcom = flow->priv->mdev->priv.devcom;
+		struct mlx5_devcom_comp_dev *devcom = flow->priv->mdev->priv.eswitch->devcom;
 
-		if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS)) {
+		if (!mlx5_devcom_for_each_peer_begin(devcom)) {
 			mlx5e_tc_del_fdb_flow(priv, flow);
 			return;
 		}
 
 		mlx5e_tc_del_fdb_peers_flow(flow);
-		mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+		mlx5_devcom_for_each_peer_end(devcom);
 		mlx5e_tc_del_fdb_flow(priv, flow);
 	} else {
 		mlx5e_tc_del_nic_flow(priv, flow);
@@ -4216,8 +4213,7 @@ static bool is_peer_flow_needed(struct mlx5e_tc_flow *flow)
 		flow_flag_test(flow, INGRESS);
 	bool act_is_encap = !!(attr->action &
 			       MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT);
-	bool esw_paired = mlx5_devcom_comp_is_ready(esw_attr->in_mdev->priv.devcom,
-						    MLX5_DEVCOM_ESW_OFFLOADS);
+	bool esw_paired = mlx5_devcom_comp_is_ready(esw_attr->in_mdev->priv.eswitch->devcom);
 
 	if (!esw_paired)
 		return false;
@@ -4471,14 +4467,13 @@ mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
 		   struct net_device *filter_dev,
 		   struct mlx5e_tc_flow **__flow)
 {
-	struct mlx5_devcom *devcom = priv->mdev->priv.devcom;
+	struct mlx5_devcom_comp_dev *devcom = priv->mdev->priv.eswitch->devcom, *pos;
 	struct mlx5e_rep_priv *rpriv = priv->ppriv;
 	struct mlx5_eswitch_rep *in_rep = rpriv->rep;
 	struct mlx5_core_dev *in_mdev = priv->mdev;
 	struct mlx5_eswitch *peer_esw;
 	struct mlx5e_tc_flow *flow;
 	int err;
-	int i;
 
 	flow = __mlx5e_add_fdb_flow(priv, f, flow_flags, filter_dev, in_rep,
 				    in_mdev);
@@ -4490,27 +4485,25 @@ mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
 		return 0;
 	}
 
-	if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS)) {
+	if (!mlx5_devcom_for_each_peer_begin(devcom)) {
 		err = -ENODEV;
 		goto clean_flow;
 	}
 
-	mlx5_devcom_for_each_peer_entry(devcom,
-					MLX5_DEVCOM_ESW_OFFLOADS,
-					peer_esw, i) {
+	mlx5_devcom_for_each_peer_entry(devcom, peer_esw, pos) {
 		err = mlx5e_tc_add_fdb_peer_flow(f, flow, flow_flags, peer_esw);
 		if (err)
 			goto peer_clean;
 	}
 
-	mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+	mlx5_devcom_for_each_peer_end(devcom);
 
 	*__flow = flow;
 	return 0;
 
 peer_clean:
 	mlx5e_tc_del_fdb_peers_flow(flow);
-	mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+	mlx5_devcom_for_each_peer_end(devcom);
 clean_flow:
 	mlx5e_tc_del_fdb_flow(priv, flow);
 	return err;
@@ -4728,7 +4721,7 @@ int mlx5e_tc_fill_action_stats(struct mlx5e_priv *priv,
 int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 		       struct flow_cls_offload *f, unsigned long flags)
 {
-	struct mlx5_devcom *devcom = priv->mdev->priv.devcom;
+	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	struct rhashtable *tc_ht = get_tc_ht(priv, flags);
 	struct mlx5e_tc_flow *flow;
 	struct mlx5_fc *counter;
@@ -4764,7 +4757,7 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 	/* Under multipath it's possible for one rule to be currently
 	 * un-offloaded while the other rule is offloaded.
 	 */
-	if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
+	if (esw && !mlx5_devcom_for_each_peer_begin(esw->devcom))
 		goto out;
 
 	if (flow_flag_test(flow, DUP)) {
@@ -4795,7 +4788,8 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 	}
 
 no_peer_counter:
-	mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+	if (esw)
+		mlx5_devcom_for_each_peer_end(esw->devcom);
 out:
 	flow_stats_update(&f->stats, bytes, packets, 0, lastuse,
 			  FLOW_ACTION_HW_STATS_DELAYED);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index f4fe1daa4afd..e36294b7ade2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -652,30 +652,30 @@ mlx5_esw_bridge_ingress_flow_peer_create(u16 vport_num, u16 esw_owner_vhca_id,
 					 struct mlx5_esw_bridge_vlan *vlan, u32 counter_id,
 					 struct mlx5_esw_bridge *bridge)
 {
-	struct mlx5_devcom *devcom = bridge->br_offloads->esw->dev->priv.devcom;
+	struct mlx5_devcom_comp_dev *devcom = bridge->br_offloads->esw->devcom, *pos;
 	struct mlx5_eswitch *tmp, *peer_esw = NULL;
 	static struct mlx5_flow_handle *handle;
-	int i;
 
-	if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
+	if (!mlx5_devcom_for_each_peer_begin(devcom))
 		return ERR_PTR(-ENODEV);
 
-	mlx5_devcom_for_each_peer_entry(devcom,
-					MLX5_DEVCOM_ESW_OFFLOADS,
-					tmp, i) {
+	mlx5_devcom_for_each_peer_entry(devcom, tmp, pos) {
 		if (mlx5_esw_is_owner(tmp, vport_num, esw_owner_vhca_id)) {
 			peer_esw = tmp;
 			break;
 		}
 	}
+
 	if (!peer_esw) {
-		mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
-		return ERR_PTR(-ENODEV);
+		handle = ERR_PTR(-ENODEV);
+		goto out;
 	}
 
 	handle = mlx5_esw_bridge_ingress_flow_with_esw_create(vport_num, addr, vlan, counter_id,
 							      bridge, peer_esw);
-	mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+
+out:
+	mlx5_devcom_for_each_peer_end(devcom);
 	return handle;
 }
 
@@ -1391,8 +1391,8 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_ow
 						    mlx5_fc_id(counter), bridge);
 	if (IS_ERR(handle)) {
 		err = PTR_ERR(handle);
-		esw_warn(esw->dev, "Failed to create ingress flow(vport=%u,err=%d)\n",
-			 vport_num, err);
+		esw_warn(esw->dev, "Failed to create ingress flow(vport=%u,err=%d,peer=%d)\n",
+			 vport_num, err, peer);
 		goto err_ingress_flow_create;
 	}
 	entry->ingress_handle = handle;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
index 2455f8b93c1e..7a01714b3780 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
@@ -539,30 +539,29 @@ mlx5_esw_bridge_mcast_filter_flow_create(struct mlx5_esw_bridge_port *port)
 static struct mlx5_flow_handle *
 mlx5_esw_bridge_mcast_filter_flow_peer_create(struct mlx5_esw_bridge_port *port)
 {
-	struct mlx5_devcom *devcom = port->bridge->br_offloads->esw->dev->priv.devcom;
+	struct mlx5_devcom_comp_dev *devcom = port->bridge->br_offloads->esw->devcom, *pos;
 	struct mlx5_eswitch *tmp, *peer_esw = NULL;
 	static struct mlx5_flow_handle *handle;
-	int i;
 
-	if (!mlx5_devcom_for_each_peer_begin(devcom, MLX5_DEVCOM_ESW_OFFLOADS))
+	if (!mlx5_devcom_for_each_peer_begin(devcom))
 		return ERR_PTR(-ENODEV);
 
-	mlx5_devcom_for_each_peer_entry(devcom,
-					MLX5_DEVCOM_ESW_OFFLOADS,
-					tmp, i) {
+	mlx5_devcom_for_each_peer_entry(devcom, tmp, pos) {
 		if (mlx5_esw_is_owner(tmp, port->vport_num, port->esw_owner_vhca_id)) {
 			peer_esw = tmp;
 			break;
 		}
 	}
+
 	if (!peer_esw) {
-		mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
-		return ERR_PTR(-ENODEV);
+		handle = ERR_PTR(-ENODEV);
+		goto out;
 	}
 
 	handle = mlx5_esw_bridge_mcast_flow_with_esw_create(port, peer_esw);
 
-	mlx5_devcom_for_each_peer_end(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+out:
+	mlx5_devcom_for_each_peer_end(devcom);
 	return handle;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index ae0dc8a3060d..6d9378b0bce5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -354,6 +354,7 @@ struct mlx5_eswitch {
 	}  params;
 	struct blocking_notifier_head n_head;
 	struct xarray paired;
+	struct mlx5_devcom_comp_dev *devcom;
 };
 
 void esw_offloads_disable(struct mlx5_eswitch *esw);
@@ -383,6 +384,7 @@ void mlx5_eswitch_disable_locked(struct mlx5_eswitch *esw);
 void mlx5_eswitch_disable(struct mlx5_eswitch *esw);
 void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw);
 void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw);
+bool mlx5_esw_offloads_devcom_is_ready(struct mlx5_eswitch *esw);
 int mlx5_eswitch_set_vport_mac(struct mlx5_eswitch *esw,
 			       u16 vport, const u8 *mac);
 int mlx5_eswitch_set_vport_state(struct mlx5_eswitch *esw,
@@ -818,6 +820,7 @@ static inline void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw, bool cle
 static inline void mlx5_eswitch_disable(struct mlx5_eswitch *esw) {}
 static inline void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw) {}
 static inline void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw) {}
+static inline bool mlx5_esw_offloads_devcom_is_ready(struct mlx5_eswitch *esw) { return false; }
 static inline bool mlx5_eswitch_is_funcs_handler(struct mlx5_core_dev *dev) { return false; }
 static inline
 int mlx5_eswitch_set_vport_state(struct mlx5_eswitch *esw, u16 vport, int link_state) { return 0; }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index bdfe609cc9ec..11cce630c1b8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2811,7 +2811,6 @@ static int mlx5_esw_offloads_devcom_event(int event,
 					  void *event_data)
 {
 	struct mlx5_eswitch *esw = my_data;
-	struct mlx5_devcom *devcom = esw->dev->priv.devcom;
 	struct mlx5_eswitch *peer_esw = event_data;
 	u16 esw_i, peer_esw_i;
 	bool esw_paired;
@@ -2833,6 +2832,7 @@ static int mlx5_esw_offloads_devcom_event(int event,
 		err = mlx5_esw_offloads_set_ns_peer(esw, peer_esw, true);
 		if (err)
 			goto err_out;
+
 		err = mlx5_esw_offloads_pair(esw, peer_esw);
 		if (err)
 			goto err_peer;
@@ -2851,7 +2851,7 @@ static int mlx5_esw_offloads_devcom_event(int event,
 
 		esw->num_peers++;
 		peer_esw->num_peers++;
-		mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, true);
+		mlx5_devcom_comp_set_ready(esw->devcom, true);
 		break;
 
 	case ESW_OFFLOADS_DEVCOM_UNPAIR:
@@ -2861,7 +2861,7 @@ static int mlx5_esw_offloads_devcom_event(int event,
 		peer_esw->num_peers--;
 		esw->num_peers--;
 		if (!esw->num_peers && !peer_esw->num_peers)
-			mlx5_devcom_comp_set_ready(devcom, MLX5_DEVCOM_ESW_OFFLOADS, false);
+			mlx5_devcom_comp_set_ready(esw->devcom, false);
 		xa_erase(&peer_esw->paired, esw_i);
 		xa_erase(&esw->paired, peer_esw_i);
 		mlx5_esw_offloads_unpair(peer_esw, esw);
@@ -2888,7 +2888,7 @@ static int mlx5_esw_offloads_devcom_event(int event,
 
 void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
 {
-	struct mlx5_devcom *devcom = esw->dev->priv.devcom;
+	u64 guid;
 	int i;
 
 	for (i = 0; i < MLX5_MAX_PORTS; i++)
@@ -2902,34 +2902,41 @@ void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
 		return;
 
 	xa_init(&esw->paired);
-	mlx5_devcom_register_component(devcom,
-				       MLX5_DEVCOM_ESW_OFFLOADS,
-				       mlx5_esw_offloads_devcom_event,
-				       esw);
+	guid = mlx5_query_nic_system_image_guid(esw->dev);
 
 	esw->num_peers = 0;
-	mlx5_devcom_send_event(devcom,
-			       MLX5_DEVCOM_ESW_OFFLOADS,
+	esw->devcom = mlx5_devcom_register_component(esw->dev->priv.devc,
+						     MLX5_DEVCOM_ESW_OFFLOADS,
+						     guid,
+						     mlx5_esw_offloads_devcom_event,
+						     esw);
+	if (IS_ERR_OR_NULL(esw->devcom))
+		return;
+
+	mlx5_devcom_send_event(esw->devcom,
 			       ESW_OFFLOADS_DEVCOM_PAIR,
-			       ESW_OFFLOADS_DEVCOM_UNPAIR, esw);
+			       ESW_OFFLOADS_DEVCOM_UNPAIR,
+			       esw);
 }
 
 void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw)
 {
-	struct mlx5_devcom *devcom = esw->dev->priv.devcom;
-
-	if (!MLX5_CAP_ESW(esw->dev, merged_eswitch))
+	if (IS_ERR_OR_NULL(esw->devcom))
 		return;
 
-	if (!mlx5_lag_is_supported(esw->dev))
-		return;
-
-	mlx5_devcom_send_event(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
+	mlx5_devcom_send_event(esw->devcom,
+			       ESW_OFFLOADS_DEVCOM_UNPAIR,
 			       ESW_OFFLOADS_DEVCOM_UNPAIR,
-			       ESW_OFFLOADS_DEVCOM_UNPAIR, esw);
+			       esw);
 
-	mlx5_devcom_unregister_component(devcom, MLX5_DEVCOM_ESW_OFFLOADS);
+	mlx5_devcom_unregister_component(esw->devcom);
 	xa_destroy(&esw->paired);
+	esw->devcom = NULL;
+}
+
+bool mlx5_esw_offloads_devcom_is_ready(struct mlx5_eswitch *esw)
+{
+	return mlx5_devcom_comp_is_ready(esw->devcom);
 }
 
 bool mlx5_esw_vport_match_metadata_supported(const struct mlx5_eswitch *esw)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 900a18883f28..af3fac090b82 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -835,7 +835,7 @@ static bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
 	dev = ldev->pf[MLX5_LAG_P1].dev;
 	if (is_mdev_switchdev_mode(dev) &&
 	    mlx5_eswitch_vport_match_metadata_enabled(dev->priv.eswitch) &&
-	    mlx5_devcom_comp_is_ready(dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS) &&
+	    mlx5_esw_offloads_devcom_is_ready(dev->priv.eswitch) &&
 	    MLX5_CAP_ESW(dev, esw_shared_ingress_acl) &&
 	    mlx5_eswitch_get_npeers(dev->priv.eswitch) == MLX5_CAP_GEN(dev, num_lag_ports) - 1)
 		return true;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
index 78c94b22bdc0..feb62d952643 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c
@@ -2,214 +2,273 @@
 /* Copyright (c) 2018 Mellanox Technologies */
 
 #include <linux/mlx5/vport.h>
+#include <linux/list.h>
 #include "lib/devcom.h"
 #include "mlx5_core.h"
 
-static LIST_HEAD(devcom_list);
+static LIST_HEAD(devcom_dev_list);
+static LIST_HEAD(devcom_comp_list);
+/* protect device list */
+static DEFINE_MUTEX(dev_list_lock);
+/* protect component list */
+static DEFINE_MUTEX(comp_list_lock);
 
-#define devcom_for_each_component(priv, comp, iter) \
-	for (iter = 0; \
-	     comp = &(priv)->components[iter], iter < MLX5_DEVCOM_NUM_COMPONENTS; \
-	     iter++)
+#define devcom_for_each_component(iter) \
+	list_for_each_entry(iter, &devcom_comp_list, comp_list)
 
-struct mlx5_devcom_component {
-	struct {
-		void __rcu *data;
-	} device[MLX5_DEVCOM_PORTS_SUPPORTED];
+struct mlx5_devcom_dev {
+	struct list_head list;
+	struct mlx5_core_dev *dev;
+	struct kref ref;
+};
 
+struct mlx5_devcom_comp {
+	struct list_head comp_list;
+	enum mlx5_devcom_component id;
+	u64 key;
+	struct list_head comp_dev_list_head;
 	mlx5_devcom_event_handler_t handler;
-	struct rw_semaphore sem;
+	struct kref ref;
 	bool ready;
+	struct rw_semaphore sem;
 };
 
-struct mlx5_devcom_list {
+struct mlx5_devcom_comp_dev {
 	struct list_head list;
-
-	struct mlx5_devcom_component components[MLX5_DEVCOM_NUM_COMPONENTS];
-	struct mlx5_core_dev *devs[MLX5_DEVCOM_PORTS_SUPPORTED];
+	struct mlx5_devcom_comp *comp;
+	struct mlx5_devcom_dev *devc;
+	void __rcu *data;
 };
 
-struct mlx5_devcom {
-	struct mlx5_devcom_list *priv;
-	int idx;
-};
-
-static struct mlx5_devcom_list *mlx5_devcom_list_alloc(void)
+static bool devcom_dev_exists(struct mlx5_core_dev *dev)
 {
-	struct mlx5_devcom_component *comp;
-	struct mlx5_devcom_list *priv;
-	int i;
-
-	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
-	if (!priv)
-		return NULL;
+	struct mlx5_devcom_dev *iter;
 
-	devcom_for_each_component(priv, comp, i)
-		init_rwsem(&comp->sem);
+	list_for_each_entry(iter, &devcom_dev_list, list)
+		if (iter->dev == dev)
+			return true;
 
-	return priv;
+	return false;
 }
 
-static struct mlx5_devcom *mlx5_devcom_alloc(struct mlx5_devcom_list *priv,
-					     u8 idx)
+static struct mlx5_devcom_dev *
+mlx5_devcom_dev_alloc(struct mlx5_core_dev *dev)
 {
-	struct mlx5_devcom *devcom;
+	struct mlx5_devcom_dev *devc;
 
-	devcom = kzalloc(sizeof(*devcom), GFP_KERNEL);
-	if (!devcom)
+	devc = kzalloc(sizeof(*devc), GFP_KERNEL);
+	if (!devc)
 		return NULL;
 
-	devcom->priv = priv;
-	devcom->idx = idx;
-	return devcom;
+	devc->dev = dev;
+	kref_init(&devc->ref);
+	return devc;
 }
 
-/* Must be called with intf_mutex held */
-struct mlx5_devcom *mlx5_devcom_register_device(struct mlx5_core_dev *dev)
+struct mlx5_devcom_dev *
+mlx5_devcom_register_device(struct mlx5_core_dev *dev)
 {
-	struct mlx5_devcom_list *priv = NULL, *iter;
-	struct mlx5_devcom *devcom = NULL;
-	bool new_priv = false;
-	u64 sguid0, sguid1;
-	int idx, i;
-
-	if (!mlx5_core_is_pf(dev))
-		return NULL;
-	if (MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_DEVCOM_PORTS_SUPPORTED)
-		return NULL;
-
-	mlx5_dev_list_lock();
-	sguid0 = mlx5_query_nic_system_image_guid(dev);
-	list_for_each_entry(iter, &devcom_list, list) {
-		/* There is at least one device in iter */
-		struct mlx5_core_dev *tmp_dev;
-
-		idx = -1;
-		for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++) {
-			if (iter->devs[i])
-				tmp_dev = iter->devs[i];
-			else
-				idx = i;
-		}
-
-		if (idx == -1)
-			continue;
-
-		sguid1 = mlx5_query_nic_system_image_guid(tmp_dev);
-		if (sguid0 != sguid1)
-			continue;
-
-		priv = iter;
-		break;
-	}
+	struct mlx5_devcom_dev *devc;
 
-	if (!priv) {
-		priv = mlx5_devcom_list_alloc();
-		if (!priv) {
-			devcom = ERR_PTR(-ENOMEM);
-			goto out;
-		}
+	mutex_lock(&dev_list_lock);
 
-		idx = 0;
-		new_priv = true;
+	if (devcom_dev_exists(dev)) {
+		devc = ERR_PTR(-EEXIST);
+		goto out;
 	}
 
-	priv->devs[idx] = dev;
-	devcom = mlx5_devcom_alloc(priv, idx);
-	if (!devcom) {
-		if (new_priv)
-			kfree(priv);
-		devcom = ERR_PTR(-ENOMEM);
+	devc = mlx5_devcom_dev_alloc(dev);
+	if (!devc) {
+		devc = ERR_PTR(-ENOMEM);
 		goto out;
 	}
 
-	if (new_priv)
-		list_add(&priv->list, &devcom_list);
+	list_add_tail(&devc->list, &devcom_dev_list);
 out:
-	mlx5_dev_list_unlock();
-	return devcom;
+	mutex_unlock(&dev_list_lock);
+	return devc;
 }
 
-/* Must be called with intf_mutex held */
-void mlx5_devcom_unregister_device(struct mlx5_devcom *devcom)
+static void
+mlx5_devcom_dev_release(struct kref *ref)
 {
-	struct mlx5_devcom_list *priv;
-	int i;
+	struct mlx5_devcom_dev *devc = container_of(ref, struct mlx5_devcom_dev, ref);
 
-	if (IS_ERR_OR_NULL(devcom))
-		return;
+	mutex_lock(&dev_list_lock);
+	list_del(&devc->list);
+	mutex_unlock(&dev_list_lock);
+	kfree(devc);
+}
 
-	mlx5_dev_list_lock();
-	priv = devcom->priv;
-	priv->devs[devcom->idx] = NULL;
+void mlx5_devcom_unregister_device(struct mlx5_devcom_dev *devc)
+{
+	if (!IS_ERR_OR_NULL(devc))
+		kref_put(&devc->ref, mlx5_devcom_dev_release);
+}
 
-	kfree(devcom);
+static struct mlx5_devcom_comp *
+mlx5_devcom_comp_alloc(u64 id, u64 key, mlx5_devcom_event_handler_t handler)
+{
+	struct mlx5_devcom_comp *comp;
 
-	for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++)
-		if (priv->devs[i])
-			break;
+	comp = kzalloc(sizeof(*comp), GFP_KERNEL);
+	if (!comp)
+		return ERR_PTR(-ENOMEM);
 
-	if (i != MLX5_DEVCOM_PORTS_SUPPORTED)
-		goto out;
+	comp->id = id;
+	comp->key = key;
+	comp->handler = handler;
+	init_rwsem(&comp->sem);
+	kref_init(&comp->ref);
+	INIT_LIST_HEAD(&comp->comp_dev_list_head);
 
-	list_del(&priv->list);
-	kfree(priv);
-out:
-	mlx5_dev_list_unlock();
+	return comp;
 }
 
-void mlx5_devcom_register_component(struct mlx5_devcom *devcom,
-				    enum mlx5_devcom_components id,
-				    mlx5_devcom_event_handler_t handler,
-				    void *data)
+static void
+mlx5_devcom_comp_release(struct kref *ref)
 {
-	struct mlx5_devcom_component *comp;
+	struct mlx5_devcom_comp *comp = container_of(ref, struct mlx5_devcom_comp, ref);
 
-	if (IS_ERR_OR_NULL(devcom))
-		return;
+	mutex_lock(&comp_list_lock);
+	list_del(&comp->comp_list);
+	mutex_unlock(&comp_list_lock);
+	kfree(comp);
+}
+
+static struct mlx5_devcom_comp_dev *
+devcom_alloc_comp_dev(struct mlx5_devcom_dev *devc,
+		      struct mlx5_devcom_comp *comp,
+		      void *data)
+{
+	struct mlx5_devcom_comp_dev *devcom;
 
-	WARN_ON(!data);
+	devcom = kzalloc(sizeof(*devcom), GFP_KERNEL);
+	if (!devcom)
+		return ERR_PTR(-ENOMEM);
+
+	kref_get(&devc->ref);
+	devcom->devc = devc;
+	devcom->comp = comp;
+	rcu_assign_pointer(devcom->data, data);
 
-	comp = &devcom->priv->components[id];
 	down_write(&comp->sem);
-	comp->handler = handler;
-	rcu_assign_pointer(comp->device[devcom->idx].data, data);
+	list_add_tail(&devcom->list, &comp->comp_dev_list_head);
 	up_write(&comp->sem);
+
+	return devcom;
 }
 
-void mlx5_devcom_unregister_component(struct mlx5_devcom *devcom,
-				      enum mlx5_devcom_components id)
+static void
+devcom_free_comp_dev(struct mlx5_devcom_comp_dev *devcom)
 {
-	struct mlx5_devcom_component *comp;
-
-	if (IS_ERR_OR_NULL(devcom))
-		return;
+	struct mlx5_devcom_comp *comp = devcom->comp;
 
-	comp = &devcom->priv->components[id];
 	down_write(&comp->sem);
-	RCU_INIT_POINTER(comp->device[devcom->idx].data, NULL);
+	list_del(&devcom->list);
 	up_write(&comp->sem);
-	synchronize_rcu();
+
+	kref_put(&devcom->devc->ref, mlx5_devcom_dev_release);
+	kfree(devcom);
+	kref_put(&comp->ref, mlx5_devcom_comp_release);
 }
 
-int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
-			   enum mlx5_devcom_components id,
+static bool
+devcom_component_equal(struct mlx5_devcom_comp *devcom,
+		       enum mlx5_devcom_component id,
+		       u64 key)
+{
+	return devcom->id == id && devcom->key == key;
+}
+
+static struct mlx5_devcom_comp *
+devcom_component_get(struct mlx5_devcom_dev *devc,
+		     enum mlx5_devcom_component id,
+		     u64 key,
+		     mlx5_devcom_event_handler_t handler)
+{
+	struct mlx5_devcom_comp *comp;
+
+	devcom_for_each_component(comp) {
+		if (devcom_component_equal(comp, id, key)) {
+			if (handler == comp->handler) {
+				kref_get(&comp->ref);
+				return comp;
+			}
+
+			mlx5_core_err(devc->dev,
+				      "Cannot register existing devcom component with different handler\n");
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	return NULL;
+}
+
+struct mlx5_devcom_comp_dev *
+mlx5_devcom_register_component(struct mlx5_devcom_dev *devc,
+			       enum mlx5_devcom_component id,
+			       u64 key,
+			       mlx5_devcom_event_handler_t handler,
+			       void *data)
+{
+	struct mlx5_devcom_comp_dev *devcom;
+	struct mlx5_devcom_comp *comp;
+
+	if (IS_ERR_OR_NULL(devc))
+		return NULL;
+
+	mutex_lock(&comp_list_lock);
+	comp = devcom_component_get(devc, id, key, handler);
+	if (IS_ERR(comp)) {
+		devcom = ERR_PTR(-EINVAL);
+		goto out_unlock;
+	}
+
+	if (!comp) {
+		comp = mlx5_devcom_comp_alloc(id, key, handler);
+		if (IS_ERR(comp)) {
+			devcom = ERR_CAST(comp);
+			goto out_unlock;
+		}
+		list_add_tail(&comp->comp_list, &devcom_comp_list);
+	}
+	mutex_unlock(&comp_list_lock);
+
+	devcom = devcom_alloc_comp_dev(devc, comp, data);
+	if (IS_ERR(devcom))
+		kref_put(&comp->ref, mlx5_devcom_comp_release);
+
+	return devcom;
+
+out_unlock:
+	mutex_unlock(&comp_list_lock);
+	return devcom;
+}
+
+void mlx5_devcom_unregister_component(struct mlx5_devcom_comp_dev *devcom)
+{
+	if (!IS_ERR_OR_NULL(devcom))
+		devcom_free_comp_dev(devcom);
+}
+
+int mlx5_devcom_send_event(struct mlx5_devcom_comp_dev *devcom,
 			   int event, int rollback_event,
 			   void *event_data)
 {
-	struct mlx5_devcom_component *comp;
-	int err = -ENODEV, i;
+	struct mlx5_devcom_comp *comp = devcom->comp;
+	struct mlx5_devcom_comp_dev *pos;
+	int err = 0;
+	void *data;
 
 	if (IS_ERR_OR_NULL(devcom))
-		return err;
+		return -ENODEV;
 
-	comp = &devcom->priv->components[id];
 	down_write(&comp->sem);
-	for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++) {
-		void *data = rcu_dereference_protected(comp->device[i].data,
-						       lockdep_is_held(&comp->sem));
+	list_for_each_entry(pos, &comp->comp_dev_list_head, list) {
+		data = rcu_dereference_protected(pos->data, lockdep_is_held(&comp->sem));
 
-		if (i != devcom->idx && data) {
+		if (pos != devcom && data) {
 			err = comp->handler(event, data, event_data);
 			if (err)
 				goto rollback;
@@ -220,48 +279,43 @@ int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
 	return 0;
 
 rollback:
-	while (i--) {
-		void *data = rcu_dereference_protected(comp->device[i].data,
-						       lockdep_is_held(&comp->sem));
+	if (list_entry_is_head(pos, &comp->comp_dev_list_head, list))
+		goto out;
+	pos = list_prev_entry(pos, list);
+	list_for_each_entry_from_reverse(pos, &comp->comp_dev_list_head, list) {
+		data = rcu_dereference_protected(pos->data, lockdep_is_held(&comp->sem));
 
-		if (i != devcom->idx && data)
+		if (pos != devcom && data)
 			comp->handler(rollback_event, data, event_data);
 	}
-
+out:
 	up_write(&comp->sem);
 	return err;
 }
 
-void mlx5_devcom_comp_set_ready(struct mlx5_devcom *devcom,
-				enum mlx5_devcom_components id,
-				bool ready)
+void mlx5_devcom_comp_set_ready(struct mlx5_devcom_comp_dev *devcom, bool ready)
 {
-	struct mlx5_devcom_component *comp;
-
-	comp = &devcom->priv->components[id];
-	WARN_ON(!rwsem_is_locked(&comp->sem));
+	WARN_ON(!rwsem_is_locked(&devcom->comp->sem));
 
-	WRITE_ONCE(comp->ready, ready);
+	WRITE_ONCE(devcom->comp->ready, ready);
 }
 
-bool mlx5_devcom_comp_is_ready(struct mlx5_devcom *devcom,
-			       enum mlx5_devcom_components id)
+bool mlx5_devcom_comp_is_ready(struct mlx5_devcom_comp_dev *devcom)
 {
 	if (IS_ERR_OR_NULL(devcom))
 		return false;
 
-	return READ_ONCE(devcom->priv->components[id].ready);
+	return READ_ONCE(devcom->comp->ready);
 }
 
-bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom *devcom,
-				     enum mlx5_devcom_components id)
+bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom_comp_dev *devcom)
 {
-	struct mlx5_devcom_component *comp;
+	struct mlx5_devcom_comp *comp;
 
 	if (IS_ERR_OR_NULL(devcom))
 		return false;
 
-	comp = &devcom->priv->components[id];
+	comp = devcom->comp;
 	down_read(&comp->sem);
 	if (!READ_ONCE(comp->ready)) {
 		up_read(&comp->sem);
@@ -271,74 +325,60 @@ bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom *devcom,
 	return true;
 }
 
-void mlx5_devcom_for_each_peer_end(struct mlx5_devcom *devcom,
-				   enum mlx5_devcom_components id)
+void mlx5_devcom_for_each_peer_end(struct mlx5_devcom_comp_dev *devcom)
 {
-	struct mlx5_devcom_component *comp = &devcom->priv->components[id];
-
-	up_read(&comp->sem);
+	up_read(&devcom->comp->sem);
 }
 
-void *mlx5_devcom_get_next_peer_data(struct mlx5_devcom *devcom,
-				     enum mlx5_devcom_components id,
-				     int *i)
+void *mlx5_devcom_get_next_peer_data(struct mlx5_devcom_comp_dev *devcom,
+				     struct mlx5_devcom_comp_dev **pos)
 {
-	struct mlx5_devcom_component *comp;
-	void *ret;
-	int idx;
+	struct mlx5_devcom_comp *comp = devcom->comp;
+	struct mlx5_devcom_comp_dev *tmp;
+	void *data;
 
-	comp = &devcom->priv->components[id];
+	tmp = list_prepare_entry(*pos, &comp->comp_dev_list_head, list);
 
-	if (*i == MLX5_DEVCOM_PORTS_SUPPORTED)
-		return NULL;
-	for (idx = *i; idx < MLX5_DEVCOM_PORTS_SUPPORTED; idx++) {
-		if (idx != devcom->idx) {
-			ret = rcu_dereference_protected(comp->device[idx].data,
-							lockdep_is_held(&comp->sem));
-			if (ret)
+	list_for_each_entry_continue(tmp, &comp->comp_dev_list_head, list) {
+		if (tmp != devcom) {
+			data = rcu_dereference_protected(tmp->data, lockdep_is_held(&comp->sem));
+			if (data)
 				break;
 		}
 	}
 
-	if (idx == MLX5_DEVCOM_PORTS_SUPPORTED) {
-		*i = idx;
+	if (list_entry_is_head(tmp, &comp->comp_dev_list_head, list))
 		return NULL;
-	}
-	*i = idx + 1;
 
-	return ret;
+	*pos = tmp;
+	return data;
 }
 
-void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom *devcom,
-					 enum mlx5_devcom_components id,
-					 int *i)
+void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom_comp_dev *devcom,
+					 struct mlx5_devcom_comp_dev **pos)
 {
-	struct mlx5_devcom_component *comp;
-	void *ret;
-	int idx;
+	struct mlx5_devcom_comp *comp = devcom->comp;
+	struct mlx5_devcom_comp_dev *tmp;
+	void *data;
 
-	comp = &devcom->priv->components[id];
+	tmp = list_prepare_entry(*pos, &comp->comp_dev_list_head, list);
 
-	if (*i == MLX5_DEVCOM_PORTS_SUPPORTED)
-		return NULL;
-	for (idx = *i; idx < MLX5_DEVCOM_PORTS_SUPPORTED; idx++) {
-		if (idx != devcom->idx) {
+	list_for_each_entry_continue(tmp, &comp->comp_dev_list_head, list) {
+		if (tmp != devcom) {
 			/* This can change concurrently, however 'data' pointer will remain
 			 * valid for the duration of RCU read section.
 			 */
 			if (!READ_ONCE(comp->ready))
 				return NULL;
-			ret = rcu_dereference(comp->device[idx].data);
-			if (ret)
+			data = rcu_dereference(tmp->data);
+			if (data)
 				break;
 		}
 	}
 
-	if (idx == MLX5_DEVCOM_PORTS_SUPPORTED) {
-		*i = idx;
+	if (list_entry_is_head(tmp, &comp->comp_dev_list_head, list))
 		return NULL;
-	}
-	*i = idx + 1;
 
-	return ret;
+	*pos = tmp;
+	return data;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
index d953a01b8eaa..8389ac0af708 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h
@@ -6,11 +6,8 @@
 
 #include <linux/mlx5/driver.h>
 
-#define MLX5_DEVCOM_PORTS_SUPPORTED 4
-
-enum mlx5_devcom_components {
+enum mlx5_devcom_component {
 	MLX5_DEVCOM_ESW_OFFLOADS,
-
 	MLX5_DEVCOM_NUM_COMPONENTS,
 };
 
@@ -18,45 +15,40 @@ typedef int (*mlx5_devcom_event_handler_t)(int event,
 					   void *my_data,
 					   void *event_data);
 
-struct mlx5_devcom *mlx5_devcom_register_device(struct mlx5_core_dev *dev);
-void mlx5_devcom_unregister_device(struct mlx5_devcom *devcom);
+struct mlx5_devcom_dev *mlx5_devcom_register_device(struct mlx5_core_dev *dev);
+void mlx5_devcom_unregister_device(struct mlx5_devcom_dev *devc);
 
-void mlx5_devcom_register_component(struct mlx5_devcom *devcom,
-				    enum mlx5_devcom_components id,
-				    mlx5_devcom_event_handler_t handler,
-				    void *data);
-void mlx5_devcom_unregister_component(struct mlx5_devcom *devcom,
-				      enum mlx5_devcom_components id);
+struct mlx5_devcom_comp_dev *
+mlx5_devcom_register_component(struct mlx5_devcom_dev *devc,
+			       enum mlx5_devcom_component id,
+			       u64 key,
+			       mlx5_devcom_event_handler_t handler,
+			       void *data);
+void mlx5_devcom_unregister_component(struct mlx5_devcom_comp_dev *devcom);
 
-int mlx5_devcom_send_event(struct mlx5_devcom *devcom,
-			   enum mlx5_devcom_components id,
+int mlx5_devcom_send_event(struct mlx5_devcom_comp_dev *devcom,
 			   int event, int rollback_event,
 			   void *event_data);
 
-void mlx5_devcom_comp_set_ready(struct mlx5_devcom *devcom,
-				enum mlx5_devcom_components id,
-				bool ready);
-bool mlx5_devcom_comp_is_ready(struct mlx5_devcom *devcom,
-			       enum mlx5_devcom_components id);
-
-bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom *devcom,
-				     enum mlx5_devcom_components id);
-void mlx5_devcom_for_each_peer_end(struct mlx5_devcom *devcom,
-				   enum mlx5_devcom_components id);
-void *mlx5_devcom_get_next_peer_data(struct mlx5_devcom *devcom,
-				     enum mlx5_devcom_components id, int *i);
-
-#define mlx5_devcom_for_each_peer_entry(devcom, id, data, i)			\
-	for (i = 0, data = mlx5_devcom_get_next_peer_data(devcom, id, &i);	\
-	     data;								\
-	     data = mlx5_devcom_get_next_peer_data(devcom, id, &i))
-
-void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom *devcom,
-					 enum mlx5_devcom_components id, int *i);
-
-#define mlx5_devcom_for_each_peer_entry_rcu(devcom, id, data, i)		\
-	for (i = 0, data = mlx5_devcom_get_next_peer_data_rcu(devcom, id, &i);	\
-	     data;								\
-	     data = mlx5_devcom_get_next_peer_data_rcu(devcom, id, &i))
-
-#endif
+void mlx5_devcom_comp_set_ready(struct mlx5_devcom_comp_dev *devcom, bool ready);
+bool mlx5_devcom_comp_is_ready(struct mlx5_devcom_comp_dev *devcom);
+
+bool mlx5_devcom_for_each_peer_begin(struct mlx5_devcom_comp_dev *devcom);
+void mlx5_devcom_for_each_peer_end(struct mlx5_devcom_comp_dev *devcom);
+void *mlx5_devcom_get_next_peer_data(struct mlx5_devcom_comp_dev *devcom,
+				     struct mlx5_devcom_comp_dev **pos);
+
+#define mlx5_devcom_for_each_peer_entry(devcom, data, pos)                    \
+	for (pos = NULL, data = mlx5_devcom_get_next_peer_data(devcom, &pos); \
+	     data;                                                            \
+	     data = mlx5_devcom_get_next_peer_data(devcom, &pos))
+
+void *mlx5_devcom_get_next_peer_data_rcu(struct mlx5_devcom_comp_dev *devcom,
+					 struct mlx5_devcom_comp_dev **pos);
+
+#define mlx5_devcom_for_each_peer_entry_rcu(devcom, data, pos)                    \
+	for (pos = NULL, data = mlx5_devcom_get_next_peer_data_rcu(devcom, &pos); \
+	     data;								  \
+	     data = mlx5_devcom_get_next_peer_data_rcu(devcom, &pos))
+
+#endif /* __LIB_MLX5_DEVCOM_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 865d028b8abd..5ffa8effe61a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -951,10 +951,10 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 {
 	int err;
 
-	dev->priv.devcom = mlx5_devcom_register_device(dev);
-	if (IS_ERR(dev->priv.devcom))
-		mlx5_core_err(dev, "failed to register with devcom (0x%p)\n",
-			      dev->priv.devcom);
+	dev->priv.devc = mlx5_devcom_register_device(dev);
+	if (IS_ERR(dev->priv.devc))
+		mlx5_core_warn(dev, "failed to register devcom device %ld\n",
+			       PTR_ERR(dev->priv.devc));
 
 	err = mlx5_query_board_id(dev);
 	if (err) {
@@ -1089,7 +1089,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 err_irq_cleanup:
 	mlx5_irq_table_cleanup(dev);
 err_devcom:
-	mlx5_devcom_unregister_device(dev->priv.devcom);
+	mlx5_devcom_unregister_device(dev->priv.devc);
 
 	return err;
 }
@@ -1118,7 +1118,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
 	mlx5_events_cleanup(dev);
 	mlx5_eq_table_cleanup(dev);
 	mlx5_irq_table_cleanup(dev);
-	mlx5_devcom_unregister_device(dev->priv.devcom);
+	mlx5_devcom_unregister_device(dev->priv.devc);
 }
 
 static int mlx5_function_enable(struct mlx5_core_dev *dev, bool boot, u64 timeout)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 7cb1520a27d6..23c0ed57479a 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -502,7 +502,7 @@ struct mlx5_events;
 struct mlx5_mpfs;
 struct mlx5_eswitch;
 struct mlx5_lag;
-struct mlx5_devcom;
+struct mlx5_devcom_dev;
 struct mlx5_fw_reset;
 struct mlx5_eq_table;
 struct mlx5_irq_table;
@@ -619,7 +619,7 @@ struct mlx5_priv {
 	struct mlx5_core_sriov	sriov;
 	struct mlx5_lag		*lag;
 	u32			flags;
-	struct mlx5_devcom	*devcom;
+	struct mlx5_devcom_dev	*devc;
 	struct mlx5_fw_reset	*fw_reset;
 	struct mlx5_core_roce	roce;
 	struct mlx5_fc_stats		fc_stats;
-- 
2.41.0


* [net-next 05/14] net/mlx5e: E-Switch, Register devcom device with switch id key
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 04/14] net/mlx5: Devcom, Infrastructure changes Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 06/14] net/mlx5e: E-Switch, Allow devcom initialization on more vports Saeed Mahameed
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Roi Dayan, Shay Drory

From: Roi Dayan <roid@nvidia.com>

Register devcom devices with switch id instead of guid.
Devcom interface is used to sync between ports in the eswitch,
e.g. Adding miss rules between the ports.
New eswitch devices could have the same guid but a different
switch id so its more correct to group according to switch id
which is the identifier if the ports are on the same eswitch.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c          | 9 +++++++--
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h        | 4 ++--
 .../net/ethernet/mellanox/mlx5/core/eswitch_offloads.c   | 7 ++-----
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 22bc88620653..507825a1abc8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -5194,11 +5194,12 @@ void mlx5e_tc_ht_cleanup(struct rhashtable *tc_ht)
 int mlx5e_tc_esw_init(struct mlx5_rep_uplink_priv *uplink_priv)
 {
 	const size_t sz_enc_opts = sizeof(struct tunnel_match_enc_opts);
+	struct netdev_phys_item_id ppid;
 	struct mlx5e_rep_priv *rpriv;
 	struct mapping_ctx *mapping;
 	struct mlx5_eswitch *esw;
 	struct mlx5e_priv *priv;
-	u64 mapping_id;
+	u64 mapping_id, key;
 	int err = 0;
 
 	rpriv = container_of(uplink_priv, struct mlx5e_rep_priv, uplink_priv);
@@ -5252,7 +5253,11 @@ int mlx5e_tc_esw_init(struct mlx5_rep_uplink_priv *uplink_priv)
 		goto err_action_counter;
 	}
 
-	mlx5_esw_offloads_devcom_init(esw);
+	err = dev_get_port_parent_id(priv->netdev, &ppid, false);
+	if (!err) {
+		memcpy(&key, &ppid.id, sizeof(key));
+		mlx5_esw_offloads_devcom_init(esw, key);
+	}
 
 	return 0;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 6d9378b0bce5..9b5a1651b877 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -382,7 +382,7 @@ int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs);
 void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw, bool clear_vf);
 void mlx5_eswitch_disable_locked(struct mlx5_eswitch *esw);
 void mlx5_eswitch_disable(struct mlx5_eswitch *esw);
-void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw);
+void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw, u64 key);
 void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw);
 bool mlx5_esw_offloads_devcom_is_ready(struct mlx5_eswitch *esw);
 int mlx5_eswitch_set_vport_mac(struct mlx5_eswitch *esw,
@@ -818,7 +818,7 @@ static inline void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw) {}
 static inline int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs) { return 0; }
 static inline void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw, bool clear_vf) {}
 static inline void mlx5_eswitch_disable(struct mlx5_eswitch *esw) {}
-static inline void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw) {}
+static inline void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw, u64 key) {}
 static inline void mlx5_esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw) {}
 static inline bool mlx5_esw_offloads_devcom_is_ready(struct mlx5_eswitch *esw) { return false; }
 static inline bool mlx5_eswitch_is_funcs_handler(struct mlx5_core_dev *dev) { return false; }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 11cce630c1b8..7d100cd4afab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2886,9 +2886,8 @@ static int mlx5_esw_offloads_devcom_event(int event,
 	return err;
 }
 
-void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
+void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw, u64 key)
 {
-	u64 guid;
 	int i;
 
 	for (i = 0; i < MLX5_MAX_PORTS; i++)
@@ -2902,12 +2901,10 @@ void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw)
 		return;
 
 	xa_init(&esw->paired);
-	guid = mlx5_query_nic_system_image_guid(esw->dev);
-
 	esw->num_peers = 0;
 	esw->devcom = mlx5_devcom_register_component(esw->dev->priv.devc,
 						     MLX5_DEVCOM_ESW_OFFLOADS,
-						     guid,
+						     key,
 						     mlx5_esw_offloads_devcom_event,
 						     esw);
 	if (IS_ERR_OR_NULL(esw->devcom))
-- 
2.41.0


* [net-next 06/14] net/mlx5e: E-Switch, Allow devcom initialization on more vports
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 05/14] net/mlx5e: E-Switch, Register devcom device with switch id key Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 07/14] net/mlx5: Re-organize mlx5_cmd struct Saeed Mahameed
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Roi Dayan, Shay Drory

From: Roi Dayan <roid@nvidia.com>

New features could use the devcom interface without necessarily
using the lag feature, so don't require lag support unconditionally;
for vport managers and the ECPF, keep checking for lag support.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 7d100cd4afab..b4b8cb788573 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -2897,7 +2897,8 @@ void mlx5_esw_offloads_devcom_init(struct mlx5_eswitch *esw, u64 key)
 	if (!MLX5_CAP_ESW(esw->dev, merged_eswitch))
 		return;
 
-	if (!mlx5_lag_is_supported(esw->dev))
+	if ((MLX5_VPORT_MANAGER(esw->dev) || mlx5_core_is_ecpf_esw_manager(esw->dev)) &&
+	    !mlx5_lag_is_supported(esw->dev))
 		return;
 
 	xa_init(&esw->paired);
-- 
2.41.0


* [net-next 07/14] net/mlx5: Re-organize mlx5_cmd struct
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 06/14] net/mlx5e: E-Switch, Allow devcom initialization on more vports Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 08/14] net/mlx5: Remove redundant cmdif revision check Saeed Mahameed
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Moshe Shemesh

From: Shay Drory <shayd@nvidia.com>

A downstream patch will split mlx5_cmd_init() into probe and reload
routines. As a preparation, reorganize the mlx5_cmd struct so that
every field used by the reload routine is grouped in a new
nested struct.
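
A minimal sketch of the layout pattern, using hypothetical names (the
real struct is in the include/linux/mlx5/driver.h hunk below):

  struct cmd_iface {
          /* queried from FW / re-initialized on every reload */
          struct {
                  u16             cmdif_rev;
                  u8              log_sz;
                  unsigned long   bitmask;
          } vars;
          /* allocated once at probe time, survives reloads */
          struct dma_pool *pool;
  };

  static void cmd_iface_reset_vars(struct cmd_iface *cmd)
  {
          /* the reload path can now reset the transient state wholesale */
          memset(&cmd->vars, 0, sizeof(cmd->vars));
  }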

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 94 +++++++++----------
 .../net/ethernet/mellanox/mlx5/core/debugfs.c |  4 +-
 include/linux/mlx5/driver.h                   | 21 +++--
 3 files changed, 60 insertions(+), 59 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index d532883b42d7..f175af528fe0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -162,18 +162,18 @@ static int cmd_alloc_index(struct mlx5_cmd *cmd)
 	int ret;
 
 	spin_lock_irqsave(&cmd->alloc_lock, flags);
-	ret = find_first_bit(&cmd->bitmask, cmd->max_reg_cmds);
-	if (ret < cmd->max_reg_cmds)
-		clear_bit(ret, &cmd->bitmask);
+	ret = find_first_bit(&cmd->vars.bitmask, cmd->vars.max_reg_cmds);
+	if (ret < cmd->vars.max_reg_cmds)
+		clear_bit(ret, &cmd->vars.bitmask);
 	spin_unlock_irqrestore(&cmd->alloc_lock, flags);
 
-	return ret < cmd->max_reg_cmds ? ret : -ENOMEM;
+	return ret < cmd->vars.max_reg_cmds ? ret : -ENOMEM;
 }
 
 static void cmd_free_index(struct mlx5_cmd *cmd, int idx)
 {
 	lockdep_assert_held(&cmd->alloc_lock);
-	set_bit(idx, &cmd->bitmask);
+	set_bit(idx, &cmd->vars.bitmask);
 }
 
 static void cmd_ent_get(struct mlx5_cmd_work_ent *ent)
@@ -192,7 +192,7 @@ static void cmd_ent_put(struct mlx5_cmd_work_ent *ent)
 
 	if (ent->idx >= 0) {
 		cmd_free_index(cmd, ent->idx);
-		up(ent->page_queue ? &cmd->pages_sem : &cmd->sem);
+		up(ent->page_queue ? &cmd->vars.pages_sem : &cmd->vars.sem);
 	}
 
 	cmd_free_ent(ent);
@@ -202,7 +202,7 @@ static void cmd_ent_put(struct mlx5_cmd_work_ent *ent)
 
 static struct mlx5_cmd_layout *get_inst(struct mlx5_cmd *cmd, int idx)
 {
-	return cmd->cmd_buf + (idx << cmd->log_stride);
+	return cmd->cmd_buf + (idx << cmd->vars.log_stride);
 }
 
 static int mlx5_calc_cmd_blocks(struct mlx5_cmd_msg *msg)
@@ -974,7 +974,7 @@ static void cmd_work_handler(struct work_struct *work)
 	cb_timeout = msecs_to_jiffies(mlx5_tout_ms(dev, CMD));
 
 	complete(&ent->handling);
-	sem = ent->page_queue ? &cmd->pages_sem : &cmd->sem;
+	sem = ent->page_queue ? &cmd->vars.pages_sem : &cmd->vars.sem;
 	down(sem);
 	if (!ent->page_queue) {
 		alloc_ret = cmd_alloc_index(cmd);
@@ -994,9 +994,9 @@ static void cmd_work_handler(struct work_struct *work)
 		}
 		ent->idx = alloc_ret;
 	} else {
-		ent->idx = cmd->max_reg_cmds;
+		ent->idx = cmd->vars.max_reg_cmds;
 		spin_lock_irqsave(&cmd->alloc_lock, flags);
-		clear_bit(ent->idx, &cmd->bitmask);
+		clear_bit(ent->idx, &cmd->vars.bitmask);
 		spin_unlock_irqrestore(&cmd->alloc_lock, flags);
 	}
 
@@ -1572,15 +1572,15 @@ void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode)
 	struct mlx5_cmd *cmd = &dev->cmd;
 	int i;
 
-	for (i = 0; i < cmd->max_reg_cmds; i++)
-		down(&cmd->sem);
-	down(&cmd->pages_sem);
+	for (i = 0; i < cmd->vars.max_reg_cmds; i++)
+		down(&cmd->vars.sem);
+	down(&cmd->vars.pages_sem);
 
 	cmd->allowed_opcode = opcode;
 
-	up(&cmd->pages_sem);
-	for (i = 0; i < cmd->max_reg_cmds; i++)
-		up(&cmd->sem);
+	up(&cmd->vars.pages_sem);
+	for (i = 0; i < cmd->vars.max_reg_cmds; i++)
+		up(&cmd->vars.sem);
 }
 
 static void mlx5_cmd_change_mod(struct mlx5_core_dev *dev, int mode)
@@ -1588,15 +1588,15 @@ static void mlx5_cmd_change_mod(struct mlx5_core_dev *dev, int mode)
 	struct mlx5_cmd *cmd = &dev->cmd;
 	int i;
 
-	for (i = 0; i < cmd->max_reg_cmds; i++)
-		down(&cmd->sem);
-	down(&cmd->pages_sem);
+	for (i = 0; i < cmd->vars.max_reg_cmds; i++)
+		down(&cmd->vars.sem);
+	down(&cmd->vars.pages_sem);
 
 	cmd->mode = mode;
 
-	up(&cmd->pages_sem);
-	for (i = 0; i < cmd->max_reg_cmds; i++)
-		up(&cmd->sem);
+	up(&cmd->vars.pages_sem);
+	for (i = 0; i < cmd->vars.max_reg_cmds; i++)
+		up(&cmd->vars.sem);
 }
 
 static int cmd_comp_notifier(struct notifier_block *nb,
@@ -1655,7 +1655,7 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
 
 	/* there can be at most 32 command queues */
 	vector = vec & 0xffffffff;
-	for (i = 0; i < (1 << cmd->log_sz); i++) {
+	for (i = 0; i < (1 << cmd->vars.log_sz); i++) {
 		if (test_bit(i, &vector)) {
 			ent = cmd->ent_arr[i];
 
@@ -1744,7 +1744,7 @@ static void mlx5_cmd_trigger_completions(struct mlx5_core_dev *dev)
 	/* wait for pending handlers to complete */
 	mlx5_eq_synchronize_cmd_irq(dev);
 	spin_lock_irqsave(&dev->cmd.alloc_lock, flags);
-	vector = ~dev->cmd.bitmask & ((1ul << (1 << dev->cmd.log_sz)) - 1);
+	vector = ~dev->cmd.vars.bitmask & ((1ul << (1 << dev->cmd.vars.log_sz)) - 1);
 	if (!vector)
 		goto no_trig;
 
@@ -1753,14 +1753,14 @@ static void mlx5_cmd_trigger_completions(struct mlx5_core_dev *dev)
 	 * to guarantee pending commands will not get freed in the meanwhile.
 	 * For that reason, it also has to be done inside the alloc_lock.
 	 */
-	for_each_set_bit(i, &bitmask, (1 << cmd->log_sz))
+	for_each_set_bit(i, &bitmask, (1 << cmd->vars.log_sz))
 		cmd_ent_get(cmd->ent_arr[i]);
 	vector |= MLX5_TRIGGERED_CMD_COMP;
 	spin_unlock_irqrestore(&dev->cmd.alloc_lock, flags);
 
 	mlx5_core_dbg(dev, "vector 0x%llx\n", vector);
 	mlx5_cmd_comp_handler(dev, vector, true);
-	for_each_set_bit(i, &bitmask, (1 << cmd->log_sz))
+	for_each_set_bit(i, &bitmask, (1 << cmd->vars.log_sz))
 		cmd_ent_put(cmd->ent_arr[i]);
 	return;
 
@@ -1773,22 +1773,22 @@ void mlx5_cmd_flush(struct mlx5_core_dev *dev)
 	struct mlx5_cmd *cmd = &dev->cmd;
 	int i;
 
-	for (i = 0; i < cmd->max_reg_cmds; i++) {
-		while (down_trylock(&cmd->sem)) {
+	for (i = 0; i < cmd->vars.max_reg_cmds; i++) {
+		while (down_trylock(&cmd->vars.sem)) {
 			mlx5_cmd_trigger_completions(dev);
 			cond_resched();
 		}
 	}
 
-	while (down_trylock(&cmd->pages_sem)) {
+	while (down_trylock(&cmd->vars.pages_sem)) {
 		mlx5_cmd_trigger_completions(dev);
 		cond_resched();
 	}
 
 	/* Unlock cmdif */
-	up(&cmd->pages_sem);
-	for (i = 0; i < cmd->max_reg_cmds; i++)
-		up(&cmd->sem);
+	up(&cmd->vars.pages_sem);
+	for (i = 0; i < cmd->vars.max_reg_cmds; i++)
+		up(&cmd->vars.sem);
 }
 
 static struct mlx5_cmd_msg *alloc_msg(struct mlx5_core_dev *dev, int in_size,
@@ -1858,7 +1858,7 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
 		/* atomic context may not sleep */
 		if (callback)
 			return -EINVAL;
-		down(&dev->cmd.throttle_sem);
+		down(&dev->cmd.vars.throttle_sem);
 	}
 
 	pages_queue = is_manage_pages(in);
@@ -1903,7 +1903,7 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
 	free_msg(dev, inb);
 out_up:
 	if (throttle_op)
-		up(&dev->cmd.throttle_sem);
+		up(&dev->cmd.vars.throttle_sem);
 	return err;
 }
 
@@ -2213,16 +2213,16 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 		goto err_free_pool;
 
 	cmd_l = ioread32be(&dev->iseg->cmdq_addr_l_sz) & 0xff;
-	cmd->log_sz = cmd_l >> 4 & 0xf;
-	cmd->log_stride = cmd_l & 0xf;
-	if (1 << cmd->log_sz > MLX5_MAX_COMMANDS) {
+	cmd->vars.log_sz = cmd_l >> 4 & 0xf;
+	cmd->vars.log_stride = cmd_l & 0xf;
+	if (1 << cmd->vars.log_sz > MLX5_MAX_COMMANDS) {
 		mlx5_core_err(dev, "firmware reports too many outstanding commands %d\n",
-			      1 << cmd->log_sz);
+			      1 << cmd->vars.log_sz);
 		err = -EINVAL;
 		goto err_free_page;
 	}
 
-	if (cmd->log_sz + cmd->log_stride > MLX5_ADAPTER_PAGE_SHIFT) {
+	if (cmd->vars.log_sz + cmd->vars.log_stride > MLX5_ADAPTER_PAGE_SHIFT) {
 		mlx5_core_err(dev, "command queue size overflow\n");
 		err = -EINVAL;
 		goto err_free_page;
@@ -2230,13 +2230,13 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 
 	cmd->state = MLX5_CMDIF_STATE_DOWN;
 	cmd->checksum_disabled = 1;
-	cmd->max_reg_cmds = (1 << cmd->log_sz) - 1;
-	cmd->bitmask = (1UL << cmd->max_reg_cmds) - 1;
+	cmd->vars.max_reg_cmds = (1 << cmd->vars.log_sz) - 1;
+	cmd->vars.bitmask = (1UL << cmd->vars.max_reg_cmds) - 1;
 
-	cmd->cmdif_rev = ioread32be(&dev->iseg->cmdif_rev_fw_sub) >> 16;
-	if (cmd->cmdif_rev > CMD_IF_REV) {
+	cmd->vars.cmdif_rev = ioread32be(&dev->iseg->cmdif_rev_fw_sub) >> 16;
+	if (cmd->vars.cmdif_rev > CMD_IF_REV) {
 		mlx5_core_err(dev, "driver does not support command interface version. driver %d, firmware %d\n",
-			      CMD_IF_REV, cmd->cmdif_rev);
+			      CMD_IF_REV, cmd->vars.cmdif_rev);
 		err = -EOPNOTSUPP;
 		goto err_free_page;
 	}
@@ -2246,9 +2246,9 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	for (i = 0; i < MLX5_CMD_OP_MAX; i++)
 		spin_lock_init(&cmd->stats[i].lock);
 
-	sema_init(&cmd->sem, cmd->max_reg_cmds);
-	sema_init(&cmd->pages_sem, 1);
-	sema_init(&cmd->throttle_sem, DIV_ROUND_UP(cmd->max_reg_cmds, 2));
+	sema_init(&cmd->vars.sem, cmd->vars.max_reg_cmds);
+	sema_init(&cmd->vars.pages_sem, 1);
+	sema_init(&cmd->vars.throttle_sem, DIV_ROUND_UP(cmd->vars.max_reg_cmds, 2));
 
 	cmd_h = (u32)((u64)(cmd->dma) >> 32);
 	cmd_l = (u32)(cmd->dma);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
index 2138f28a2931..9a826fb3ca38 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
@@ -176,8 +176,8 @@ static ssize_t slots_read(struct file *filp, char __user *buf, size_t count,
 	int ret;
 
 	cmd = filp->private_data;
-	weight = bitmap_weight(&cmd->bitmask, cmd->max_reg_cmds);
-	field = cmd->max_reg_cmds - weight;
+	weight = bitmap_weight(&cmd->vars.bitmask, cmd->vars.max_reg_cmds);
+	field = cmd->vars.max_reg_cmds - weight;
 	ret = snprintf(tbuf, sizeof(tbuf), "%d\n", field);
 	return simple_read_from_buffer(buf, count, pos, tbuf, ret);
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 23c0ed57479a..b0fd66ed96f8 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -288,18 +288,23 @@ struct mlx5_cmd_stats {
 struct mlx5_cmd {
 	struct mlx5_nb    nb;
 
+	/* members which needs to be queried or reinitialized each reload */
+	struct {
+		u16		cmdif_rev;
+		u8		log_sz;
+		u8		log_stride;
+		int		max_reg_cmds;
+		unsigned long	bitmask;
+		struct semaphore sem;
+		struct semaphore pages_sem;
+		struct semaphore throttle_sem;
+	} vars;
 	enum mlx5_cmdif_state	state;
 	void	       *cmd_alloc_buf;
 	dma_addr_t	alloc_dma;
 	int		alloc_size;
 	void	       *cmd_buf;
 	dma_addr_t	dma;
-	u16		cmdif_rev;
-	u8		log_sz;
-	u8		log_stride;
-	int		max_reg_cmds;
-	int		events;
-	u32 __iomem    *vector;
 
 	/* protect command queue allocations
 	 */
@@ -309,12 +314,8 @@ struct mlx5_cmd {
 	 */
 	spinlock_t	token_lock;
 	u8		token;
-	unsigned long	bitmask;
 	char		wq_name[MLX5_CMD_WQ_MAX_NAME];
 	struct workqueue_struct *wq;
-	struct semaphore sem;
-	struct semaphore pages_sem;
-	struct semaphore throttle_sem;
 	int	mode;
 	u16     allowed_opcode;
 	struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
-- 
2.41.0


* [net-next 08/14] net/mlx5: Remove redundant cmdif revision check
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 07/14] net/mlx5: Re-organize mlx5_cmd struct Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 09/14] net/mlx5: split mlx5_cmd_init() to probe and reload routines Saeed Mahameed
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Moshe Shemesh

From: Shay Drory <shayd@nvidia.com>

mlx5 checks the cmdif revision twice for no reason.
Remove the latter check.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index f175af528fe0..9ced943ebd0d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -2191,16 +2191,15 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	int align = roundup_pow_of_two(size);
 	struct mlx5_cmd *cmd = &dev->cmd;
 	u32 cmd_h, cmd_l;
-	u16 cmd_if_rev;
 	int err;
 	int i;
 
 	memset(cmd, 0, sizeof(*cmd));
-	cmd_if_rev = cmdif_rev(dev);
-	if (cmd_if_rev != CMD_IF_REV) {
+	cmd->vars.cmdif_rev = cmdif_rev(dev);
+	if (cmd->vars.cmdif_rev != CMD_IF_REV) {
 		mlx5_core_err(dev,
 			      "Driver cmdif rev(%d) differs from firmware's(%d)\n",
-			      CMD_IF_REV, cmd_if_rev);
+			      CMD_IF_REV, cmd->vars.cmdif_rev);
 		return -EINVAL;
 	}
 
@@ -2233,14 +2232,6 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	cmd->vars.max_reg_cmds = (1 << cmd->vars.log_sz) - 1;
 	cmd->vars.bitmask = (1UL << cmd->vars.max_reg_cmds) - 1;
 
-	cmd->vars.cmdif_rev = ioread32be(&dev->iseg->cmdif_rev_fw_sub) >> 16;
-	if (cmd->vars.cmdif_rev > CMD_IF_REV) {
-		mlx5_core_err(dev, "driver does not support command interface version. driver %d, firmware %d\n",
-			      CMD_IF_REV, cmd->vars.cmdif_rev);
-		err = -EOPNOTSUPP;
-		goto err_free_page;
-	}
-
 	spin_lock_init(&cmd->alloc_lock);
 	spin_lock_init(&cmd->token_lock);
 	for (i = 0; i < MLX5_CMD_OP_MAX; i++)
-- 
2.41.0


* [net-next 09/14] net/mlx5: split mlx5_cmd_init() to probe and reload routines
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 08/14] net/mlx5: Remove redundant cmdif revision check Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 10/14] net/mlx5: Allocate command stats with xarray Saeed Mahameed
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Moshe Shemesh

From: Shay Drory <shayd@nvidia.com>

There is no need to destroy and reallocate the cmd SW structs during
reload; doing so is needlessly time consuming.
Hence, split mlx5_cmd_init() into probe and reload routines.
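
A rough sketch of how the two halves pair up; the sketch_*() wrappers
are hypothetical, and the real call sites are in the main.c hunks below:

  static int sketch_probe(struct mlx5_core_dev *dev)
  {
          int err;

          err = mlx5_cmd_init(dev);       /* SW structs: pool, msg cache, wq */
          if (err)
                  return err;

          err = mlx5_cmd_enable(dev);     /* per-reload: read cmdif caps, program cmdq */
          if (err)
                  mlx5_cmd_cleanup(dev);
          return err;
  }

  static void sketch_remove(struct mlx5_core_dev *dev)
  {
          mlx5_cmd_disable(dev);          /* per-reload: flush the command workqueue */
          mlx5_cmd_cleanup(dev);          /* frees the SW structs */
  }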

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 121 ++++++++++--------
 .../net/ethernet/mellanox/mlx5/core/main.c    |  15 ++-
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
 3 files changed, 82 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 9ced943ebd0d..45edd5a110c8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1548,7 +1548,6 @@ static void clean_debug_files(struct mlx5_core_dev *dev)
 	if (!mlx5_debugfs_root)
 		return;
 
-	mlx5_cmdif_debugfs_cleanup(dev);
 	debugfs_remove_recursive(dbg->dbg_root);
 }
 
@@ -1563,8 +1562,6 @@ static void create_debugfs_files(struct mlx5_core_dev *dev)
 	debugfs_create_file("out_len", 0600, dbg->dbg_root, dev, &olfops);
 	debugfs_create_u8("status", 0600, dbg->dbg_root, &dbg->status);
 	debugfs_create_file("run", 0200, dbg->dbg_root, dev, &fops);
-
-	mlx5_cmdif_debugfs_init(dev);
 }
 
 void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode)
@@ -2190,19 +2187,10 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	int size = sizeof(struct mlx5_cmd_prot_block);
 	int align = roundup_pow_of_two(size);
 	struct mlx5_cmd *cmd = &dev->cmd;
-	u32 cmd_h, cmd_l;
+	u32 cmd_l;
 	int err;
 	int i;
 
-	memset(cmd, 0, sizeof(*cmd));
-	cmd->vars.cmdif_rev = cmdif_rev(dev);
-	if (cmd->vars.cmdif_rev != CMD_IF_REV) {
-		mlx5_core_err(dev,
-			      "Driver cmdif rev(%d) differs from firmware's(%d)\n",
-			      CMD_IF_REV, cmd->vars.cmdif_rev);
-		return -EINVAL;
-	}
-
 	cmd->pool = dma_pool_create("mlx5_cmd", mlx5_core_dma_dev(dev), size, align, 0);
 	if (!cmd->pool)
 		return -ENOMEM;
@@ -2211,43 +2199,93 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	if (err)
 		goto err_free_pool;
 
+	cmd_l = (u32)(cmd->dma);
+	if (cmd_l & 0xfff) {
+		mlx5_core_err(dev, "invalid command queue address\n");
+		err = -ENOMEM;
+		goto err_cmd_page;
+	}
+	cmd->checksum_disabled = 1;
+
+	spin_lock_init(&cmd->alloc_lock);
+	spin_lock_init(&cmd->token_lock);
+	for (i = 0; i < MLX5_CMD_OP_MAX; i++)
+		spin_lock_init(&cmd->stats[i].lock);
+
+	create_msg_cache(dev);
+
+	set_wqname(dev);
+	cmd->wq = create_singlethread_workqueue(cmd->wq_name);
+	if (!cmd->wq) {
+		mlx5_core_err(dev, "failed to create command workqueue\n");
+		err = -ENOMEM;
+		goto err_cache;
+	}
+
+	mlx5_cmdif_debugfs_init(dev);
+
+	return 0;
+
+err_cache:
+	destroy_msg_cache(dev);
+err_cmd_page:
+	free_cmd_page(dev, cmd);
+err_free_pool:
+	dma_pool_destroy(cmd->pool);
+	return err;
+}
+
+void mlx5_cmd_cleanup(struct mlx5_core_dev *dev)
+{
+	struct mlx5_cmd *cmd = &dev->cmd;
+
+	mlx5_cmdif_debugfs_cleanup(dev);
+	destroy_workqueue(cmd->wq);
+	destroy_msg_cache(dev);
+	free_cmd_page(dev, cmd);
+	dma_pool_destroy(cmd->pool);
+}
+
+int mlx5_cmd_enable(struct mlx5_core_dev *dev)
+{
+	struct mlx5_cmd *cmd = &dev->cmd;
+	u32 cmd_h, cmd_l;
+
+	memset(&cmd->vars, 0, sizeof(cmd->vars));
+	cmd->vars.cmdif_rev = cmdif_rev(dev);
+	if (cmd->vars.cmdif_rev != CMD_IF_REV) {
+		mlx5_core_err(dev,
+			      "Driver cmdif rev(%d) differs from firmware's(%d)\n",
+			      CMD_IF_REV, cmd->vars.cmdif_rev);
+		return -EINVAL;
+	}
+
 	cmd_l = ioread32be(&dev->iseg->cmdq_addr_l_sz) & 0xff;
 	cmd->vars.log_sz = cmd_l >> 4 & 0xf;
 	cmd->vars.log_stride = cmd_l & 0xf;
 	if (1 << cmd->vars.log_sz > MLX5_MAX_COMMANDS) {
 		mlx5_core_err(dev, "firmware reports too many outstanding commands %d\n",
 			      1 << cmd->vars.log_sz);
-		err = -EINVAL;
-		goto err_free_page;
+		return -EINVAL;
 	}
 
 	if (cmd->vars.log_sz + cmd->vars.log_stride > MLX5_ADAPTER_PAGE_SHIFT) {
 		mlx5_core_err(dev, "command queue size overflow\n");
-		err = -EINVAL;
-		goto err_free_page;
+		return -EINVAL;
 	}
 
 	cmd->state = MLX5_CMDIF_STATE_DOWN;
-	cmd->checksum_disabled = 1;
 	cmd->vars.max_reg_cmds = (1 << cmd->vars.log_sz) - 1;
 	cmd->vars.bitmask = (1UL << cmd->vars.max_reg_cmds) - 1;
 
-	spin_lock_init(&cmd->alloc_lock);
-	spin_lock_init(&cmd->token_lock);
-	for (i = 0; i < MLX5_CMD_OP_MAX; i++)
-		spin_lock_init(&cmd->stats[i].lock);
-
 	sema_init(&cmd->vars.sem, cmd->vars.max_reg_cmds);
 	sema_init(&cmd->vars.pages_sem, 1);
 	sema_init(&cmd->vars.throttle_sem, DIV_ROUND_UP(cmd->vars.max_reg_cmds, 2));
 
 	cmd_h = (u32)((u64)(cmd->dma) >> 32);
 	cmd_l = (u32)(cmd->dma);
-	if (cmd_l & 0xfff) {
-		mlx5_core_err(dev, "invalid command queue address\n");
-		err = -ENOMEM;
-		goto err_free_page;
-	}
+	if (WARN_ON(cmd_l & 0xfff))
+		return -EINVAL;
 
 	iowrite32be(cmd_h, &dev->iseg->cmdq_addr_h);
 	iowrite32be(cmd_l, &dev->iseg->cmdq_addr_l_sz);
@@ -2260,40 +2298,17 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	cmd->mode = CMD_MODE_POLLING;
 	cmd->allowed_opcode = CMD_ALLOWED_OPCODE_ALL;
 
-	create_msg_cache(dev);
-
-	set_wqname(dev);
-	cmd->wq = create_singlethread_workqueue(cmd->wq_name);
-	if (!cmd->wq) {
-		mlx5_core_err(dev, "failed to create command workqueue\n");
-		err = -ENOMEM;
-		goto err_cache;
-	}
-
 	create_debugfs_files(dev);
 
 	return 0;
-
-err_cache:
-	destroy_msg_cache(dev);
-
-err_free_page:
-	free_cmd_page(dev, cmd);
-
-err_free_pool:
-	dma_pool_destroy(cmd->pool);
-	return err;
 }
 
-void mlx5_cmd_cleanup(struct mlx5_core_dev *dev)
+void mlx5_cmd_disable(struct mlx5_core_dev *dev)
 {
 	struct mlx5_cmd *cmd = &dev->cmd;
 
 	clean_debug_files(dev);
-	destroy_workqueue(cmd->wq);
-	destroy_msg_cache(dev);
-	free_cmd_page(dev, cmd);
-	dma_pool_destroy(cmd->pool);
+	flush_workqueue(cmd->wq);
 }
 
 void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 5ffa8effe61a..36bc5c40630f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1142,7 +1142,7 @@ static int mlx5_function_enable(struct mlx5_core_dev *dev, bool boot, u64 timeou
 		return err;
 	}
 
-	err = mlx5_cmd_init(dev);
+	err = mlx5_cmd_enable(dev);
 	if (err) {
 		mlx5_core_err(dev, "Failed initializing command interface, aborting\n");
 		return err;
@@ -1196,7 +1196,7 @@ static int mlx5_function_enable(struct mlx5_core_dev *dev, bool boot, u64 timeou
 	mlx5_stop_health_poll(dev, boot);
 err_cmd_cleanup:
 	mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN);
-	mlx5_cmd_cleanup(dev);
+	mlx5_cmd_disable(dev);
 
 	return err;
 }
@@ -1207,7 +1207,7 @@ static void mlx5_function_disable(struct mlx5_core_dev *dev, bool boot)
 	mlx5_core_disable_hca(dev, 0);
 	mlx5_stop_health_poll(dev, boot);
 	mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN);
-	mlx5_cmd_cleanup(dev);
+	mlx5_cmd_disable(dev);
 }
 
 static int mlx5_function_open(struct mlx5_core_dev *dev)
@@ -1796,6 +1796,12 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
 	debugfs_create_file("vhca_id", 0400, priv->dbg.dbg_root, dev, &vhca_id_fops);
 	INIT_LIST_HEAD(&priv->traps);
 
+	err = mlx5_cmd_init(dev);
+	if (err) {
+		mlx5_core_err(dev, "Failed initializing cmdif SW structs, aborting\n");
+		goto err_cmd_init;
+	}
+
 	err = mlx5_tout_init(dev);
 	if (err) {
 		mlx5_core_err(dev, "Failed initializing timeouts, aborting\n");
@@ -1841,6 +1847,8 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx)
 err_health_init:
 	mlx5_tout_cleanup(dev);
 err_timeout_init:
+	mlx5_cmd_cleanup(dev);
+err_cmd_init:
 	debugfs_remove(dev->priv.dbg.dbg_root);
 	mutex_destroy(&priv->pgdir_mutex);
 	mutex_destroy(&priv->alloc_mutex);
@@ -1863,6 +1871,7 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev)
 	mlx5_pagealloc_cleanup(dev);
 	mlx5_health_cleanup(dev);
 	mlx5_tout_cleanup(dev);
+	mlx5_cmd_cleanup(dev);
 	debugfs_remove_recursive(dev->priv.dbg.dbg_root);
 	mutex_destroy(&priv->pgdir_mutex);
 	mutex_destroy(&priv->alloc_mutex);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 6cebc8417282..d54afb07d972 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -179,6 +179,8 @@ int mlx5_query_board_id(struct mlx5_core_dev *dev);
 int mlx5_query_module_num(struct mlx5_core_dev *dev, int *module_num);
 int mlx5_cmd_init(struct mlx5_core_dev *dev);
 void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
+int mlx5_cmd_enable(struct mlx5_core_dev *dev);
+void mlx5_cmd_disable(struct mlx5_core_dev *dev);
 void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
 			enum mlx5_cmdif_state cmdif_state);
 int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, uint32_t *sw_owner_id);
-- 
2.41.0


* [net-next 10/14] net/mlx5: Allocate command stats with xarray
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 09/14] net/mlx5: split mlx5_cmd_init() to probe and reload routines Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 11/14] net/mlx5e: Remove duplicate code for user flow Saeed Mahameed
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Shay Drory, Moshe Shemesh

From: Shay Drory <shayd@nvidia.com>

The command stats are kept in an array with more than 2K entries,
which amounts to ~180KB. This is far more than actually needed, as
only ~190 entries are in use.

Therefore, replace the array with an xarray.
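
A self-contained sketch of the sparse lookup pattern with hypothetical
names (the driver variant below also wires up the per-opcode debugfs
files):

  #include <linux/xarray.h>
  #include <linux/slab.h>

  struct op_stats { u64 n; };

  static DEFINE_XARRAY(op_stats_xa);

  /* allocate an entry only for opcodes that are actually used */
  static struct op_stats *op_stats_get(unsigned long opcode)
  {
          struct op_stats *s = kzalloc(sizeof(*s), GFP_KERNEL);

          if (s && xa_insert(&op_stats_xa, opcode, s, GFP_KERNEL)) {
                  kfree(s);               /* -EBUSY (already there) or -ENOMEM */
                  s = xa_load(&op_stats_xa, opcode);
          }
          return s;
  }

  static void op_stats_destroy(void)
  {
          struct op_stats *s;
          unsigned long i;

          xa_for_each(&op_stats_xa, i, s) /* only the used entries exist */
                  kfree(s);
          xa_destroy(&op_stats_xa);
  }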

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 15 +++++-----
 .../net/ethernet/mellanox/mlx5/core/debugfs.c | 30 ++++++++++++++++++-
 include/linux/mlx5/driver.h                   |  2 +-
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 45edd5a110c8..afb348579577 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1225,8 +1225,8 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in,
 		goto out_free;
 
 	ds = ent->ts2 - ent->ts1;
-	if (ent->op < MLX5_CMD_OP_MAX) {
-		stats = &cmd->stats[ent->op];
+	stats = xa_load(&cmd->stats, ent->op);
+	if (stats) {
 		spin_lock_irq(&stats->lock);
 		stats->sum += ds;
 		++stats->n;
@@ -1695,8 +1695,8 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
 
 			if (ent->callback) {
 				ds = ent->ts2 - ent->ts1;
-				if (ent->op < MLX5_CMD_OP_MAX) {
-					stats = &cmd->stats[ent->op];
+				stats = xa_load(&cmd->stats, ent->op);
+				if (stats) {
 					spin_lock_irqsave(&stats->lock, flags);
 					stats->sum += ds;
 					++stats->n;
@@ -1923,7 +1923,9 @@ static void cmd_status_log(struct mlx5_core_dev *dev, u16 opcode, u8 status,
 	if (!err || !(strcmp(namep, "unknown command opcode")))
 		return;
 
-	stats = &dev->cmd.stats[opcode];
+	stats = xa_load(&dev->cmd.stats, opcode);
+	if (!stats)
+		return;
 	spin_lock_irq(&stats->lock);
 	stats->failed++;
 	if (err < 0)
@@ -2189,7 +2191,6 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	struct mlx5_cmd *cmd = &dev->cmd;
 	u32 cmd_l;
 	int err;
-	int i;
 
 	cmd->pool = dma_pool_create("mlx5_cmd", mlx5_core_dma_dev(dev), size, align, 0);
 	if (!cmd->pool)
@@ -2209,8 +2210,6 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 
 	spin_lock_init(&cmd->alloc_lock);
 	spin_lock_init(&cmd->token_lock);
-	for (i = 0; i < MLX5_CMD_OP_MAX; i++)
-		spin_lock_init(&cmd->stats[i].lock);
 
 	create_msg_cache(dev);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
index 9a826fb3ca38..09652dc89115 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c
@@ -188,6 +188,24 @@ static const struct file_operations slots_fops = {
 	.read	= slots_read,
 };
 
+static struct mlx5_cmd_stats *
+mlx5_cmdif_alloc_stats(struct xarray *stats_xa, int opcode)
+{
+	struct mlx5_cmd_stats *stats = kzalloc(sizeof(*stats), GFP_KERNEL);
+	int err;
+
+	if (!stats)
+		return NULL;
+
+	err = xa_insert(stats_xa, opcode, stats, GFP_KERNEL);
+	if (err) {
+		kfree(stats);
+		return NULL;
+	}
+	spin_lock_init(&stats->lock);
+	return stats;
+}
+
 void mlx5_cmdif_debugfs_init(struct mlx5_core_dev *dev)
 {
 	struct mlx5_cmd_stats *stats;
@@ -200,10 +218,14 @@ void mlx5_cmdif_debugfs_init(struct mlx5_core_dev *dev)
 
 	debugfs_create_file("slots_inuse", 0400, *cmd, &dev->cmd, &slots_fops);
 
+	xa_init(&dev->cmd.stats);
+
 	for (i = 0; i < MLX5_CMD_OP_MAX; i++) {
-		stats = &dev->cmd.stats[i];
 		namep = mlx5_command_str(i);
 		if (strcmp(namep, "unknown command opcode")) {
+			stats = mlx5_cmdif_alloc_stats(&dev->cmd.stats, i);
+			if (!stats)
+				continue;
 			stats->root = debugfs_create_dir(namep, *cmd);
 
 			debugfs_create_file("average", 0400, stats->root, stats,
@@ -224,7 +246,13 @@ void mlx5_cmdif_debugfs_init(struct mlx5_core_dev *dev)
 
 void mlx5_cmdif_debugfs_cleanup(struct mlx5_core_dev *dev)
 {
+	struct mlx5_cmd_stats *stats;
+	unsigned long i;
+
 	debugfs_remove_recursive(dev->priv.dbg.cmdif_debugfs);
+	xa_for_each(&dev->cmd.stats, i, stats)
+		kfree(stats);
+	xa_destroy(&dev->cmd.stats);
 }
 
 void mlx5_cq_debugfs_init(struct mlx5_core_dev *dev)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index b0fd66ed96f8..4e34bdd0f633 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -323,7 +323,7 @@ struct mlx5_cmd {
 	struct mlx5_cmd_debug dbg;
 	struct cmd_msg_cache cache[MLX5_NUM_COMMAND_CACHES];
 	int checksum_disabled;
-	struct mlx5_cmd_stats stats[MLX5_CMD_OP_MAX];
+	struct xarray stats;
 };
 
 struct mlx5_cmd_mailbox {
-- 
2.41.0


* [net-next 11/14] net/mlx5e: Remove duplicate code for user flow
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 10/14] net/mlx5: Allocate command stats with xarray Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 12/14] net/mlx5e: Make flow classification filters static Saeed Mahameed
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Parav Pandit, Roi Dayan

From: Parav Pandit <parav@nvidia.com>

Flow table and priority detection is the same for IP user flows and
other L4 flows. Hence, use the same code for all of these flow types.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
index aac32e505c14..08eb186615c8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
@@ -96,10 +96,6 @@ static struct mlx5e_ethtool_table *get_flow_table(struct mlx5e_priv *priv,
 	case UDP_V4_FLOW:
 	case TCP_V6_FLOW:
 	case UDP_V6_FLOW:
-		max_tuples = ETHTOOL_NUM_L3_L4_FTS;
-		prio = MLX5E_ETHTOOL_L3_L4_PRIO + (max_tuples - num_tuples);
-		eth_ft = &ethtool->l3_l4_ft[prio];
-		break;
 	case IP_USER_FLOW:
 	case IPV6_USER_FLOW:
 		max_tuples = ETHTOOL_NUM_L3_L4_FTS;
-- 
2.41.0
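
For context, this is roughly what the switch in get_flow_table() looks like
once the hunk above is applied. The retained body is assumed to be identical
to the removed one, as the commit message implies; the rest of the function
is elided:

	switch (flow_type) {
	case TCP_V4_FLOW:
	case UDP_V4_FLOW:
	case TCP_V6_FLOW:
	case UDP_V6_FLOW:
	case IP_USER_FLOW:
	case IPV6_USER_FLOW:
		/* One table/priority scheme for every L3/L4 classification flow. */
		max_tuples = ETHTOOL_NUM_L3_L4_FTS;
		prio = MLX5E_ETHTOOL_L3_L4_PRIO + (max_tuples - num_tuples);
		eth_ft = &ethtool->l3_l4_ft[prio];
		break;
	/* ... remaining flow types elided ... */
	}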


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 12/14] net/mlx5e: Make flow classification filters static
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 11/14] net/mlx5e: Remove duplicate code for user flow Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 13/14] net/mlx5: Don't check vport->enabled in port ops Saeed Mahameed
  2023-07-24 22:44 ` [net-next 14/14] net/mlx5: Remove pointless devlink_rate checks Saeed Mahameed
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Parav Pandit, Roi Dayan

From: Parav Pandit <parav@nvidia.com>

The get and set flow classification filter handlers are used in a single
file. Hence, make them static.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h         | 3 ---
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 6 +++---
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index b1807bfb815f..0f8f70b91485 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1167,9 +1167,6 @@ int mlx5e_ethtool_set_link_ksettings(struct mlx5e_priv *priv,
 int mlx5e_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key, u8 *hfunc);
 int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir, const u8 *key,
 		   const u8 hfunc);
-int mlx5e_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
-		    u32 *rule_locs);
-int mlx5e_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd);
 u32 mlx5e_ethtool_get_rxfh_key_size(struct mlx5e_priv *priv);
 u32 mlx5e_ethtool_get_rxfh_indir_size(struct mlx5e_priv *priv);
 int mlx5e_ethtool_get_ts_info(struct mlx5e_priv *priv,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 27861b68ced5..04195a673a6b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -2163,8 +2163,8 @@ static u32 mlx5e_get_priv_flags(struct net_device *netdev)
 	return priv->channels.params.pflags;
 }
 
-int mlx5e_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
-		    u32 *rule_locs)
+static int mlx5e_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
+			   u32 *rule_locs)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
 
@@ -2181,7 +2181,7 @@ int mlx5e_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
 	return mlx5e_ethtool_get_rxnfc(priv, info, rule_locs);
 }
 
-int mlx5e_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
+static int mlx5e_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
 
-- 
2.41.0
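
To illustrate the linkage change: the two handlers are now referenced only
through the ethtool ops table built in the same file, so internal linkage is
enough. A stripped-down sketch; the ops-table layout below is illustrative,
not a copy of the driver source:

/* en_ethtool.c */
static int mlx5e_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
			   u32 *rule_locs);
static int mlx5e_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd);

const struct ethtool_ops mlx5e_ethtool_ops = {
	/* ... */
	.get_rxnfc	= mlx5e_get_rxnfc,
	.set_rxnfc	= mlx5e_set_rxnfc,
	/* ... */
};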


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 13/14] net/mlx5: Don't check vport->enabled in port ops
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 12/14] net/mlx5e: Make flow classification filters static Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  2023-07-24 22:44 ` [net-next 14/14] net/mlx5: Remove pointless devlink_rate checks Saeed Mahameed
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Jiri Pirko, Shay Drory

From: Jiri Pirko <jiri@nvidia.com>

vport->enabled is always set for a vport for which a devlink port is
registered; therefore, the checks in the ops are pointless.
Remove them.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../mellanox/mlx5/core/eswitch_offloads.c     | 28 ++++---------------
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index b4b8cb788573..c192000e3614 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -4125,7 +4125,6 @@ int mlx5_devlink_port_fn_migratable_get(struct devlink_port *port, bool *is_enab
 {
 	struct mlx5_eswitch *esw;
 	struct mlx5_vport *vport;
-	int err = -EOPNOTSUPP;
 
 	esw = mlx5_devlink_eswitch_get(port->devlink);
 	if (IS_ERR(esw))
@@ -4133,7 +4132,7 @@ int mlx5_devlink_port_fn_migratable_get(struct devlink_port *port, bool *is_enab
 
 	if (!MLX5_CAP_GEN(esw->dev, migration)) {
 		NL_SET_ERR_MSG_MOD(extack, "Device doesn't support migration");
-		return err;
+		return -EOPNOTSUPP;
 	}
 
 	vport = mlx5_devlink_port_fn_get_vport(port, esw);
@@ -4143,12 +4142,9 @@ int mlx5_devlink_port_fn_migratable_get(struct devlink_port *port, bool *is_enab
 	}
 
 	mutex_lock(&esw->state_lock);
-	if (vport->enabled) {
-		*is_enabled = vport->info.mig_enabled;
-		err = 0;
-	}
+	*is_enabled = vport->info.mig_enabled;
 	mutex_unlock(&esw->state_lock);
-	return err;
+	return 0;
 }
 
 int mlx5_devlink_port_fn_migratable_set(struct devlink_port *port, bool enable,
@@ -4177,10 +4173,6 @@ int mlx5_devlink_port_fn_migratable_set(struct devlink_port *port, bool enable,
 	}
 
 	mutex_lock(&esw->state_lock);
-	if (!vport->enabled) {
-		NL_SET_ERR_MSG_MOD(extack, "Eswitch vport is disabled");
-		goto out;
-	}
 
 	if (vport->info.mig_enabled == enable) {
 		err = 0;
@@ -4224,7 +4216,6 @@ int mlx5_devlink_port_fn_roce_get(struct devlink_port *port, bool *is_enabled,
 {
 	struct mlx5_eswitch *esw;
 	struct mlx5_vport *vport;
-	int err = -EOPNOTSUPP;
 
 	esw = mlx5_devlink_eswitch_get(port->devlink);
 	if (IS_ERR(esw))
@@ -4237,12 +4228,9 @@ int mlx5_devlink_port_fn_roce_get(struct devlink_port *port, bool *is_enabled,
 	}
 
 	mutex_lock(&esw->state_lock);
-	if (vport->enabled) {
-		*is_enabled = vport->info.roce_enabled;
-		err = 0;
-	}
+	*is_enabled = vport->info.roce_enabled;
 	mutex_unlock(&esw->state_lock);
-	return err;
+	return 0;
 }
 
 int mlx5_devlink_port_fn_roce_set(struct devlink_port *port, bool enable,
@@ -4251,10 +4239,10 @@ int mlx5_devlink_port_fn_roce_set(struct devlink_port *port, bool enable,
 	int query_out_sz = MLX5_ST_SZ_BYTES(query_hca_cap_out);
 	struct mlx5_eswitch *esw;
 	struct mlx5_vport *vport;
-	int err = -EOPNOTSUPP;
 	void *query_ctx;
 	void *hca_caps;
 	u16 vport_num;
+	int err;
 
 	esw = mlx5_devlink_eswitch_get(port->devlink);
 	if (IS_ERR(esw))
@@ -4268,10 +4256,6 @@ int mlx5_devlink_port_fn_roce_set(struct devlink_port *port, bool enable,
 	vport_num = vport->vport;
 
 	mutex_lock(&esw->state_lock);
-	if (!vport->enabled) {
-		NL_SET_ERR_MSG_MOD(extack, "Eswitch vport is disabled");
-		goto out;
-	}
 
 	if (vport->info.roce_enabled == enable) {
 		err = 0;
-- 
2.41.0
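
Pieced together from the hunks above, the migratable get op reduces to a
plain read under the eswitch state lock. The IS_ERR return branches are not
shown in the diff and are filled in here with the conventional pattern, so
treat them as assumptions:

int mlx5_devlink_port_fn_migratable_get(struct devlink_port *port,
					bool *is_enabled,
					struct netlink_ext_ack *extack)
{
	struct mlx5_eswitch *esw;
	struct mlx5_vport *vport;

	esw = mlx5_devlink_eswitch_get(port->devlink);
	if (IS_ERR(esw))
		return PTR_ERR(esw);		/* assumed */

	if (!MLX5_CAP_GEN(esw->dev, migration)) {
		NL_SET_ERR_MSG_MOD(extack, "Device doesn't support migration");
		return -EOPNOTSUPP;
	}

	vport = mlx5_devlink_port_fn_get_vport(port, esw);
	if (IS_ERR(vport))
		return PTR_ERR(vport);		/* assumed */

	/* A registered devlink port implies vport->enabled, hence no check. */
	mutex_lock(&esw->state_lock);
	*is_enabled = vport->info.mig_enabled;
	mutex_unlock(&esw->state_lock);
	return 0;
}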


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [net-next 14/14] net/mlx5: Remove pointless devlink_rate checks
  2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2023-07-24 22:44 ` [net-next 13/14] net/mlx5: Don't check vport->enabled in port ops Saeed Mahameed
@ 2023-07-24 22:44 ` Saeed Mahameed
  13 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-24 22:44 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: Saeed Mahameed, netdev, Tariq Toukan, Jiri Pirko, Shay Drory

From: Jiri Pirko <jiri@nvidia.com>

It is guaranteed that the devlink rate leaf is created during init paths.
No need to check during cleanup. Remove the checks.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/esw/devlink_port.c   | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c
index af779c700278..433541ac36a7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c
@@ -132,10 +132,8 @@ void mlx5_esw_offloads_devlink_port_unregister(struct mlx5_eswitch *esw, u16 vpo
 	if (IS_ERR(vport))
 		return;
 
-	if (vport->dl_port->devlink_rate) {
-		mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
-		devl_rate_leaf_destroy(vport->dl_port);
-	}
+	mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
+	devl_rate_leaf_destroy(vport->dl_port);
 
 	devl_port_unregister(vport->dl_port);
 	mlx5_esw_dl_port_free(vport->dl_port);
@@ -211,10 +209,8 @@ void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num
 	if (IS_ERR(vport))
 		return;
 
-	if (vport->dl_port->devlink_rate) {
-		mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
-		devl_rate_leaf_destroy(vport->dl_port);
-	}
+	mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
+	devl_rate_leaf_destroy(vport->dl_port);
 
 	devl_port_unregister(vport->dl_port);
 	vport->dl_port = NULL;
-- 
2.41.0
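
Since both unregister paths now run the same unconditional two-line
teardown, a tiny shared helper would be one way to express the invariant.
This is purely a hypothetical refactor sketch, not something the patch does:

/* Hypothetical helper: tear down the QoS/rate state that the register
 * path is guaranteed to have set up for this devlink port.
 */
static void mlx5_esw_port_rate_cleanup(struct mlx5_eswitch *esw,
				       struct mlx5_vport *vport)
{
	mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
	devl_rate_leaf_destroy(vport->dl_port);
}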


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API
  2023-07-24 22:44 ` [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API Saeed Mahameed
@ 2023-07-26  3:31   ` Jakub Kicinski
  2023-07-26 21:34     ` Saeed Mahameed
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2023-07-26  3:31 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, Paolo Abeni, Eric Dumazet, Saeed Mahameed,
	netdev, Tariq Toukan, Adham Faris, Gal Pressman

On Mon, 24 Jul 2023 15:44:14 -0700 Saeed Mahameed wrote:
> Expose NIC temperature by implementing the hwmon kernel API, which makes
> the current thermal zone kernel API redundant.
> 
> For each one of the supported and exposed thermal diode sensors, expose
> the following attributes:
> 1) Input temperature.
> 2) Highest temperature.
> 3) Temperature label.
> 4) Temperature critical max value:
>    refers to the high threshold of Warning Event. Will be exposed as
>    `tempY_crit` hwmon attribute (RO attribute). For example, for
>    ConnectX5 HCAs this temperature value will be 105 Celsius (10
>    degrees lower than the HW shutdown temperature).
> 5) Temperature reset history: resets highest temperature.
> 
> For example, a dual-port ConnectX5 NIC with a single IC thermal diode
> sensor will have 2 hwmon directories (one for each PCI function)
> under "/sys/class/hwmon/hwmon[X,Y]".
> 
> Listing one of the directories above (hwmonX/Y) generates the
> corresponding output below:
> 
> $ grep -H -d skip . /sys/class/hwmon/hwmon0/*

I missed it glancing on the series yesterday because it's just 
a warning in pw - we should really get hwmon folks and ML CCed
on this one.
-- 
pw-bot: cr
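
For readers unfamiliar with the hwmon side, below is a minimal,
self-contained sketch of how a driver exposes the attribute set quoted above
(input/highest/crit/label/reset_history) through the hwmon *_with_info API.
It illustrates the kernel API only; the nic_temp_* names, the "asic" label
and the numeric values are placeholders, not the mlx5 implementation:

#include <linux/hwmon.h>

static umode_t nic_temp_is_visible(const void *data,
				   enum hwmon_sensor_types type,
				   u32 attr, int channel)
{
	/* reset_history is write-only (clears the peak), the rest read-only */
	return attr == hwmon_temp_reset_history ? 0200 : 0444;
}

static int nic_temp_read(struct device *dev, enum hwmon_sensor_types type,
			 u32 attr, int channel, long *val)
{
	/* Values would come from device registers; constants are placeholders. */
	switch (attr) {
	case hwmon_temp_input:
		*val = 45000;		/* millidegrees Celsius */
		return 0;
	case hwmon_temp_highest:
		*val = 67000;
		return 0;
	case hwmon_temp_crit:
		*val = 105000;		/* warning threshold */
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

static int nic_temp_write(struct device *dev, enum hwmon_sensor_types type,
			  u32 attr, int channel, long val)
{
	if (attr == hwmon_temp_reset_history)
		return 0;		/* clear the peak register here */
	return -EOPNOTSUPP;
}

static int nic_temp_read_string(struct device *dev,
				enum hwmon_sensor_types type,
				u32 attr, int channel, const char **str)
{
	if (type == hwmon_temp && attr == hwmon_temp_label) {
		*str = "asic";
		return 0;
	}
	return -EOPNOTSUPP;
}

static const struct hwmon_ops nic_temp_hwmon_ops = {
	.is_visible	= nic_temp_is_visible,
	.read		= nic_temp_read,
	.write		= nic_temp_write,
	.read_string	= nic_temp_read_string,
};

static const struct hwmon_channel_info * const nic_temp_info[] = {
	HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT | HWMON_T_HIGHEST |
				 HWMON_T_CRIT | HWMON_T_LABEL |
				 HWMON_T_RESET_HISTORY),
	NULL
};

static const struct hwmon_chip_info nic_temp_chip_info = {
	.ops	= &nic_temp_hwmon_ops,
	.info	= nic_temp_info,
};

/* Registration, typically from probe():
 *	hwmon_dev = devm_hwmon_device_register_with_info(parent_dev, "nic_temp",
 *							  drvdata,
 *							  &nic_temp_chip_info,
 *							  NULL);
 */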

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API
  2023-07-26  3:31   ` Jakub Kicinski
@ 2023-07-26 21:34     ` Saeed Mahameed
  0 siblings, 0 replies; 17+ messages in thread
From: Saeed Mahameed @ 2023-07-26 21:34 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, Paolo Abeni, Eric Dumazet, Saeed Mahameed,
	netdev, Tariq Toukan, Adham Faris, Gal Pressman

On 25 Jul 20:31, Jakub Kicinski wrote:
>On Mon, 24 Jul 2023 15:44:14 -0700 Saeed Mahameed wrote:
>> Expose NIC temperature by implementing the hwmon kernel API, which makes
>> the current thermal zone kernel API redundant.
>>
>> For each one of the supported and exposed thermal diode sensors, expose
>> the following attributes:
>> 1) Input temperature.
>> 2) Highest temperature.
>> 3) Temperature label.
>> 4) Temperature critical max value:
>>    refers to the high threshold of Warning Event. Will be exposed as
>>    `tempY_crit` hwmon attribute (RO attribute). For example, for
>>    ConnectX5 HCAs this temperature value will be 105 Celsius (10
>>    degrees lower than the HW shutdown temperature).
>> 5) Temperature reset history: resets highest temperature.
>>
>> For example, a dual-port ConnectX5 NIC with a single IC thermal diode
>> sensor will have 2 hwmon directories (one for each PCI function)
>> under "/sys/class/hwmon/hwmon[X,Y]".
>>
>> Listing one of the directories above (hwmonX/Y) generates the
>> corresponding output below:
>>
>> $ grep -H -d skip . /sys/class/hwmon/hwmon0/*
>
>I missed it glancing on the series yesterday because it's just
>a warning in pw - we should really get hwmon folks and ML CCed
>on this one.

Ok I will remove this patch from the series and send it separately with the
proper CCs.

>-- 
>pw-bot: cr

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-07-26 21:34 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-24 22:44 [pull request][net-next 00/14] mlx5 updates 2023-07-24 Saeed Mahameed
2023-07-24 22:44 ` [net-next 01/14] net/mlx5: Expose port.c/mlx5_query_module_num() function Saeed Mahameed
2023-07-24 22:44 ` [net-next 02/14] net/mlx5: Expose NIC temperature via hardware monitoring kernel API Saeed Mahameed
2023-07-26  3:31   ` Jakub Kicinski
2023-07-26 21:34     ` Saeed Mahameed
2023-07-24 22:44 ` [net-next 03/14] net/mlx5: Use shared code for checking lag is supported Saeed Mahameed
2023-07-24 22:44 ` [net-next 04/14] net/mlx5: Devcom, Infrastructure changes Saeed Mahameed
2023-07-24 22:44 ` [net-next 05/14] net/mlx5e: E-Switch, Register devcom device with switch id key Saeed Mahameed
2023-07-24 22:44 ` [net-next 06/14] net/mlx5e: E-Switch, Allow devcom initialization on more vports Saeed Mahameed
2023-07-24 22:44 ` [net-next 07/14] net/mlx5: Re-organize mlx5_cmd struct Saeed Mahameed
2023-07-24 22:44 ` [net-next 08/14] net/mlx5: Remove redundant cmdif revision check Saeed Mahameed
2023-07-24 22:44 ` [net-next 09/14] net/mlx5: split mlx5_cmd_init() to probe and reload routines Saeed Mahameed
2023-07-24 22:44 ` [net-next 10/14] net/mlx5: Allocate command stats with xarray Saeed Mahameed
2023-07-24 22:44 ` [net-next 11/14] net/mlx5e: Remove duplicate code for user flow Saeed Mahameed
2023-07-24 22:44 ` [net-next 12/14] net/mlx5e: Make flow classification filters static Saeed Mahameed
2023-07-24 22:44 ` [net-next 13/14] net/mlx5: Don't check vport->enabled in port ops Saeed Mahameed
2023-07-24 22:44 ` [net-next 14/14] net/mlx5: Remove pointless devlink_rate checks Saeed Mahameed
