All of lore.kernel.org
 help / color / mirror / Atom feed
* [pull request][net-next 00/14] mlx5 updates 2021-10-25
@ 2021-10-25 20:54 Saeed Mahameed
  2021-10-25 20:54 ` [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr Saeed Mahameed
                   ` (13 more replies)
  0 siblings, 14 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Saeed Mahameed

From: Saeed Mahameed <saeedm@nvidia.com>

Hi Dave and Jakub,

This series provides some updates to mlx5.
For more information please see tag log below.

Please pull and let me know if there is any problem.

Thanks,
Saeed.

---
The following changes since commit dcd63d4326802cec525de2a4775019849958125c:

  Merge branch 'bluetooth-don-t-write-directly-to-netdev-dev_addr' (2021-10-25 11:01:33 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2021-10-25

for you to fetch changes up to d67ab0a8c130be38b6dda8da3616a97f020ac424:

  net/mlx5: SF_DEV Add SF device trace points (2021-10-25 13:51:21 -0700)

----------------------------------------------------------------
mlx5-updates-2021-10-25

Misc updates for mlx5 driver:

1) Misc updates and cleanups:
 - Don't write directly to netdev->dev_addr, From Jakub Kicinski
 - Remove unnecessary checks for slow path flag in tc module
 - Fix unused function warning of mlx5i_flow_type_mask
 - Bridge, support replacing existing FDB entry

2) Sub Functions, Reduction in memory usage:
 - Reduce flow counters bulk query buffer size
 - Implement max_macs devlink parameter
 - Add devlink vendor params to control Event Queue sizes
 - Added SF life cycle trace points by Parav

3) From Aya, Firmware health buffer reporting improvements
 - Print health buffer by log level and more missing information
 - Periodic update of host time to firmware

----------------------------------------------------------------
Avihai Horon (1):
      net/mlx5: Reduce flow counters bulk query buffer size for SFs

Aya Levin (3):
      net/mlx5: Extend health buffer dump
      net/mlx5: Print health buffer by log level
      net/mlx5: Add periodic update of host time to firmware

Jakub Kicinski (1):
      net/mlx5e: don't write directly to netdev->dev_addr

Parav Pandit (2):
      net/mlx5: SF, Add SF trace points
      net/mlx5: SF_DEV Add SF device trace points

Paul Blakey (1):
      net/mlx5: Remove unnecessary checks for slow path flag

Shay Drory (4):
      net/mlx5: Fix unused function warning of mlx5i_flow_type_mask
      net/mlx5: Let user configure io_eq_size param
      net/mlx5: Let user configure event_eq_size param
      net/mlx5: Let user configure max_macs param

Vlad Buslov (2):
      net/mlx5: Bridge, extract code to lookup and del/notify entry
      net/mlx5: Bridge, support replacing existing FDB entry

 .../device_drivers/ethernet/mellanox/mlx5.rst      |  60 +++++++
 Documentation/networking/devlink/mlx5.rst          |  20 +++
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/devlink.c  |  69 ++++++++
 drivers/net/ethernet/mellanox/mlx5/core/devlink.h  |  12 ++
 .../net/ethernet/mellanox/mlx5/core/devlink_res.c  |  80 ++++++++++
 .../net/ethernet/mellanox/mlx5/core/en/tc/sample.c |  17 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   8 +-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c       |   5 +-
 .../net/ethernet/mellanox/mlx5/core/esw/bridge.c   |  62 ++++----
 .../net/ethernet/mellanox/mlx5/core/fs_counters.c  |   9 +-
 drivers/net/ethernet/mellanox/mlx5/core/health.c   | 126 ++++++++++++---
 .../ethernet/mellanox/mlx5/core/ipoib/ethtool.c    |  10 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |  21 +++
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h    |  24 +++
 .../net/ethernet/mellanox/mlx5/core/sf/dev/dev.c   |  23 ++-
 .../net/ethernet/mellanox/mlx5/core/sf/dev/dev.h   |   1 +
 .../mlx5/core/sf/dev/diag/dev_tracepoint.h         |  58 +++++++
 .../net/ethernet/mellanox/mlx5/core/sf/devlink.c   |   8 +
 .../mellanox/mlx5/core/sf/diag/sf_tracepoint.h     | 173 +++++++++++++++++++++
 .../mellanox/mlx5/core/sf/diag/vhca_tracepoint.h   |  40 +++++
 .../net/ethernet/mellanox/mlx5/core/sf/hw_table.c  |   4 +
 .../ethernet/mellanox/mlx5/core/sf/vhca_event.c    |   3 +
 include/linux/mlx5/device.h                        |  14 +-
 include/linux/mlx5/driver.h                        |   6 +-
 include/linux/mlx5/eq.h                            |   1 -
 include/linux/mlx5/mlx5_ifc.h                      |  24 ++-
 27 files changed, 787 insertions(+), 93 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/diag/dev_tracepoint.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/diag/sf_tracepoint.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/diag/vhca_tracepoint.h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-26 12:30   ` patchwork-bot+netdevbpf
  2021-10-25 20:54 ` [net-next 02/14] net/mlx5: Remove unnecessary checks for slow path flag Saeed Mahameed
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Saeed Mahameed

From: Jakub Kicinski <kuba@kernel.org>

Use a local buffer and eth_hw_addr_set()

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0ff36c83714b..f3dec58026d9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4432,13 +4432,17 @@ void mlx5e_build_nic_params(struct mlx5e_priv *priv, struct mlx5e_xsk *xsk, u16
 static void mlx5e_set_netdev_dev_addr(struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
+	u8 addr[ETH_ALEN];
 
-	mlx5_query_mac_address(priv->mdev, netdev->dev_addr);
-	if (is_zero_ether_addr(netdev->dev_addr) &&
+	mlx5_query_mac_address(priv->mdev, addr);
+	if (is_zero_ether_addr(addr) &&
 	    !MLX5_CAP_GEN(priv->mdev, vport_group_manager)) {
 		eth_hw_addr_random(netdev);
 		mlx5_core_info(priv->mdev, "Assigned random MAC address %pM\n", netdev->dev_addr);
+		return;
 	}
+
+	eth_hw_addr_set(netdev, addr);
 }
 
 static int mlx5e_vxlan_set_port(struct net_device *netdev, unsigned int table,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 02/14] net/mlx5: Remove unnecessary checks for slow path flag
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
  2021-10-25 20:54 ` [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 03/14] net/mlx5: Fix unused function warning of mlx5i_flow_type_mask Saeed Mahameed
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Paul Blakey, Maor Dickman, Saeed Mahameed

From: Paul Blakey <paulb@nvidia.com>

After previous changes, caller (mlx5e_tc_offload_fdb_rules()) already
checks for the slow path flag, and if set won't call offload/unoffload
sample.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/en/tc/sample.c  | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c
index d1d7e4b9f7ad..1046b7ea5c88 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c
@@ -509,13 +509,6 @@ mlx5e_tc_sample_offload(struct mlx5e_tc_psample *tc_psample,
 	if (IS_ERR_OR_NULL(tc_psample))
 		return ERR_PTR(-EOPNOTSUPP);
 
-	/* If slow path flag is set, eg. when the neigh is invalid for encap,
-	 * don't offload sample action.
-	 */
-	esw = tc_psample->esw;
-	if (attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH)
-		return mlx5_eswitch_add_offloaded_rule(esw, spec, attr);
-
 	sample_flow = kzalloc(sizeof(*sample_flow), GFP_KERNEL);
 	if (!sample_flow)
 		return ERR_PTR(-ENOMEM);
@@ -527,6 +520,7 @@ mlx5e_tc_sample_offload(struct mlx5e_tc_psample *tc_psample,
 	 * Only match the fte id instead of the same match in the
 	 * original flow table.
 	 */
+	esw = tc_psample->esw;
 	if (MLX5_CAP_GEN(esw->dev, reg_c_preserve) ||
 	    attr->action & MLX5_FLOW_CONTEXT_ACTION_DECAP) {
 		struct mlx5_flow_table *ft;
@@ -634,15 +628,6 @@ mlx5e_tc_sample_unoffload(struct mlx5e_tc_psample *tc_psample,
 	if (IS_ERR_OR_NULL(tc_psample))
 		return;
 
-	/* If slow path flag is set, sample action is not offloaded.
-	 * No need to delete sample rule.
-	 */
-	esw = tc_psample->esw;
-	if (attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH) {
-		mlx5_eswitch_del_offloaded_rule(esw, rule, attr);
-		return;
-	}
-
 	/* The following delete order can't be changed, otherwise,
 	 * will hit fw syndromes.
 	 */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 03/14] net/mlx5: Fix unused function warning of mlx5i_flow_type_mask
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
  2021-10-25 20:54 ` [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr Saeed Mahameed
  2021-10-25 20:54 ` [net-next 02/14] net/mlx5: Remove unnecessary checks for slow path flag Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 04/14] net/mlx5: Reduce flow counters bulk query buffer size for SFs Saeed Mahameed
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Shay Drory, Saeed Mahameed

From: Shay Drory <shayd@nvidia.com>

The cited commit is causing unused-function warning[1] when
CONFIG_MLX5_EN_RXNFC is not set.
Fix this by moving the function into the ifdef, where it's only used

[1]
warning: ‘mlx5i_flow_type_mask’ defined but not used [-Wunused-function]

Fixes: 9fbe1c25ecca ("net/mlx5i: Enable Rx steering for IPoIB via ethtool")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c    | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
index ee0eb4a4b819..962d41418ce7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
@@ -33,11 +33,6 @@
 #include "en.h"
 #include "ipoib.h"
 
-static u32 mlx5i_flow_type_mask(u32 flow_type)
-{
-	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
-}
-
 static void mlx5i_get_drvinfo(struct net_device *dev,
 			      struct ethtool_drvinfo *drvinfo)
 {
@@ -223,6 +218,11 @@ static int mlx5i_get_link_ksettings(struct net_device *netdev,
 }
 
 #ifdef CONFIG_MLX5_EN_RXNFC
+static u32 mlx5i_flow_type_mask(u32 flow_type)
+{
+	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+}
+
 static int mlx5i_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 {
 	struct mlx5e_priv *priv = mlx5i_epriv(dev);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 04/14] net/mlx5: Reduce flow counters bulk query buffer size for SFs
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 03/14] net/mlx5: Fix unused function warning of mlx5i_flow_type_mask Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 05/14] net/mlx5: Extend health buffer dump Saeed Mahameed
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Avihai Horon, Mark Bloch, Saeed Mahameed

From: Avihai Horon <avihaih@nvidia.com>

Currently, the flow counters bulk query buffer takes a little more than
512KB of memory, which is aligned to the next power of 2, to 1MB.

The buffer size determines the maximum number of flow counters that can
be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.

SFs don't use many flow counters and don't need a big buffer. Since this
size is critical with large scale, reduce the size of the bulk query
buffer for SFs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index f542a36be62c..60c9df1bc912 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -40,6 +40,7 @@
 #define MLX5_FC_STATS_PERIOD msecs_to_jiffies(1000)
 /* Max number of counters to query in bulk read is 32K */
 #define MLX5_SW_MAX_COUNTERS_BULK BIT(15)
+#define MLX5_SF_NUM_COUNTERS_BULK 6
 #define MLX5_FC_POOL_MAX_THRESHOLD BIT(18)
 #define MLX5_FC_POOL_USED_BUFF_RATIO 10
 
@@ -146,8 +147,12 @@ static void mlx5_fc_stats_remove(struct mlx5_core_dev *dev,
 
 static int get_max_bulk_query_len(struct mlx5_core_dev *dev)
 {
-	return min_t(int, MLX5_SW_MAX_COUNTERS_BULK,
-			  (1 << MLX5_CAP_GEN(dev, log_max_flow_counter_bulk)));
+	int num_counters_bulk = mlx5_core_is_sf(dev) ?
+					MLX5_SF_NUM_COUNTERS_BULK :
+					MLX5_SW_MAX_COUNTERS_BULK;
+
+	return min_t(int, num_counters_bulk,
+		     (1 << MLX5_CAP_GEN(dev, log_max_flow_counter_bulk)));
 }
 
 static void update_counter_cache(int index, u32 *bulk_raw_data,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 05/14] net/mlx5: Extend health buffer dump
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 04/14] net/mlx5: Reduce flow counters bulk query buffer size for SFs Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 06/14] net/mlx5: Print health buffer by log level Saeed Mahameed
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Aya Levin, Moshe Shemesh, Saeed Mahameed

From: Aya Levin <ayal@nvidia.com>

Enhance health buffer to include:
 - assert_var5: expose the 6'th assert variable.
 - time: error's time-stamp in seconds (epoch time).
 - rfr: Recovery Flow Requiered. When set, indicates that the error
        cannot be recovered without flow involving reset.
 - severity: error's severity value, ranging from emergency to debug.
Expose them in the health buffer dump (dmesg and devlink fw reporter).

Health buffer in dmesg:
mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR:
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800
mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c
mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492
mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758
mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d
mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0
mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR)
mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9
mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error
mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b
mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 73 +++++++++++++++++--
 include/linux/mlx5/device.h                   | 14 ++--
 include/linux/mlx5/mlx5_ifc.h                 | 10 ++-
 3 files changed, 82 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 6a4dd7f78958..538ef392f54c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -36,6 +36,7 @@
 #include <linux/vmalloc.h>
 #include <linux/hardirq.h>
 #include <linux/mlx5/driver.h>
+#include <linux/kern_levels.h>
 #include "mlx5_core.h"
 #include "lib/eq.h"
 #include "lib/mlx5.h"
@@ -74,6 +75,11 @@ enum  {
 	MLX5_SENSOR_FW_SYND_RFR		= 5,
 };
 
+enum {
+	MLX5_SEVERITY_MASK		= 0x7,
+	MLX5_SEVERITY_VALID_MASK	= 0x8,
+};
+
 u8 mlx5_get_nic_state(struct mlx5_core_dev *dev)
 {
 	return (ioread32be(&dev->iseg->cmdq_addr_l_sz) >> 8) & 7;
@@ -98,12 +104,19 @@ static bool sensor_pci_not_working(struct mlx5_core_dev *dev)
 	return (ioread32be(&h->fw_ver) == 0xffffffff);
 }
 
+static int mlx5_health_get_rfr(u8 rfr_severity)
+{
+	return rfr_severity >> MLX5_RFR_BIT_OFFSET;
+}
+
 static bool sensor_fw_synd_rfr(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct health_buffer __iomem *h = health->health;
-	u32 rfr = ioread32be(&h->rfr) >> MLX5_RFR_OFFSET;
 	u8 synd = ioread8(&h->synd);
+	u8 rfr;
+
+	rfr = mlx5_health_get_rfr(ioread8(&h->rfr_severity));
 
 	if (rfr && synd)
 		mlx5_core_dbg(dev, "FW requests reset, synd: %d\n", synd);
@@ -366,18 +379,52 @@ static const char *hsynd_str(u8 synd)
 	}
 }
 
+static const char *mlx5_loglevel_str(int level)
+{
+	switch (level) {
+	case LOGLEVEL_EMERG:
+		return "EMERGENCY";
+	case LOGLEVEL_ALERT:
+		return "ALERT";
+	case LOGLEVEL_CRIT:
+		return "CRITICAL";
+	case LOGLEVEL_ERR:
+		return "ERROR";
+	case LOGLEVEL_WARNING:
+		return "WARNING";
+	case LOGLEVEL_NOTICE:
+		return "NOTICE";
+	case LOGLEVEL_INFO:
+		return "INFO";
+	case LOGLEVEL_DEBUG:
+		return "DEBUG";
+	}
+	return "Unknown log level";
+}
+
+static int mlx5_health_get_severity(u8 rfr_severity)
+{
+	return rfr_severity & MLX5_SEVERITY_VALID_MASK ?
+	       rfr_severity & MLX5_SEVERITY_MASK : LOGLEVEL_ERR;
+}
+
 static void print_health_info(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct health_buffer __iomem *h = health->health;
-	char fw_str[18];
-	u32 fw;
+	u8 rfr_severity;
+	int severity;
 	int i;
 
 	/* If the syndrome is 0, the device is OK and no need to print buffer */
 	if (!ioread8(&h->synd))
 		return;
 
+	rfr_severity = ioread8(&h->rfr_severity);
+	severity  = mlx5_health_get_severity(rfr_severity);
+	mlx5_core_err(dev, "Health issue observed, %s, severity(%d) %s:\n",
+		      hsynd_str(ioread8(&h->synd)), severity, mlx5_loglevel_str(severity));
+
 	for (i = 0; i < ARRAY_SIZE(h->assert_var); i++)
 		mlx5_core_err(dev, "assert_var[%d] 0x%08x\n", i,
 			      ioread32be(h->assert_var + i));
@@ -386,15 +433,16 @@ static void print_health_info(struct mlx5_core_dev *dev)
 		      ioread32be(&h->assert_exit_ptr));
 	mlx5_core_err(dev, "assert_callra 0x%08x\n",
 		      ioread32be(&h->assert_callra));
-	sprintf(fw_str, "%d.%d.%d", fw_rev_maj(dev), fw_rev_min(dev), fw_rev_sub(dev));
-	mlx5_core_err(dev, "fw_ver %s\n", fw_str);
+	mlx5_core_err(dev, "fw_ver %d.%d.%d", fw_rev_maj(dev), fw_rev_min(dev), fw_rev_sub(dev));
+	mlx5_core_err(dev, "time %u\n", ioread32be(&h->time));
 	mlx5_core_err(dev, "hw_id 0x%08x\n", ioread32be(&h->hw_id));
+	mlx5_core_err(dev, "rfr %d\n", mlx5_health_get_rfr(rfr_severity));
+	mlx5_core_err(dev, "severity %d (%s)\n", severity, mlx5_loglevel_str(severity));
 	mlx5_core_err(dev, "irisc_index %d\n", ioread8(&h->irisc_index));
 	mlx5_core_err(dev, "synd 0x%x: %s\n", ioread8(&h->synd),
 		      hsynd_str(ioread8(&h->synd)));
 	mlx5_core_err(dev, "ext_synd 0x%04x\n", ioread16be(&h->ext_synd));
-	fw = ioread32be(&h->fw_ver);
-	mlx5_core_err(dev, "raw fw_ver 0x%08x\n", fw);
+	mlx5_core_err(dev, "raw fw_ver 0x%08x\n", ioread32be(&h->fw_ver));
 }
 
 static int
@@ -443,6 +491,7 @@ mlx5_fw_reporter_heath_buffer_data_put(struct mlx5_core_dev *dev,
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 	struct health_buffer __iomem *h = health->health;
+	u8 rfr_severity;
 	int err;
 	int i;
 
@@ -473,9 +522,19 @@ mlx5_fw_reporter_heath_buffer_data_put(struct mlx5_core_dev *dev,
 		return err;
 	err = devlink_fmsg_u32_pair_put(fmsg, "assert_callra",
 					ioread32be(&h->assert_callra));
+	if (err)
+		return err;
+	err = devlink_fmsg_u32_pair_put(fmsg, "time", ioread32be(&h->time));
 	if (err)
 		return err;
 	err = devlink_fmsg_u32_pair_put(fmsg, "hw_id", ioread32be(&h->hw_id));
+	if (err)
+		return err;
+	rfr_severity = ioread8(&h->rfr_severity);
+	err = devlink_fmsg_u8_pair_put(fmsg, "rfr", mlx5_health_get_rfr(rfr_severity));
+	if (err)
+		return err;
+	err = devlink_fmsg_u8_pair_put(fmsg, "severity", mlx5_health_get_severity(rfr_severity));
 	if (err)
 		return err;
 	err = devlink_fmsg_u8_pair_put(fmsg, "irisc_index",
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 347167c18802..f8a0bbb42c3b 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -541,19 +541,21 @@ struct mlx5_cmd_layout {
 	u8		status_own;
 };
 
-enum mlx5_fatal_assert_bit_offsets {
-	MLX5_RFR_OFFSET = 31,
+enum mlx5_rfr_severity_bit_offsets {
+	MLX5_RFR_BIT_OFFSET = 0x7,
 };
 
 struct health_buffer {
-	__be32		assert_var[5];
-	__be32		rsvd0[3];
+	__be32		assert_var[6];
+	__be32		rsvd0[2];
 	__be32		assert_exit_ptr;
 	__be32		assert_callra;
-	__be32		rsvd1[2];
+	__be32		rsvd1[1];
+	__be32		time;
 	__be32		fw_ver;
 	__be32		hw_id;
-	__be32		rfr;
+	u8		rfr_severity;
+	u8		rsvd2[3];
 	u8		irisc_index;
 	u8		synd;
 	__be16		ext_synd;
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 09e43019d877..6d292b5b8992 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -4149,13 +4149,19 @@ struct mlx5_ifc_health_buffer_bits {
 
 	u8         assert_callra[0x20];
 
-	u8         reserved_at_140[0x40];
+	u8         reserved_at_140[0x20];
+
+	u8         time[0x20];
 
 	u8         fw_version[0x20];
 
 	u8         hw_id[0x20];
 
-	u8         reserved_at_1c0[0x20];
+	u8         rfr[0x1];
+	u8         reserved_at_1c1[0x3];
+	u8         valid[0x1];
+	u8         severity[0x3];
+	u8         reserved_at_1c8[0x18];
 
 	u8         irisc_index[0x8];
 	u8         synd[0x8];
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 06/14] net/mlx5: Print health buffer by log level
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 05/14] net/mlx5: Extend health buffer dump Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 07/14] net/mlx5: Add periodic update of host time to firmware Saeed Mahameed
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Aya Levin, Moshe Shemesh, Saeed Mahameed

From: Aya Levin <ayal@nvidia.com>

Add log macro which gets log level as a parameter. Use the severity
read from the health buffer and the new log macro to log the health buffer
with severity as log level.  Prior to this patch, health buffer was
printed in error log level regardless of its severity. Now the user may
filter dmesg (--level) or change kernel log level to focus on different
severity levels of firmware errors.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst |  2 +
 .../net/ethernet/mellanox/mlx5/core/health.c  | 37 +++++++++----------
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   | 24 ++++++++++++
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index 4b59cf2c599f..2ee74a49be9d 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -543,6 +543,8 @@ The CR-space dump uses vsc interface which is valid even if the FW command
 interface is not functional, which is the case in most FW fatal errors.
 The recover function runs recover flow which reloads the driver and triggers fw
 reset if needed.
+On firmware error, the health buffer is dumped into the dmesg. The log
+level is derived from the error's severity (given in health buffer).
 
 User commands examples:
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 538ef392f54c..c35a27255232 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -422,27 +422,26 @@ static void print_health_info(struct mlx5_core_dev *dev)
 
 	rfr_severity = ioread8(&h->rfr_severity);
 	severity  = mlx5_health_get_severity(rfr_severity);
-	mlx5_core_err(dev, "Health issue observed, %s, severity(%d) %s:\n",
-		      hsynd_str(ioread8(&h->synd)), severity, mlx5_loglevel_str(severity));
+	mlx5_log(dev, severity, "Health issue observed, %s, severity(%d) %s:\n",
+		 hsynd_str(ioread8(&h->synd)), severity, mlx5_loglevel_str(severity));
 
 	for (i = 0; i < ARRAY_SIZE(h->assert_var); i++)
-		mlx5_core_err(dev, "assert_var[%d] 0x%08x\n", i,
-			      ioread32be(h->assert_var + i));
-
-	mlx5_core_err(dev, "assert_exit_ptr 0x%08x\n",
-		      ioread32be(&h->assert_exit_ptr));
-	mlx5_core_err(dev, "assert_callra 0x%08x\n",
-		      ioread32be(&h->assert_callra));
-	mlx5_core_err(dev, "fw_ver %d.%d.%d", fw_rev_maj(dev), fw_rev_min(dev), fw_rev_sub(dev));
-	mlx5_core_err(dev, "time %u\n", ioread32be(&h->time));
-	mlx5_core_err(dev, "hw_id 0x%08x\n", ioread32be(&h->hw_id));
-	mlx5_core_err(dev, "rfr %d\n", mlx5_health_get_rfr(rfr_severity));
-	mlx5_core_err(dev, "severity %d (%s)\n", severity, mlx5_loglevel_str(severity));
-	mlx5_core_err(dev, "irisc_index %d\n", ioread8(&h->irisc_index));
-	mlx5_core_err(dev, "synd 0x%x: %s\n", ioread8(&h->synd),
-		      hsynd_str(ioread8(&h->synd)));
-	mlx5_core_err(dev, "ext_synd 0x%04x\n", ioread16be(&h->ext_synd));
-	mlx5_core_err(dev, "raw fw_ver 0x%08x\n", ioread32be(&h->fw_ver));
+		mlx5_log(dev, severity, "assert_var[%d] 0x%08x\n", i,
+			 ioread32be(h->assert_var + i));
+
+	mlx5_log(dev, severity, "assert_exit_ptr 0x%08x\n", ioread32be(&h->assert_exit_ptr));
+	mlx5_log(dev, severity, "assert_callra 0x%08x\n", ioread32be(&h->assert_callra));
+	mlx5_log(dev, severity, "fw_ver %d.%d.%d", fw_rev_maj(dev), fw_rev_min(dev),
+		 fw_rev_sub(dev));
+	mlx5_log(dev, severity, "time %u\n", ioread32be(&h->time));
+	mlx5_log(dev, severity, "hw_id 0x%08x\n", ioread32be(&h->hw_id));
+	mlx5_log(dev, severity, "rfr %d\n", mlx5_health_get_rfr(rfr_severity));
+	mlx5_log(dev, severity, "severity %d (%s)\n", severity, mlx5_loglevel_str(severity));
+	mlx5_log(dev, severity, "irisc_index %d\n", ioread8(&h->irisc_index));
+	mlx5_log(dev, severity, "synd 0x%x: %s\n", ioread8(&h->synd),
+		 hsynd_str(ioread8(&h->synd)));
+	mlx5_log(dev, severity, "ext_synd 0x%04x\n", ioread16be(&h->ext_synd));
+	mlx5_log(dev, severity, "raw fw_ver 0x%08x\n", ioread32be(&h->fw_ver));
 }
 
 static int
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 230eab7e3bc9..bb677329ea08 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -97,6 +97,30 @@ do {								\
 			     __func__, __LINE__, current->pid,	\
 			     ##__VA_ARGS__)
 
+static inline void mlx5_printk(struct mlx5_core_dev *dev, int level, const char *format, ...)
+{
+	struct device *device = dev->device;
+	struct va_format vaf;
+	va_list args;
+
+	if (WARN_ONCE(level < LOGLEVEL_EMERG || level > LOGLEVEL_DEBUG,
+		      "Level %d is out of range, set to default level\n", level))
+		level = LOGLEVEL_DEFAULT;
+
+	va_start(args, format);
+	vaf.fmt = format;
+	vaf.va = &args;
+
+	dev_printk_emit(level, device, "%s %s: %pV", dev_driver_string(device), dev_name(device),
+			&vaf);
+	va_end(args);
+}
+
+#define mlx5_log(__dev, level, format, ...)			\
+	mlx5_printk(__dev, level, "%s:%d:(pid %d): " format,	\
+		    __func__, __LINE__, current->pid,		\
+		    ##__VA_ARGS__)
+
 static inline struct device *mlx5_core_dma_dev(struct mlx5_core_dev *dev)
 {
 	return &dev->pdev->dev;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 07/14] net/mlx5: Add periodic update of host time to firmware
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 06/14] net/mlx5: Print health buffer by log level Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 08/14] net/mlx5: Bridge, extract code to lookup and del/notify entry Saeed Mahameed
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Aya Levin, Moshe Shemesh, Saeed Mahameed

From: Aya Levin <ayal@nvidia.com>

Firmware logs its asserts also to non-volatile memory. In order to
reduce drift between the NIC and the host, the driver sets the host
epoch-time to the firmware every hour.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/health.c  | 30 +++++++++++++++++++
 include/linux/mlx5/driver.h                   |  2 ++
 include/linux/mlx5/mlx5_ifc.h                 | 12 ++++++++
 3 files changed, 44 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index c35a27255232..64f1abc4dc36 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -752,6 +752,31 @@ void mlx5_trigger_health_work(struct mlx5_core_dev *dev)
 	spin_unlock_irqrestore(&health->wq_lock, flags);
 }
 
+#define MLX5_MSEC_PER_HOUR (MSEC_PER_SEC * 60 * 60)
+static void mlx5_health_log_ts_update(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	u32 out[MLX5_ST_SZ_DW(mrtc_reg)] = {};
+	u32 in[MLX5_ST_SZ_DW(mrtc_reg)] = {};
+	struct mlx5_core_health *health;
+	struct mlx5_core_dev *dev;
+	struct mlx5_priv *priv;
+	u64 now_us;
+
+	health = container_of(dwork, struct mlx5_core_health, update_fw_log_ts_work);
+	priv = container_of(health, struct mlx5_priv, health);
+	dev = container_of(priv, struct mlx5_core_dev, priv);
+
+	now_us =  ktime_to_us(ktime_get_real());
+
+	MLX5_SET(mrtc_reg, in, time_h, now_us >> 32);
+	MLX5_SET(mrtc_reg, in, time_l, now_us & 0xFFFFFFFF);
+	mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out), MLX5_REG_MRTC, 0, 1);
+
+	queue_delayed_work(health->wq, &health->update_fw_log_ts_work,
+			   msecs_to_jiffies(MLX5_MSEC_PER_HOUR));
+}
+
 static void poll_health(struct timer_list *t)
 {
 	struct mlx5_core_dev *dev = from_timer(dev, t, priv.health.timer);
@@ -834,6 +859,7 @@ void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
 	spin_lock_irqsave(&health->wq_lock, flags);
 	set_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags);
 	spin_unlock_irqrestore(&health->wq_lock, flags);
+	cancel_delayed_work_sync(&health->update_fw_log_ts_work);
 	cancel_work_sync(&health->report_work);
 	cancel_work_sync(&health->fatal_report_work);
 }
@@ -849,6 +875,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev)
 {
 	struct mlx5_core_health *health = &dev->priv.health;
 
+	cancel_delayed_work_sync(&health->update_fw_log_ts_work);
 	destroy_workqueue(health->wq);
 	mlx5_fw_reporters_destroy(dev);
 }
@@ -874,6 +901,9 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	spin_lock_init(&health->wq_lock);
 	INIT_WORK(&health->fatal_report_work, mlx5_fw_fatal_reporter_err_work);
 	INIT_WORK(&health->report_work, mlx5_fw_reporter_err_work);
+	INIT_DELAYED_WORK(&health->update_fw_log_ts_work, mlx5_health_log_ts_update);
+	if (mlx5_core_is_pf(dev))
+		queue_delayed_work(health->wq, &health->update_fw_log_ts_work, 0);
 
 	return 0;
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 3f4c0f2314a5..f617dfbcd9fd 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -134,6 +134,7 @@ enum {
 	MLX5_REG_MCIA		 = 0x9014,
 	MLX5_REG_MFRL		 = 0x9028,
 	MLX5_REG_MLCR		 = 0x902b,
+	MLX5_REG_MRTC		 = 0x902d,
 	MLX5_REG_MTRC_CAP	 = 0x9040,
 	MLX5_REG_MTRC_CONF	 = 0x9041,
 	MLX5_REG_MTRC_STDB	 = 0x9042,
@@ -440,6 +441,7 @@ struct mlx5_core_health {
 	struct work_struct		report_work;
 	struct devlink_health_reporter *fw_reporter;
 	struct devlink_health_reporter *fw_fatal_reporter;
+	struct delayed_work		update_fw_log_ts_work;
 };
 
 struct mlx5_qp_table {
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 6d292b5b8992..746381eccccf 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -10358,6 +10358,17 @@ struct mlx5_ifc_pddr_reg_bits {
 	union mlx5_ifc_pddr_reg_page_data_auto_bits page_data;
 };
 
+struct mlx5_ifc_mrtc_reg_bits {
+	u8         time_synced[0x1];
+	u8         reserved_at_1[0x1f];
+
+	u8         reserved_at_20[0x20];
+
+	u8         time_h[0x20];
+
+	u8         time_l[0x20];
+};
+
 union mlx5_ifc_ports_control_registers_document_bits {
 	struct mlx5_ifc_bufferx_reg_bits bufferx_reg;
 	struct mlx5_ifc_eth_2819_cntrs_grp_data_layout_bits eth_2819_cntrs_grp_data_layout;
@@ -10419,6 +10430,7 @@ union mlx5_ifc_ports_control_registers_document_bits {
 	struct mlx5_ifc_mirc_reg_bits mirc_reg;
 	struct mlx5_ifc_mfrl_reg_bits mfrl_reg;
 	struct mlx5_ifc_mtutc_reg_bits mtutc_reg;
+	struct mlx5_ifc_mrtc_reg_bits mrtc_reg;
 	u8         reserved_at_0[0x60e0];
 };
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 08/14] net/mlx5: Bridge, extract code to lookup and del/notify entry
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 07/14] net/mlx5: Add periodic update of host time to firmware Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 09/14] net/mlx5: Bridge, support replacing existing FDB entry Saeed Mahameed
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Paul Blakey, Roi Dayan, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

Following two patterns in bridge code are used in multiple places where
similar code is duplicated:

- Lookup FDB entry from hashtable by address+vid pair.

- Notify software bridge and then delete existing FDB entry.

In order to improve code quality and prepare for following patch series
that also uses described patterns, extract the codes to dedicated helper
functions.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../ethernet/mellanox/mlx5/core/esw/bridge.c  | 58 ++++++++++---------
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 588622ba38c1..33d1d2ed4cd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -888,14 +888,20 @@ mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
 	kvfree(entry);
 }
 
+static void
+mlx5_esw_bridge_fdb_entry_notify_and_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
+					     struct mlx5_esw_bridge *bridge)
+{
+	mlx5_esw_bridge_fdb_del_notify(entry);
+	mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+}
+
 static void mlx5_esw_bridge_fdb_flush(struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
 
-	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
-		mlx5_esw_bridge_fdb_del_notify(entry);
-		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
-	}
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list)
+		mlx5_esw_bridge_fdb_entry_notify_and_cleanup(entry, bridge);
 }
 
 static struct mlx5_esw_bridge_vlan *
@@ -1065,10 +1071,8 @@ static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_vlan *vlan,
 	struct mlx5_eswitch *esw = bridge->br_offloads->esw;
 	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
 
-	list_for_each_entry_safe(entry, tmp, &vlan->fdb_list, vlan_list) {
-		mlx5_esw_bridge_fdb_del_notify(entry);
-		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
-	}
+	list_for_each_entry_safe(entry, tmp, &vlan->fdb_list, vlan_list)
+		mlx5_esw_bridge_fdb_entry_notify_and_cleanup(entry, bridge);
 
 	if (vlan->pkt_reformat_pop)
 		mlx5_esw_bridge_vlan_pop_cleanup(vlan, esw);
@@ -1127,6 +1131,17 @@ mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, u16 esw_owner_vhca_id,
 	return vlan;
 }
 
+static struct mlx5_esw_bridge_fdb_entry *
+mlx5_esw_bridge_fdb_lookup(struct mlx5_esw_bridge *bridge,
+			   const unsigned char *addr, u16 vid)
+{
+	struct mlx5_esw_bridge_fdb_key key = {};
+
+	ether_addr_copy(key.addr, addr);
+	key.vid = vid;
+	return rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
+}
+
 static struct mlx5_esw_bridge_fdb_entry *
 mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
 			       const unsigned char *addr, u16 vid, bool added_by_user, bool peer,
@@ -1444,7 +1459,6 @@ void mlx5_esw_bridge_fdb_update_used(struct net_device *dev, u16 vport_num, u16
 				     struct switchdev_notifier_fdb_info *fdb_info)
 {
 	struct mlx5_esw_bridge_fdb_entry *entry;
-	struct mlx5_esw_bridge_fdb_key key;
 	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;
 
@@ -1453,13 +1467,11 @@ void mlx5_esw_bridge_fdb_update_used(struct net_device *dev, u16 vport_num, u16
 		return;
 
 	bridge = port->bridge;
-	ether_addr_copy(key.addr, fdb_info->addr);
-	key.vid = fdb_info->vid;
-	entry = rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
+	entry = mlx5_esw_bridge_fdb_lookup(bridge, fdb_info->addr, fdb_info->vid);
 	if (!entry) {
 		esw_debug(br_offloads->esw->dev,
 			  "FDB entry with specified key not found (MAC=%pM,vid=%u,vport=%u)\n",
-			  key.addr, key.vid, vport_num);
+			  fdb_info->addr, fdb_info->vid, vport_num);
 		return;
 	}
 
@@ -1501,7 +1513,6 @@ void mlx5_esw_bridge_fdb_remove(struct net_device *dev, u16 vport_num, u16 esw_o
 {
 	struct mlx5_eswitch *esw = br_offloads->esw;
 	struct mlx5_esw_bridge_fdb_entry *entry;
-	struct mlx5_esw_bridge_fdb_key key;
 	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;
 
@@ -1510,18 +1521,15 @@ void mlx5_esw_bridge_fdb_remove(struct net_device *dev, u16 vport_num, u16 esw_o
 		return;
 
 	bridge = port->bridge;
-	ether_addr_copy(key.addr, fdb_info->addr);
-	key.vid = fdb_info->vid;
-	entry = rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
+	entry = mlx5_esw_bridge_fdb_lookup(bridge, fdb_info->addr, fdb_info->vid);
 	if (!entry) {
 		esw_warn(esw->dev,
 			 "FDB entry with specified key not found (MAC=%pM,vid=%u,vport=%u)\n",
-			 key.addr, key.vid, vport_num);
+			 fdb_info->addr, fdb_info->vid, vport_num);
 		return;
 	}
 
-	mlx5_esw_bridge_fdb_del_notify(entry);
-	mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+	mlx5_esw_bridge_fdb_entry_notify_and_cleanup(entry, bridge);
 }
 
 void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
@@ -1537,13 +1545,11 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 			if (entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER)
 				continue;
 
-			if (time_after(lastuse, entry->lastuse)) {
+			if (time_after(lastuse, entry->lastuse))
 				mlx5_esw_bridge_fdb_entry_refresh(entry);
-			} else if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_PEER) &&
-				   time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
-				mlx5_esw_bridge_fdb_del_notify(entry);
-				mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
-			}
+			else if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_PEER) &&
+				 time_is_before_jiffies(entry->lastuse + bridge->ageing_time))
+				mlx5_esw_bridge_fdb_entry_notify_and_cleanup(entry, bridge);
 		}
 	}
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 09/14] net/mlx5: Bridge, support replacing existing FDB entry
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 08/14] net/mlx5: Bridge, extract code to lookup and del/notify entry Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 10/14] net/mlx5: Let user configure io_eq_size param Saeed Mahameed
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Vlad Buslov, Paul Blakey, Roi Dayan, Saeed Mahameed

From: Vlad Buslov <vladbu@nvidia.com>

The SWITCHDEV_FDB_ADD_TO_DEVICE is used for both adding new and replacing
existing entry. Implement support for replacing existing FDB entries in
mlx5 offload code.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 33d1d2ed4cd6..f690f430f40f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -1160,6 +1160,10 @@ mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, u16 esw_ow
 			return ERR_CAST(vlan);
 	}
 
+	entry = mlx5_esw_bridge_fdb_lookup(bridge, addr, vid);
+	if (entry)
+		mlx5_esw_bridge_fdb_entry_notify_and_cleanup(entry, bridge);
+
 	entry = kvzalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
 		return ERR_PTR(-ENOMEM);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 10/14] net/mlx5: Let user configure io_eq_size param
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 09/14] net/mlx5: Bridge, support replacing existing FDB entry Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-26 15:05   ` Jakub Kicinski
  2021-10-25 20:54 ` [net-next 11/14] net/mlx5: Let user configure event_eq_size param Saeed Mahameed
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Shay Drory, Moshe Shemesh, Parav Pandit, Saeed Mahameed

From: Shay Drory <shayd@nvidia.com>

Currently, each I/O EQ is taking 128KB of memory. This size
is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the size of I/O EQs.

For example, to reduce I/O EQ size to 64, execute:
$ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 Documentation/networking/devlink/mlx5.rst     | 12 ++++
 .../net/ethernet/mellanox/mlx5/core/Makefile  |  2 +-
 .../net/ethernet/mellanox/mlx5/core/devlink.h | 11 ++++
 .../ethernet/mellanox/mlx5/core/devlink_res.c | 56 +++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |  3 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  3 +
 include/linux/mlx5/driver.h                   |  4 --
 7 files changed, 85 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c

diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 4e4b97f7971a..4e6020570292 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -46,6 +46,18 @@ parameters.
 
 The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD``
 
+Resources
+=========
+
+.. list-table:: Driver-specific resources implemented
+   :widths: 5 5 5 85
+
+   * - Name
+     - Description
+   * - ``comp_eq_size``
+     - Control the size of I/O completion EQs.
+       * The default value is 1024, and the range is between 64 and 4096.
+
 Info versions
 =============
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index bdb271b604d9..79c15ee62cde 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -16,7 +16,7 @@ mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
 		fs_counters.o fs_ft_pool.o rl.o lag/lag.o dev.o events.o wq.o lib/gid.o \
 		lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \
-		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o \
+		diag/fw_tracer.o diag/crdump.o devlink.o devlink_res.o diag/rsc_dump.o \
 		fw_reset.o qos.o lib/tout.o
 
 #
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
index 30bf4882779b..4192f23b1446 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
@@ -6,6 +6,13 @@
 
 #include <net/devlink.h>
 
+enum mlx5_devlink_resource_id {
+	MLX5_DL_RES_COMP_EQ = 1,
+
+	__MLX5_ID_RES_MAX,
+	MLX5_ID_RES_MAX = __MLX5_ID_RES_MAX - 1,
+};
+
 enum mlx5_devlink_param_id {
 	MLX5_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX,
 	MLX5_DEVLINK_PARAM_ID_FLOW_STEERING_MODE,
@@ -31,6 +38,10 @@ int mlx5_devlink_trap_get_num_active(struct mlx5_core_dev *dev);
 int mlx5_devlink_traps_get_action(struct mlx5_core_dev *dev, int trap_id,
 				  enum devlink_trap_action *action);
 
+void mlx5_devlink_res_register(struct mlx5_core_dev *dev);
+void mlx5_devlink_res_unregister(struct mlx5_core_dev *dev);
+size_t mlx5_devlink_res_size(struct mlx5_core_dev *dev, enum mlx5_devlink_resource_id id);
+
 struct devlink *mlx5_devlink_alloc(struct device *dev);
 void mlx5_devlink_free(struct devlink *devlink);
 int mlx5_devlink_register(struct devlink *devlink);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c
new file mode 100644
index 000000000000..3beedfb8534a
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. */
+
+#include "devlink.h"
+#include "mlx5_core.h"
+
+enum {
+	MLX5_EQ_MIN_SIZE = 64,
+	MLX5_EQ_MAX_SIZE = 4096,
+	MLX5_COMP_EQ_SIZE = 1024,
+};
+
+static int comp_eq_res_register(struct mlx5_core_dev *dev)
+{
+	struct devlink_resource_size_params comp_eq_size;
+	struct devlink *devlink = priv_to_devlink(dev);
+
+	devlink_resource_size_params_init(&comp_eq_size, MLX5_EQ_MIN_SIZE,
+					  MLX5_EQ_MAX_SIZE, 1, DEVLINK_RESOURCE_UNIT_ENTRY);
+	return devlink_resource_register(devlink, "io_eq_size", MLX5_COMP_EQ_SIZE,
+					 MLX5_DL_RES_COMP_EQ,
+					 DEVLINK_RESOURCE_ID_PARENT_TOP,
+					 &comp_eq_size);
+}
+
+void mlx5_devlink_res_register(struct mlx5_core_dev *dev)
+{
+	int err;
+
+	err = comp_eq_res_register(dev);
+	if (err)
+		mlx5_core_err(dev, "Failed to register resources, err = %d\n", err);
+}
+
+void mlx5_devlink_res_unregister(struct mlx5_core_dev *dev)
+{
+	devlink_resources_unregister(priv_to_devlink(dev), NULL);
+}
+
+static const size_t default_vals[MLX5_ID_RES_MAX + 1] = {
+	[MLX5_DL_RES_COMP_EQ] = MLX5_COMP_EQ_SIZE,
+};
+
+size_t mlx5_devlink_res_size(struct mlx5_core_dev *dev, enum mlx5_devlink_resource_id id)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	u64 size;
+	int err;
+
+	err = devlink_resource_size_get(devlink, id, &size);
+	if (!err)
+		return size;
+	mlx5_core_err(dev, "Failed to get param. using default. err = %d, id = %u\n",
+		      err, id);
+	return default_vals[id];
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 792e0d6aa861..4dda6e2a4cbc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -19,6 +19,7 @@
 #include "lib/clock.h"
 #include "diag/fw_tracer.h"
 #include "mlx5_irq.h"
+#include "devlink.h"
 
 enum {
 	MLX5_EQE_OWNER_INIT_VAL	= 0x1,
@@ -807,7 +808,7 @@ static int create_comp_eqs(struct mlx5_core_dev *dev)
 
 	INIT_LIST_HEAD(&table->comp_eqs_list);
 	ncomp_eqs = table->num_comp_eqs;
-	nent = MLX5_COMP_EQ_SIZE;
+	nent = mlx5_devlink_res_size(dev, MLX5_DL_RES_COMP_EQ);
 	for (i = 0; i < ncomp_eqs; i++) {
 		struct mlx5_eq_param param = {};
 		int vecidx = i;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index f8446395163a..96fdbc0c87bf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -922,6 +922,8 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 	dev->hv_vhca = mlx5_hv_vhca_create(dev);
 	dev->rsc_dump = mlx5_rsc_dump_create(dev);
 
+	mlx5_devlink_res_register(dev);
+
 	return 0;
 
 err_sf_table_cleanup:
@@ -957,6 +959,7 @@ static int mlx5_init_once(struct mlx5_core_dev *dev)
 
 static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
 {
+	mlx5_devlink_res_unregister(dev);
 	mlx5_rsc_dump_destroy(dev);
 	mlx5_hv_vhca_destroy(dev->hv_vhca);
 	mlx5_fw_tracer_destroy(dev->tracer);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index f617dfbcd9fd..47c07f95bbe1 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -797,10 +797,6 @@ struct mlx5_db {
 	int			index;
 };
 
-enum {
-	MLX5_COMP_EQ_SIZE = 1024,
-};
-
 enum {
 	MLX5_PTYS_IB = 1 << 0,
 	MLX5_PTYS_EN = 1 << 2,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 11/14] net/mlx5: Let user configure event_eq_size param
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 10/14] net/mlx5: Let user configure io_eq_size param Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 12/14] net/mlx5: Let user configure max_macs param Saeed Mahameed
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Shay Drory, Moshe Shemesh, Parav Pandit, Saeed Mahameed

From: Shay Drory <shayd@nvidia.com>

Event EQ is an EQ which received the notification of almost all the
events generated by the NIC.
Currently, each event EQ is taking 512KB of memory. This size is not
needed in most use cases, and is critical with large scale. Hence,
allow user to configure the size of the event EQ.

For example to reduce event EQ size to 64, execute::
$ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 Documentation/networking/devlink/mlx5.rst     |  4 +++
 .../net/ethernet/mellanox/mlx5/core/devlink.h |  1 +
 .../ethernet/mellanox/mlx5/core/devlink_res.c | 26 ++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |  2 +-
 include/linux/mlx5/eq.h                       |  1 -
 5 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 4e6020570292..5b77863f9c88 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -57,6 +57,10 @@ Resources
    * - ``comp_eq_size``
      - Control the size of I/O completion EQs.
        * The default value is 1024, and the range is between 64 and 4096.
+   * - ``event_eq_size``
+     - Control the size of the asynchronous control events EQ.
+       * The default value is 4096, and the range is between 64 and 4096.
+
 
 Info versions
 =============
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
index 4192f23b1446..674415fd0b3a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h
@@ -8,6 +8,7 @@
 
 enum mlx5_devlink_resource_id {
 	MLX5_DL_RES_COMP_EQ = 1,
+	MLX5_DL_RES_ASYNC_EQ,
 
 	__MLX5_ID_RES_MAX,
 	MLX5_ID_RES_MAX = __MLX5_ID_RES_MAX - 1,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c
index 3beedfb8534a..549d23745942 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink_res.c
@@ -7,6 +7,7 @@
 enum {
 	MLX5_EQ_MIN_SIZE = 64,
 	MLX5_EQ_MAX_SIZE = 4096,
+	MLX5_NUM_ASYNC_EQE = 4096,
 	MLX5_COMP_EQ_SIZE = 1024,
 };
 
@@ -23,13 +24,35 @@ static int comp_eq_res_register(struct mlx5_core_dev *dev)
 					 &comp_eq_size);
 }
 
+static int async_eq_resource_register(struct mlx5_core_dev *dev)
+{
+	struct devlink_resource_size_params async_eq_size;
+	struct devlink *devlink = priv_to_devlink(dev);
+
+	devlink_resource_size_params_init(&async_eq_size, MLX5_EQ_MIN_SIZE,
+					  MLX5_EQ_MAX_SIZE, 1, DEVLINK_RESOURCE_UNIT_ENTRY);
+	return devlink_resource_register(devlink, "event_eq_size",
+					 MLX5_NUM_ASYNC_EQE, MLX5_DL_RES_ASYNC_EQ,
+					 DEVLINK_RESOURCE_ID_PARENT_TOP,
+					 &async_eq_size);
+}
+
 void mlx5_devlink_res_register(struct mlx5_core_dev *dev)
 {
 	int err;
 
 	err = comp_eq_res_register(dev);
 	if (err)
-		mlx5_core_err(dev, "Failed to register resources, err = %d\n", err);
+		goto err_msg;
+
+	err = async_eq_resource_register(dev);
+	if (err)
+		goto err;
+	return;
+err:
+	devlink_resources_unregister(priv_to_devlink(dev), NULL);
+err_msg:
+	mlx5_core_err(dev, "Failed to register resources, err = %d\n", err);
 }
 
 void mlx5_devlink_res_unregister(struct mlx5_core_dev *dev)
@@ -39,6 +62,7 @@ void mlx5_devlink_res_unregister(struct mlx5_core_dev *dev)
 
 static const size_t default_vals[MLX5_ID_RES_MAX + 1] = {
 	[MLX5_DL_RES_COMP_EQ] = MLX5_COMP_EQ_SIZE,
+	[MLX5_DL_RES_ASYNC_EQ] = MLX5_NUM_ASYNC_EQE,
 };
 
 size_t mlx5_devlink_res_size(struct mlx5_core_dev *dev, enum mlx5_devlink_resource_id id)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 4dda6e2a4cbc..31e69067284b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -647,7 +647,7 @@ static int create_async_eqs(struct mlx5_core_dev *dev)
 
 	param = (struct mlx5_eq_param) {
 		.irq_index = MLX5_IRQ_EQ_CTRL,
-		.nent = MLX5_NUM_ASYNC_EQE,
+		.nent = mlx5_devlink_res_size(dev, MLX5_DL_RES_ASYNC_EQ),
 	};
 
 	gather_async_events_mask(dev, param.mask);
diff --git a/include/linux/mlx5/eq.h b/include/linux/mlx5/eq.h
index ea3ff5a8ced3..11161e427608 100644
--- a/include/linux/mlx5/eq.h
+++ b/include/linux/mlx5/eq.h
@@ -5,7 +5,6 @@
 #define MLX5_CORE_EQ_H
 
 #define MLX5_NUM_CMD_EQE   (32)
-#define MLX5_NUM_ASYNC_EQE (0x1000)
 #define MLX5_NUM_SPARE_EQE (0x80)
 
 struct mlx5_eq;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 12/14] net/mlx5: Let user configure max_macs param
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 11/14] net/mlx5: Let user configure event_eq_size param Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 13/14] net/mlx5: SF, Add SF trace points Saeed Mahameed
  2021-10-25 20:54 ` [net-next 14/14] net/mlx5: SF_DEV Add SF device " Saeed Mahameed
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: netdev, Shay Drory, Moshe Shemesh, Parav Pandit, Saeed Mahameed

From: Shay Drory <shayd@nvidia.com>

Currently, max_macs is taking 70Kbytes of memory per function. This
size is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the number of max_macs.

For example, to reduce the number of max_macs to 1, execute::
$ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 Documentation/networking/devlink/mlx5.rst     |  4 ++
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 69 +++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/main.c    | 18 +++++
 include/linux/mlx5/mlx5_ifc.h                 |  2 +-
 4 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 5b77863f9c88..d467e770906e 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -14,8 +14,12 @@ Parameters
 
    * - Name
      - Mode
+     - Validation
    * - ``enable_roce``
      - driverinit
+   * - ``max_macs``
+     - driverinit
+     - The range is between 1 and 2^31. Only power of 2 values are supported.
 
 The ``mlx5`` driver also implements the following driver-specific
 parameters.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index 1c98652b244a..fc78c745ead1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -752,6 +752,68 @@ static void mlx5_devlink_auxdev_params_unregister(struct devlink *devlink)
 	mlx5_devlink_eth_param_unregister(devlink);
 }
 
+static int mlx5_devlink_max_uc_list_validate(struct devlink *devlink, u32 id,
+					     union devlink_param_value val,
+					     struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+
+	/* At least one unicast mac is needed */
+	if (val.vu32 == 0) {
+		NL_SET_ERR_MSG_MOD(extack, "max_macs value must be greater than 0");
+		return -EINVAL;
+	}
+	/* Check if its power of 2 or not */
+	if (!is_power_of_2(val.vu32)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Only power of 2 values are supported for max_macs");
+		return -EOPNOTSUPP;
+	}
+
+	if (ilog2(val.vu32) >
+	    MLX5_CAP_GEN_MAX(dev, log_max_current_uc_list)) {
+		NL_SET_ERR_MSG_MOD(extack, "max_macs value is out of the supported range");
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static const struct devlink_param max_uc_list_param =
+	DEVLINK_PARAM_GENERIC(MAX_MACS, BIT(DEVLINK_PARAM_CMODE_DRIVERINIT),
+			      NULL, NULL, mlx5_devlink_max_uc_list_validate);
+
+static int mlx5_devlink_max_uc_list_param_register(struct devlink *devlink)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	union devlink_param_value value;
+	int err;
+
+	if (!MLX5_CAP_GEN(dev, log_max_current_uc_list_wr_supported))
+		return 0;
+
+	err = devlink_param_register(devlink, &max_uc_list_param);
+	if (err)
+		return err;
+
+	value.vu32 = 1 << MLX5_CAP_GEN(dev, log_max_current_uc_list);
+	devlink_param_driverinit_value_set(devlink,
+					   DEVLINK_PARAM_GENERIC_ID_MAX_MACS,
+					   value);
+	return 0;
+}
+
+static void
+mlx5_devlink_max_uc_list_param_unregister(struct devlink *devlink)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+
+	if (!MLX5_CAP_GEN(dev, log_max_current_uc_list_wr_supported))
+		return;
+
+	devlink_param_unregister(devlink, &max_uc_list_param);
+}
+
 #define MLX5_TRAP_DROP(_id, _group_id)					\
 	DEVLINK_TRAP_GENERIC(DROP, DROP, _id,				\
 			     DEVLINK_TRAP_GROUP_GENERIC_ID_##_group_id, \
@@ -815,11 +877,17 @@ int mlx5_devlink_register(struct devlink *devlink)
 	if (err)
 		goto traps_reg_err;
 
+	err = mlx5_devlink_max_uc_list_param_register(devlink);
+	if (err)
+		goto uc_list_reg_err;
+
 	if (!mlx5_core_is_mp_slave(dev))
 		devlink_set_features(devlink, DEVLINK_F_RELOAD);
 
 	return 0;
 
+uc_list_reg_err:
+	mlx5_devlink_traps_unregister(devlink);
 traps_reg_err:
 	mlx5_devlink_auxdev_params_unregister(devlink);
 auxdev_reg_err:
@@ -830,6 +898,7 @@ int mlx5_devlink_register(struct devlink *devlink)
 
 void mlx5_devlink_unregister(struct devlink *devlink)
 {
+	mlx5_devlink_max_uc_list_param_unregister(devlink);
 	mlx5_devlink_traps_unregister(devlink);
 	mlx5_devlink_auxdev_params_unregister(devlink);
 	devlink_params_unregister(devlink, mlx5_devlink_params,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 96fdbc0c87bf..079ee9e8da10 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -484,10 +484,23 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx)
 	return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_ODP);
 }
 
+static int max_uc_list_get_devlink_param(struct mlx5_core_dev *dev)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	union devlink_param_value val;
+	int err;
+
+	err = devlink_param_driverinit_value_get(devlink,
+						 DEVLINK_PARAM_GENERIC_ID_MAX_MACS,
+						 &val);
+	return err ? 0 : val.vu32;
+}
+
 static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
 {
 	struct mlx5_profile *prof = &dev->profile;
 	void *set_hca_cap;
+	u32 max_uc_list;
 	int err;
 
 	err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL);
@@ -561,6 +574,11 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
 	if (MLX5_CAP_GEN(dev, roce_rw_supported))
 		MLX5_SET(cmd_hca_cap, set_hca_cap, roce, mlx5_is_roce_init_enabled(dev));
 
+	max_uc_list = max_uc_list_get_devlink_param(dev);
+	if (max_uc_list)
+		MLX5_SET(cmd_hca_cap, set_hca_cap, log_max_current_uc_list,
+			 ilog2(max_uc_list));
+
 	return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE);
 }
 
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 746381eccccf..97465d00de9d 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1603,7 +1603,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 
 	u8         ext_stride_num_range[0x1];
 	u8         roce_rw_supported[0x1];
-	u8         reserved_at_3a2[0x1];
+	u8         log_max_current_uc_list_wr_supported[0x1];
 	u8         log_max_stride_sz_rq[0x5];
 	u8         reserved_at_3a8[0x3];
 	u8         log_min_stride_sz_rq[0x5];
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 13/14] net/mlx5: SF, Add SF trace points
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 12/14] net/mlx5: Let user configure max_macs param Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  2021-10-25 20:54 ` [net-next 14/14] net/mlx5: SF_DEV Add SF device " Saeed Mahameed
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Parav Pandit, Saeed Mahameed

From: Parav Pandit <parav@nvidia.com>

Add support for trace events for SFs to improve debugging.
This covers
(a) port add and free trace points
(b) device level trace points
(c) SF hardware context add, free trace points.
(d) SF function activate/deacticate and state trace points

SF events examples:
echo mlx5:mlx5_sf_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_alloc >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_deferred_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_update_state >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_activate >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_deactivate >> /sys/kernel/debug/tracing/set_event

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst |  37 ++++
 .../ethernet/mellanox/mlx5/core/sf/devlink.c  |   8 +
 .../mlx5/core/sf/diag/sf_tracepoint.h         | 173 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/sf/hw_table.c |   4 +
 4 files changed, 222 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/diag/sf_tracepoint.h

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index 2ee74a49be9d..d6c10408adc4 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -702,3 +702,40 @@ Eswitch QoS tracepoints:
     $ cat /sys/kernel/debug/tracing/trace
     ...
     <...>-27418   [006] .... 76547.187258: mlx5_esw_group_qos_destroy: (0000:82:00.0) group=000000007b576bb3 tsar_ix=1
+
+SF tracepoints:
+
+- mlx5_sf_add: trace addition of the SF port::
+
+    $ echo mlx5:mlx5_sf_add >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    devlink-9363    [031] ..... 24610.188722: mlx5_sf_add: (0000:06:00.0) port_index=32768 controller=0 hw_id=0x8000 sfnum=88
+
+- mlx5_sf_free: trace freeing of the SF port::
+
+    $ echo mlx5:mlx5_sf_free >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    devlink-9830    [038] ..... 26300.404749: mlx5_sf_free: (0000:06:00.0) port_index=32768 controller=0 hw_id=0x8000
+
+- mlx5_sf_hwc_alloc: trace allocating of the hardware SF context::
+
+    $ echo mlx5:mlx5_sf_hwc_alloc >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    devlink-9775    [031] ..... 26296.385259: mlx5_sf_hwc_alloc: (0000:06:00.0) controller=0 hw_id=0x8000 sfnum=88
+
+- mlx5_sf_hwc_free: trace freeing of the hardware SF context::
+
+    $ echo mlx5:mlx5_sf_hwc_free >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    kworker/u128:3-9093    [046] ..... 24625.365771: mlx5_sf_hwc_free: (0000:06:00.0) hw_id=0x8000
+
+- mlx5_sf_hwc_deferred_free : trace deferred freeing of the hardware SF context::
+
+    $ echo mlx5:mlx5_sf_hwc_deferred_free >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    devlink-9519    [046] ..... 24624.400271: mlx5_sf_hwc_deferred_free: (0000:06:00.0) hw_id=0x8000
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
index e1bb3acf45e6..3be659cd91f1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
@@ -8,6 +8,8 @@
 #include "mlx5_ifc_vhca_event.h"
 #include "vhca_event.h"
 #include "ecpf.h"
+#define CREATE_TRACE_POINTS
+#include "diag/sf_tracepoint.h"
 
 struct mlx5_sf {
 	struct devlink_port dl_port;
@@ -112,6 +114,7 @@ static void mlx5_sf_free(struct mlx5_sf_table *table, struct mlx5_sf *sf)
 {
 	mlx5_sf_id_erase(table, sf);
 	mlx5_sf_hw_table_sf_free(table->dev, sf->controller, sf->id);
+	trace_mlx5_sf_free(table->dev, sf->port_index, sf->controller, sf->hw_fn_id);
 	kfree(sf);
 }
 
@@ -209,6 +212,7 @@ static int mlx5_sf_activate(struct mlx5_core_dev *dev, struct mlx5_sf *sf,
 		return err;
 
 	sf->hw_state = MLX5_VHCA_STATE_ACTIVE;
+	trace_mlx5_sf_activate(dev, sf->port_index, sf->controller, sf->hw_fn_id);
 	return 0;
 }
 
@@ -224,6 +228,7 @@ static int mlx5_sf_deactivate(struct mlx5_core_dev *dev, struct mlx5_sf *sf)
 		return err;
 
 	sf->hw_state = MLX5_VHCA_STATE_TEARDOWN_REQUEST;
+	trace_mlx5_sf_deactivate(dev, sf->port_index, sf->controller, sf->hw_fn_id);
 	return 0;
 }
 
@@ -293,6 +298,7 @@ static int mlx5_sf_add(struct mlx5_core_dev *dev, struct mlx5_sf_table *table,
 	if (err)
 		goto esw_err;
 	*new_port_index = sf->port_index;
+	trace_mlx5_sf_add(dev, sf->port_index, sf->controller, sf->hw_fn_id, new_attr->sfnum);
 	return 0;
 
 esw_err:
@@ -442,6 +448,8 @@ static int mlx5_sf_vhca_event(struct notifier_block *nb, unsigned long opcode, v
 	update = mlx5_sf_state_update_check(sf, event->new_vhca_state);
 	if (update)
 		sf->hw_state = event->new_vhca_state;
+	trace_mlx5_sf_update_state(table->dev, sf->port_index, sf->controller,
+				   sf->hw_fn_id, sf->hw_state);
 sf_err:
 	mutex_unlock(&table->sf_state_lock);
 	mlx5_sf_table_put(table);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/diag/sf_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/diag/sf_tracepoint.h
new file mode 100644
index 000000000000..8bf1cd90930d
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/diag/sf_tracepoint.h
@@ -0,0 +1,173 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mlx5
+
+#if !defined(_MLX5_SF_TP_) || defined(TRACE_HEADER_MULTI_READ)
+#define _MLX5_SF_TP_
+
+#include <linux/tracepoint.h>
+#include <linux/mlx5/driver.h>
+#include "sf/vhca_event.h"
+
+TRACE_EVENT(mlx5_sf_add,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     unsigned int port_index,
+		     u32 controller,
+		     u16 hw_fn_id,
+		     u32 sfnum),
+	    TP_ARGS(dev, port_index, controller, hw_fn_id, sfnum),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(unsigned int, port_index)
+			     __field(u32, controller)
+			     __field(u16, hw_fn_id)
+			     __field(u32, sfnum)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->port_index = port_index;
+		    __entry->controller = controller;
+		    __entry->hw_fn_id = hw_fn_id;
+		    __entry->sfnum = sfnum;
+	    ),
+	    TP_printk("(%s) port_index=%u controller=%u hw_id=0x%x sfnum=%u\n",
+		      __get_str(devname), __entry->port_index, __entry->controller,
+		      __entry->hw_fn_id, __entry->sfnum)
+);
+
+TRACE_EVENT(mlx5_sf_free,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     unsigned int port_index,
+		     u32 controller,
+		     u16 hw_fn_id),
+	    TP_ARGS(dev, port_index, controller, hw_fn_id),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(unsigned int, port_index)
+			     __field(u32, controller)
+			     __field(u16, hw_fn_id)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->port_index = port_index;
+		    __entry->controller = controller;
+		    __entry->hw_fn_id = hw_fn_id;
+	    ),
+	    TP_printk("(%s) port_index=%u controller=%u hw_id=0x%x\n",
+		      __get_str(devname), __entry->port_index, __entry->controller,
+		      __entry->hw_fn_id)
+);
+
+TRACE_EVENT(mlx5_sf_hwc_alloc,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     u32 controller,
+		     u16 hw_fn_id,
+		     u32 sfnum),
+	    TP_ARGS(dev, controller, hw_fn_id, sfnum),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(u32, controller)
+			     __field(u16, hw_fn_id)
+			     __field(u32, sfnum)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->controller = controller;
+		    __entry->hw_fn_id = hw_fn_id;
+		    __entry->sfnum = sfnum;
+	    ),
+	    TP_printk("(%s) controller=%u hw_id=0x%x sfnum=%u\n",
+		      __get_str(devname), __entry->controller, __entry->hw_fn_id,
+		      __entry->sfnum)
+);
+
+TRACE_EVENT(mlx5_sf_hwc_free,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     u16 hw_fn_id),
+	    TP_ARGS(dev, hw_fn_id),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(u16, hw_fn_id)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->hw_fn_id = hw_fn_id;
+	    ),
+	    TP_printk("(%s) hw_id=0x%x\n", __get_str(devname), __entry->hw_fn_id)
+);
+
+TRACE_EVENT(mlx5_sf_hwc_deferred_free,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     u16 hw_fn_id),
+	    TP_ARGS(dev, hw_fn_id),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(u16, hw_fn_id)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->hw_fn_id = hw_fn_id;
+	    ),
+	    TP_printk("(%s) hw_id=0x%x\n", __get_str(devname), __entry->hw_fn_id)
+);
+
+DECLARE_EVENT_CLASS(mlx5_sf_state_template,
+		    TP_PROTO(const struct mlx5_core_dev *dev,
+			     u32 port_index,
+			     u32 controller,
+			     u16 hw_fn_id),
+		    TP_ARGS(dev, port_index, controller, hw_fn_id),
+		    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+				     __field(unsigned int, port_index)
+				     __field(u32, controller)
+				     __field(u16, hw_fn_id)),
+		    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+				   __entry->port_index = port_index;
+				   __entry->controller = controller;
+				   __entry->hw_fn_id = hw_fn_id;
+		    ),
+		    TP_printk("(%s) port_index=%u controller=%u hw_id=0x%x\n",
+			      __get_str(devname), __entry->port_index, __entry->controller,
+			      __entry->hw_fn_id)
+);
+
+DEFINE_EVENT(mlx5_sf_state_template, mlx5_sf_activate,
+	     TP_PROTO(const struct mlx5_core_dev *dev,
+		      u32 port_index,
+		      u32 controller,
+		      u16 hw_fn_id),
+	     TP_ARGS(dev, port_index, controller, hw_fn_id)
+	     );
+
+DEFINE_EVENT(mlx5_sf_state_template, mlx5_sf_deactivate,
+	     TP_PROTO(const struct mlx5_core_dev *dev,
+		      u32 port_index,
+		      u32 controller,
+		      u16 hw_fn_id),
+	     TP_ARGS(dev, port_index, controller, hw_fn_id)
+	     );
+
+TRACE_EVENT(mlx5_sf_update_state,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     unsigned int port_index,
+		     u32 controller,
+		     u16 hw_fn_id,
+		     u8 state),
+	    TP_ARGS(dev, port_index, controller, hw_fn_id, state),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(unsigned int, port_index)
+			     __field(u32, controller)
+			     __field(u16, hw_fn_id)
+			     __field(u8, state)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->port_index = port_index;
+		    __entry->controller = controller;
+		    __entry->hw_fn_id = hw_fn_id;
+		    __entry->state = state;
+	    ),
+	    TP_printk("(%s) port_index=%u controller=%u hw_id=0x%x state=%u\n",
+		      __get_str(devname), __entry->port_index, __entry->controller,
+		      __entry->hw_fn_id, __entry->state)
+);
+
+#endif /* _MLX5_SF_TP_ */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH sf/diag
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE sf_tracepoint
+#include <trace/define_trace.h>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c
index d9c69123c1ab..252d6017387d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c
@@ -8,6 +8,7 @@
 #include "ecpf.h"
 #include "mlx5_core.h"
 #include "eswitch.h"
+#include "diag/sf_tracepoint.h"
 
 struct mlx5_sf_hw {
 	u32 usr_sfnum;
@@ -142,6 +143,7 @@ int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 controller, u32 usr
 			goto vhca_err;
 	}
 
+	trace_mlx5_sf_hwc_alloc(dev, controller, hw_fn_id, usr_sfnum);
 	mutex_unlock(&table->table_lock);
 	return sw_id;
 
@@ -172,6 +174,7 @@ static void mlx5_sf_hw_table_hwc_sf_free(struct mlx5_core_dev *dev,
 	mlx5_cmd_dealloc_sf(dev, hwc->start_fn_id + idx);
 	hwc->sfs[idx].allocated = false;
 	hwc->sfs[idx].pending_delete = false;
+	trace_mlx5_sf_hwc_free(dev, hwc->start_fn_id + idx);
 }
 
 void mlx5_sf_hw_table_sf_deferred_free(struct mlx5_core_dev *dev, u32 controller, u16 id)
@@ -195,6 +198,7 @@ void mlx5_sf_hw_table_sf_deferred_free(struct mlx5_core_dev *dev, u32 controller
 		hwc->sfs[id].allocated = false;
 	} else {
 		hwc->sfs[id].pending_delete = true;
+		trace_mlx5_sf_hwc_deferred_free(dev, hw_fn_id);
 	}
 err:
 	mutex_unlock(&table->table_lock);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [net-next 14/14] net/mlx5: SF_DEV Add SF device trace points
  2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
                   ` (12 preceding siblings ...)
  2021-10-25 20:54 ` [net-next 13/14] net/mlx5: SF, Add SF trace points Saeed Mahameed
@ 2021-10-25 20:54 ` Saeed Mahameed
  13 siblings, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-25 20:54 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski; +Cc: netdev, Parav Pandit, Saeed Mahameed

From: Parav Pandit <parav@nvidia.com>

Add SF device add and delete specific trace points.

echo mlx5:mlx5_sf_dev_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_dev_del >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_vhca_event >> /sys/kernel/debug/tracing/set_event

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst | 21 +++++++
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 23 ++++++--
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.h  |  1 +
 .../mlx5/core/sf/dev/diag/dev_tracepoint.h    | 58 +++++++++++++++++++
 .../mlx5/core/sf/diag/vhca_tracepoint.h       | 40 +++++++++++++
 .../mellanox/mlx5/core/sf/vhca_event.c        |  3 +
 6 files changed, 140 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/diag/dev_tracepoint.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/diag/vhca_tracepoint.h

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index d6c10408adc4..5edf50d7dbd5 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -739,3 +739,24 @@ SF tracepoints:
     $ cat /sys/kernel/debug/tracing/trace
     ...
     devlink-9519    [046] ..... 24624.400271: mlx5_sf_hwc_deferred_free: (0000:06:00.0) hw_id=0x8000
+
+- mlx5_sf_vhca_event: trace SF vhca event and state::
+
+    $ echo mlx5:mlx5_sf_vhca_event >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    kworker/u128:3-9093    [046] ..... 24625.365525: mlx5_sf_vhca_event: (0000:06:00.0) hw_id=0x8000 sfnum=88 vhca_state=1
+
+- mlx5_sf_dev_add : trace SF device add event::
+
+    $ echo mlx5:mlx5_sf_dev_add>> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    kworker/u128:3-9093    [000] ..... 24616.524495: mlx5_sf_dev_add: (0000:06:00.0) sfdev=00000000fc5d96fd aux_id=4 hw_id=0x8000 sfnum=88
+
+- mlx5_sf_dev_del : trace SF device delete event::
+
+    $ echo mlx5:mlx5_sf_dev_del >> /sys/kernel/debug/tracing/set_event
+    $ cat /sys/kernel/debug/tracing/trace
+    ...
+    kworker/u128:3-9093    [044] ..... 24624.400749: mlx5_sf_dev_del: (0000:06:00.0) sfdev=00000000fc5d96fd aux_id=4 hw_id=0x8000 sfnum=88
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
index 871c2fbe18d3..f37db7cc32a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
@@ -9,6 +9,8 @@
 #include "sf/sf.h"
 #include "sf/mlx5_ifc_vhca_event.h"
 #include "ecpf.h"
+#define CREATE_TRACE_POINTS
+#include "diag/dev_tracepoint.h"
 
 struct mlx5_sf_dev_table {
 	struct xarray devices;
@@ -66,13 +68,18 @@ static void mlx5_sf_dev_release(struct device *device)
 	kfree(sf_dev);
 }
 
-static void mlx5_sf_dev_remove(struct mlx5_sf_dev *sf_dev)
+static void mlx5_sf_dev_remove(struct mlx5_core_dev *dev, struct mlx5_sf_dev *sf_dev)
 {
+	int id;
+
+	id = sf_dev->adev.id;
+	trace_mlx5_sf_dev_del(dev, sf_dev, id);
+
 	auxiliary_device_delete(&sf_dev->adev);
 	auxiliary_device_uninit(&sf_dev->adev);
 }
 
-static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u32 sfnum)
+static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u16 fn_id, u32 sfnum)
 {
 	struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table;
 	struct mlx5_sf_dev *sf_dev;
@@ -100,6 +107,7 @@ static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u32 sfnum)
 	sf_dev->adev.dev.groups = sf_attr_groups;
 	sf_dev->sfnum = sfnum;
 	sf_dev->parent_mdev = dev;
+	sf_dev->fn_id = fn_id;
 
 	if (!table->max_sfs) {
 		mlx5_adev_idx_free(id);
@@ -109,6 +117,8 @@ static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u32 sfnum)
 	}
 	sf_dev->bar_base_addr = table->base_address + (sf_index * table->sf_bar_length);
 
+	trace_mlx5_sf_dev_add(dev, sf_dev, id);
+
 	err = auxiliary_device_init(&sf_dev->adev);
 	if (err) {
 		mlx5_adev_idx_free(id);
@@ -128,7 +138,7 @@ static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u32 sfnum)
 	return;
 
 xa_err:
-	mlx5_sf_dev_remove(sf_dev);
+	mlx5_sf_dev_remove(dev, sf_dev);
 add_err:
 	mlx5_core_err(dev, "SF DEV: fail device add for index=%d sfnum=%d err=%d\n",
 		      sf_index, sfnum, err);
@@ -139,7 +149,7 @@ static void mlx5_sf_dev_del(struct mlx5_core_dev *dev, struct mlx5_sf_dev *sf_de
 	struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table;
 
 	xa_erase(&table->devices, sf_index);
-	mlx5_sf_dev_remove(sf_dev);
+	mlx5_sf_dev_remove(dev, sf_dev);
 }
 
 static int
@@ -178,7 +188,8 @@ mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_
 		break;
 	case MLX5_VHCA_STATE_ACTIVE:
 		if (!sf_dev)
-			mlx5_sf_dev_add(table->dev, sf_index, event->sw_function_id);
+			mlx5_sf_dev_add(table->dev, sf_index, event->function_id,
+					event->sw_function_id);
 		break;
 	default:
 		break;
@@ -260,7 +271,7 @@ static void mlx5_sf_dev_destroy_all(struct mlx5_sf_dev_table *table)
 
 	xa_for_each(&table->devices, index, sf_dev) {
 		xa_erase(&table->devices, index);
-		mlx5_sf_dev_remove(sf_dev);
+		mlx5_sf_dev_remove(table->dev, sf_dev);
 	}
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h
index 149fd9e698cf..2a66a427ef15 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h
@@ -16,6 +16,7 @@ struct mlx5_sf_dev {
 	struct mlx5_core_dev *mdev;
 	phys_addr_t bar_base_addr;
 	u32 sfnum;
+	u16 fn_id;
 };
 
 void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/diag/dev_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/diag/dev_tracepoint.h
new file mode 100644
index 000000000000..7f7c9af5deed
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/diag/dev_tracepoint.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mlx5
+
+#if !defined(_MLX5_SF_DEV_TP_) || defined(TRACE_HEADER_MULTI_READ)
+#define _MLX5_SF_DEV_TP_
+
+#include <linux/tracepoint.h>
+#include <linux/mlx5/driver.h>
+#include "../../dev/dev.h"
+
+DECLARE_EVENT_CLASS(mlx5_sf_dev_template,
+		    TP_PROTO(const struct mlx5_core_dev *dev,
+			     const struct mlx5_sf_dev *sfdev,
+			     int aux_id),
+		    TP_ARGS(dev, sfdev, aux_id),
+		    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+				     __field(const struct mlx5_sf_dev*, sfdev)
+				     __field(int, aux_id)
+				     __field(u16, hw_fn_id)
+				     __field(u32, sfnum)
+		    ),
+		    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+				   __entry->sfdev = sfdev;
+				   __entry->aux_id = aux_id;
+				   __entry->hw_fn_id = sfdev->fn_id;
+				   __entry->sfnum = sfdev->sfnum;
+		    ),
+		    TP_printk("(%s) sfdev=%pK aux_id=%d hw_id=0x%x sfnum=%u\n",
+			      __get_str(devname), __entry->sfdev,
+			      __entry->aux_id, __entry->hw_fn_id,
+			      __entry->sfnum)
+);
+
+DEFINE_EVENT(mlx5_sf_dev_template, mlx5_sf_dev_add,
+	     TP_PROTO(const struct mlx5_core_dev *dev,
+		      const struct mlx5_sf_dev *sfdev,
+		      int aux_id),
+	     TP_ARGS(dev, sfdev, aux_id)
+	     );
+
+DEFINE_EVENT(mlx5_sf_dev_template, mlx5_sf_dev_del,
+	     TP_PROTO(const struct mlx5_core_dev *dev,
+		      const struct mlx5_sf_dev *sfdev,
+		      int aux_id),
+	     TP_ARGS(dev, sfdev, aux_id)
+	     );
+
+#endif /* _MLX5_SF_DEV_TP_ */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH sf/dev/diag
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE dev_tracepoint
+#include <trace/define_trace.h>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/diag/vhca_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/diag/vhca_tracepoint.h
new file mode 100644
index 000000000000..fd814a190b8b
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/diag/vhca_tracepoint.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mlx5
+
+#if !defined(_MLX5_SF_VHCA_TP_) || defined(TRACE_HEADER_MULTI_READ)
+#define _MLX5_SF_VHCA_TP_
+
+#include <linux/tracepoint.h>
+#include <linux/mlx5/driver.h>
+#include "sf/vhca_event.h"
+
+TRACE_EVENT(mlx5_sf_vhca_event,
+	    TP_PROTO(const struct mlx5_core_dev *dev,
+		     const struct mlx5_vhca_state_event *event),
+	    TP_ARGS(dev, event),
+	    TP_STRUCT__entry(__string(devname, dev_name(dev->device))
+			     __field(u16, hw_fn_id)
+			     __field(u32, sfnum)
+			     __field(u8, vhca_state)
+			    ),
+	    TP_fast_assign(__assign_str(devname, dev_name(dev->device));
+		    __entry->hw_fn_id = event->function_id;
+		    __entry->sfnum = event->sw_function_id;
+		    __entry->vhca_state = event->new_vhca_state;
+	    ),
+	    TP_printk("(%s) hw_id=0x%x sfnum=%u vhca_state=%d\n",
+		      __get_str(devname), __entry->hw_fn_id,
+		      __entry->sfnum, __entry->vhca_state)
+);
+
+#endif /* _MLX5_SF_VHCA_TP_ */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH sf/diag
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE vhca_tracepoint
+#include <trace/define_trace.h>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
index 28b14b05086f..d908fba968f0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
@@ -6,6 +6,8 @@
 #include "mlx5_core.h"
 #include "vhca_event.h"
 #include "ecpf.h"
+#define CREATE_TRACE_POINTS
+#include "diag/vhca_tracepoint.h"
 
 struct mlx5_vhca_state_notifier {
 	struct mlx5_core_dev *dev;
@@ -82,6 +84,7 @@ mlx5_vhca_event_notify(struct mlx5_core_dev *dev, struct mlx5_vhca_state_event *
 					 vhca_state_context.vhca_state);
 
 	mlx5_vhca_event_arm(dev, event->function_id);
+	trace_mlx5_sf_vhca_event(dev, event);
 
 	blocking_notifier_call_chain(&dev->priv.vhca_state_notifier->n_head, 0, event);
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr
  2021-10-25 20:54 ` [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr Saeed Mahameed
@ 2021-10-26 12:30   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 21+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-10-26 12:30 UTC (permalink / raw)
  To: Saeed Mahameed; +Cc: davem, kuba, netdev, saeedm

Hello:

This series was applied to netdev/net-next.git (master)
by Saeed Mahameed <saeedm@nvidia.com>:

On Mon, 25 Oct 2021 13:54:18 -0700 you wrote:
> From: Jakub Kicinski <kuba@kernel.org>
> 
> Use a local buffer and eth_hw_addr_set()
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> 
> [...]

Here is the summary with links:
  - [net-next,01/14] net/mlx5e: don't write directly to netdev->dev_addr
    https://git.kernel.org/netdev/net-next/c/537e4d2e6fe3
  - [net-next,02/14] net/mlx5: Remove unnecessary checks for slow path flag
    https://git.kernel.org/netdev/net-next/c/a64c5edbd20e
  - [net-next,03/14] net/mlx5: Fix unused function warning of mlx5i_flow_type_mask
    https://git.kernel.org/netdev/net-next/c/038e5e471874
  - [net-next,04/14] net/mlx5: Reduce flow counters bulk query buffer size for SFs
    https://git.kernel.org/netdev/net-next/c/2fdeb4f4c2ae
  - [net-next,05/14] net/mlx5: Extend health buffer dump
    https://git.kernel.org/netdev/net-next/c/cb464ba53c0c
  - [net-next,06/14] net/mlx5: Print health buffer by log level
    https://git.kernel.org/netdev/net-next/c/b87ef75cb5c9
  - [net-next,07/14] net/mlx5: Add periodic update of host time to firmware
    https://git.kernel.org/netdev/net-next/c/5a1023deeed0
  - [net-next,08/14] net/mlx5: Bridge, extract code to lookup and del/notify entry
    https://git.kernel.org/netdev/net-next/c/2deda2f1bf4e
  - [net-next,09/14] net/mlx5: Bridge, support replacing existing FDB entry
    https://git.kernel.org/netdev/net-next/c/3518c83fc96b
  - [net-next,10/14] net/mlx5: Let user configure io_eq_size param
    https://git.kernel.org/netdev/net-next/c/46ae40b94d88
  - [net-next,11/14] net/mlx5: Let user configure event_eq_size param
    https://git.kernel.org/netdev/net-next/c/a6cb08daa3b4
  - [net-next,12/14] net/mlx5: Let user configure max_macs param
    https://git.kernel.org/netdev/net-next/c/554604061979
  - [net-next,13/14] net/mlx5: SF, Add SF trace points
    https://git.kernel.org/netdev/net-next/c/b3ccada68b2d
  - [net-next,14/14] net/mlx5: SF_DEV Add SF device trace points
    https://git.kernel.org/netdev/net-next/c/d67ab0a8c130

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [net-next 10/14] net/mlx5: Let user configure io_eq_size param
  2021-10-25 20:54 ` [net-next 10/14] net/mlx5: Let user configure io_eq_size param Saeed Mahameed
@ 2021-10-26 15:05   ` Jakub Kicinski
  2021-10-26 15:54     ` Saeed Mahameed
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2021-10-26 15:05 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David S. Miller, netdev, Shay Drory, Moshe Shemesh, Parav Pandit,
	Saeed Mahameed

On Mon, 25 Oct 2021 13:54:27 -0700 Saeed Mahameed wrote:
> From: Shay Drory <shayd@nvidia.com>
> 
> Currently, each I/O EQ is taking 128KB of memory. This size
> is not needed in all use cases, and is critical with large scale.
> Hence, allow user to configure the size of I/O EQs.
> 
> For example, to reduce I/O EQ size to 64, execute:
> $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
> $ devlink dev reload pci/0000:00:0b.0

This sort of config is needed by more drivers,
we need a standard way of configuring this.

Sorry, I didn't have the time to look thru your patches
yesterday, I'm sending a revert for all your new devlink
params.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [net-next 10/14] net/mlx5: Let user configure io_eq_size param
  2021-10-26 15:05   ` Jakub Kicinski
@ 2021-10-26 15:54     ` Saeed Mahameed
  2021-10-26 17:16       ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-26 15:54 UTC (permalink / raw)
  To: Jiri Pirko, kuba; +Cc: Moshe Shemesh, Shay Drory, Parav Pandit, davem, netdev

On Tue, 2021-10-26 at 08:05 -0700, Jakub Kicinski wrote:
> On Mon, 25 Oct 2021 13:54:27 -0700 Saeed Mahameed wrote:
> > From: Shay Drory <shayd@nvidia.com>
> > 
> > Currently, each I/O EQ is taking 128KB of memory. This size
> > is not needed in all use cases, and is critical with large scale.
> > Hence, allow user to configure the size of I/O EQs.
> > 
> > For example, to reduce I/O EQ size to 64, execute:
> > $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
> > $ devlink dev reload pci/0000:00:0b.0
> 
> This sort of config is needed by more drivers,
> we need a standard way of configuring this.
> 

We had a debate internally about the same thing, Jiri and I thought
that EQ might be a ConnectX only thing (maybe some other vendors have
it) but it is not really popular, we thought, until other vendors start
contributing or asking for the same thing, maybe then we can
standardize.

> Sorry, I didn't have the time to look thru your patches
> yesterday, I'm sending a revert for all your new devlink
> params.

Sure, we will submit a RFC to give other vendors a chance to comment,
it will be basically the same patch (devlink resource) while making the
parameters vendor generic.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [net-next 10/14] net/mlx5: Let user configure io_eq_size param
  2021-10-26 15:54     ` Saeed Mahameed
@ 2021-10-26 17:16       ` Jakub Kicinski
  2021-10-26 18:01         ` Saeed Mahameed
  2021-10-27  6:16         ` Gal Pressman
  0 siblings, 2 replies; 21+ messages in thread
From: Jakub Kicinski @ 2021-10-26 17:16 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Jiri Pirko, Moshe Shemesh, Shay Drory, Parav Pandit, davem, netdev

On Tue, 26 Oct 2021 15:54:28 +0000 Saeed Mahameed wrote:
> On Tue, 2021-10-26 at 08:05 -0700, Jakub Kicinski wrote:
> > On Mon, 25 Oct 2021 13:54:27 -0700 Saeed Mahameed wrote:  
> > > From: Shay Drory <shayd@nvidia.com>
> > > 
> > > Currently, each I/O EQ is taking 128KB of memory. This size
> > > is not needed in all use cases, and is critical with large scale.
> > > Hence, allow user to configure the size of I/O EQs.
> > > 
> > > For example, to reduce I/O EQ size to 64, execute:
> > > $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
> > > $ devlink dev reload pci/0000:00:0b.0  
> > 
> > This sort of config is needed by more drivers,
> > we need a standard way of configuring this.
>
> We had a debate internally about the same thing, Jiri and I thought
> that EQ might be a ConnectX only thing (maybe some other vendors have
> it) but it is not really popular

I thought it's a RDMA thing. At least according to grep there's 
a handful of non-MLX drivers which have eqs. Are these not actual
event queues? (huawei/hinic, ibm/ehea, microsoft/mana, qlogic/qed)

> we thought, until other vendors start contributing or asking for 
> the same thing, maybe then we can standardize.

Yeah, like the standardization part ever happens :/ 

Look at the EQE/CQE interrupt generation thing. New vendor comes in and
copies best known practice (which is some driver-specific garbage,
ethtool priv-flags in that case). The maintainer (me) has to be the
policeman remember all those knobs with prior art and push back. Most
of the time the vendor decides to just keep the knob out of tree at
this point, kudos to Hauwei for not giving up. New vendor implements
the API, none of the existing vendors provide reviews or feedback.
Then none of the existing vendors implements the now-standard API.
Someone working for a large customer (me, again) has to go and ask 
for the API to be implemented. Which takes months even tho the patches
should be trivial.

If the initial patches adding the cqe/eqe interrupt modes to priv-flags
were nacked and the standard API created we'd all have saved much time.

> > Sorry, I didn't have the time to look thru your patches
> > yesterday, I'm sending a revert for all your new devlink
> > params.  
> 
> Sure, we will submit a RFC to give other vendors a chance to comment,
> it will be basically the same patch (devlink resource) while making the
> parameters vendor generic.

IDK if resource is a right fit (as mentioned to Parav in the discussion
on the revert).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [net-next 10/14] net/mlx5: Let user configure io_eq_size param
  2021-10-26 17:16       ` Jakub Kicinski
@ 2021-10-26 18:01         ` Saeed Mahameed
  2021-10-27  6:16         ` Gal Pressman
  1 sibling, 0 replies; 21+ messages in thread
From: Saeed Mahameed @ 2021-10-26 18:01 UTC (permalink / raw)
  To: kuba; +Cc: Moshe Shemesh, Shay Drory, Parav Pandit, Jiri Pirko, davem, netdev

On Tue, 2021-10-26 at 10:16 -0700, Jakub Kicinski wrote:
> On Tue, 26 Oct 2021 15:54:28 +0000 Saeed Mahameed wrote:
> > On Tue, 2021-10-26 at 08:05 -0700, Jakub Kicinski wrote:
> > > On Mon, 25 Oct 2021 13:54:27 -0700 Saeed Mahameed wrote:  
> > > > From: Shay Drory <shayd@nvidia.com>
> > > > 
> > > > Currently, each I/O EQ is taking 128KB of memory. This size
> > > > is not needed in all use cases, and is critical with large
> > > > scale.
> > > > Hence, allow user to configure the size of I/O EQs.
> > > > 
> > > > For example, to reduce I/O EQ size to 64, execute:
> > > > $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size
> > > > 64
> > > > $ devlink dev reload pci/0000:00:0b.0  
> > > 
> > > This sort of config is needed by more drivers,
> > > we need a standard way of configuring this.
> > 
> > We had a debate internally about the same thing, Jiri and I thought
> > that EQ might be a ConnectX only thing (maybe some other vendors
> > have
> > it) but it is not really popular
> 
> I thought it's a RDMA thing. At least according to grep there's 
> a handful of non-MLX drivers which have eqs. Are these not actual
> event queues? (huawei/hinic, ibm/ehea, microsoft/mana, qlogic/qed)
> 
> > we thought, until other vendors start contributing or asking for 
> > the same thing, maybe then we can standardize.
> 
> Yeah, like the standardization part ever happens :/ 
> 
> Look at the EQE/CQE interrupt generation thing. New vendor comes in
> and
> copies best known practice (which is some driver-specific garbage,
> ethtool priv-flags in that case). The maintainer (me) has to be the
> policeman remember all those knobs with prior art and push back. Most

Well, i can't even count the patches shot down internally because of
non standard APIs. The driver maintainer (me) has to also be a
policeman. As long as we are in sync I think this can scale, I am sure
other vendors maintainers are filtering as many patches as I do.

> of the time the vendor decides to just keep the knob out of tree at
> this point, kudos to Hauwei for not giving up. New vendor implements
> the API, none of the existing vendors provide reviews or feedback.
> Then none of the existing vendors implements the now-standard API.
> Someone working for a large customer (me, again) has to go and ask 
> for the API to be implemented. Which takes months even tho the
> patches
> should be trivial.
> 
> If the initial patches adding the cqe/eqe interrupt modes to priv-
> flags
> were nacked and the standard API created we'd all have saved much
> time.
> 

Sometimes it is hard to decide what is the best for the user that fits
all vendors, when you are the first to come up with a new concept,
EQE/CQE thing is a relatively new mechanism, took other vendors a while
to catch up, who would've known such mechanism would become popular ?

but I do agree with you  many apis can be standardize or at least
refined with better policing, sorry, but the only way to do this is to
have as many vendors as possible looking at each possible API patch.

Many times we are pioneering in latest features, especially for
smartnics scalability, that involves resource fine tuning and fine-
grained user controls. luckily for the EQ thing we can generalize.

> > > Sorry, I didn't have the time to look thru your patches
> > > yesterday, I'm sending a revert for all your new devlink
> > > params.  
> > 
> > Sure, we will submit a RFC to give other vendors a chance to
> > comment,
> > it will be basically the same patch (devlink resource) while making
> > the
> > parameters vendor generic.
> 
> IDK if resource is a right fit (as mentioned to Parav in the
> discussion
> on the revert).

will switch the discussion to that thread.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [net-next 10/14] net/mlx5: Let user configure io_eq_size param
  2021-10-26 17:16       ` Jakub Kicinski
  2021-10-26 18:01         ` Saeed Mahameed
@ 2021-10-27  6:16         ` Gal Pressman
  1 sibling, 0 replies; 21+ messages in thread
From: Gal Pressman @ 2021-10-27  6:16 UTC (permalink / raw)
  To: Jakub Kicinski, Saeed Mahameed
  Cc: Jiri Pirko, Moshe Shemesh, Shay Drory, Parav Pandit, davem, netdev

On 26/10/2021 20:16, Jakub Kicinski wrote:
> On Tue, 26 Oct 2021 15:54:28 +0000 Saeed Mahameed wrote:
>> On Tue, 2021-10-26 at 08:05 -0700, Jakub Kicinski wrote:
>>> On Mon, 25 Oct 2021 13:54:27 -0700 Saeed Mahameed wrote:  
>>>> From: Shay Drory <shayd@nvidia.com>
>>>>
>>>> Currently, each I/O EQ is taking 128KB of memory. This size
>>>> is not needed in all use cases, and is critical with large scale.
>>>> Hence, allow user to configure the size of I/O EQs.
>>>>
>>>> For example, to reduce I/O EQ size to 64, execute:
>>>> $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
>>>> $ devlink dev reload pci/0000:00:0b.0  
>>> This sort of config is needed by more drivers,
>>> we need a standard way of configuring this.
>> We had a debate internally about the same thing, Jiri and I thought
>> that EQ might be a ConnectX only thing (maybe some other vendors have
>> it) but it is not really popular
> I thought it's a RDMA thing. At least according to grep there's 
> a handful of non-MLX drivers which have eqs. Are these not actual
> event queues? (huawei/hinic, ibm/ehea, microsoft/mana, qlogic/qed)


These are indeed event queues in RDMA, but it's more of an
implementation detail in each driver, there's no EQ object definition in
the IB spec AFAIK.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-10-27  6:16 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-25 20:54 [pull request][net-next 00/14] mlx5 updates 2021-10-25 Saeed Mahameed
2021-10-25 20:54 ` [net-next 01/14] net/mlx5e: don't write directly to netdev->dev_addr Saeed Mahameed
2021-10-26 12:30   ` patchwork-bot+netdevbpf
2021-10-25 20:54 ` [net-next 02/14] net/mlx5: Remove unnecessary checks for slow path flag Saeed Mahameed
2021-10-25 20:54 ` [net-next 03/14] net/mlx5: Fix unused function warning of mlx5i_flow_type_mask Saeed Mahameed
2021-10-25 20:54 ` [net-next 04/14] net/mlx5: Reduce flow counters bulk query buffer size for SFs Saeed Mahameed
2021-10-25 20:54 ` [net-next 05/14] net/mlx5: Extend health buffer dump Saeed Mahameed
2021-10-25 20:54 ` [net-next 06/14] net/mlx5: Print health buffer by log level Saeed Mahameed
2021-10-25 20:54 ` [net-next 07/14] net/mlx5: Add periodic update of host time to firmware Saeed Mahameed
2021-10-25 20:54 ` [net-next 08/14] net/mlx5: Bridge, extract code to lookup and del/notify entry Saeed Mahameed
2021-10-25 20:54 ` [net-next 09/14] net/mlx5: Bridge, support replacing existing FDB entry Saeed Mahameed
2021-10-25 20:54 ` [net-next 10/14] net/mlx5: Let user configure io_eq_size param Saeed Mahameed
2021-10-26 15:05   ` Jakub Kicinski
2021-10-26 15:54     ` Saeed Mahameed
2021-10-26 17:16       ` Jakub Kicinski
2021-10-26 18:01         ` Saeed Mahameed
2021-10-27  6:16         ` Gal Pressman
2021-10-25 20:54 ` [net-next 11/14] net/mlx5: Let user configure event_eq_size param Saeed Mahameed
2021-10-25 20:54 ` [net-next 12/14] net/mlx5: Let user configure max_macs param Saeed Mahameed
2021-10-25 20:54 ` [net-next 13/14] net/mlx5: SF, Add SF trace points Saeed Mahameed
2021-10-25 20:54 ` [net-next 14/14] net/mlx5: SF_DEV Add SF device " Saeed Mahameed

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.