All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-wired-lan] [PATCH net-next v2 0/2] ice: Support 5 layer tx scheduler topology
@ 2022-06-24 10:21 Michal Wilczynski
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology Michal Wilczynski
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology Michal Wilczynski
  0 siblings, 2 replies; 13+ messages in thread
From: Michal Wilczynski @ 2022-06-24 10:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Michal Wilczynski


For performance reasons there is a need to have support for selectable
tx scheduler topology. Currently firmware supports only the default
9-layer and 5-layer topology. This patch series enables switch from
default to 5-layer topology, if user decides to opt-in.

Michal Wilczynski (1):
  ice: Enable switching default tx scheduler topology
v2: 
- Moved definitions of scheduling layers to other commit

Raj Victor (1):
  ice: Code added to support 5 layer topology
v2:
- Added example of performance decrease in commit message
- Reworded commit message for imperative mood
- Removed unnecessary tags
- Refactored duplicated function call
- Fixed RCT
- Fixed unnecessary call to devm_kfree
- Defined constants


 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  23 ++
 drivers/net/ethernet/intel/ice/ice_common.c   |   5 +
 .../net/ethernet/intel/ice/ice_flex_pipe.c    | 203 ++++++++++++++++++
 .../net/ethernet/intel/ice/ice_flex_type.h    |  17 +-
 drivers/net/ethernet/intel/ice/ice_main.c     | 113 ++++++++--
 drivers/net/ethernet/intel/ice/ice_sched.c    |  35 +--
 drivers/net/ethernet/intel/ice/ice_sched.h    |   3 +
 drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
 8 files changed, 362 insertions(+), 38 deletions(-)

-- 
2.27.0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology
  2022-06-24 10:21 [Intel-wired-lan] [PATCH net-next v2 0/2] ice: Support 5 layer tx scheduler topology Michal Wilczynski
@ 2022-06-24 10:21 ` Michal Wilczynski
  2022-06-28 22:21   ` Tony Nguyen
  2022-06-29 10:49   ` Paul Menzel
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology Michal Wilczynski
  1 sibling, 2 replies; 13+ messages in thread
From: Michal Wilczynski @ 2022-06-24 10:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Raj Victor, Michal Wilczynski

From: Raj Victor <victor.raj@intel.com>

There was a performance issue reported when the number of VSIs were
not multiple of 8. This was caused due to the max children limitation
per node(8) in 9 layer topology. The BW credits are shared evenly
among the children by default. Assume one node has 8 children and
the other has 1. The parent of these nodes share the BW credit equally
among them. Apparently this casued a problem for the first node which
has 8 children. The 9th VM got more BW credits than the first 8 VMs.

Example:
1) With 8 VM's:
tx_queue_0_packets: 23283027
tx_queue_1_packets: 23292289
tx_queue_2_packets: 23276136
tx_queue_3_packets: 23279828
tx_queue_4_packets: 23279828
tx_queue_5_packets: 23279333
tx_queue_6_packets: 23277745
tx_queue_7_packets: 23279950
tx_queue_8_packets: 0

2) With 9 VM's:
tx_queue_0_packets: 24163396
tx_queue_1_packets: 24164623
tx_queue_2_packets: 24163188
tx_queue_3_packets: 24163701
tx_queue_4_packets: 24163683
tx_queue_5_packets: 24164668
tx_queue_6_packets: 23327200
tx_queue_7_packets: 24163853
tx_queue_8_packets: 91101417

So on average queue 8 statistics show that 3.7 times
more packets were send there than from the other queues.

The FW has increased the max number of children per node by reducing
the number of layers from 9 to 5. Reflect this on driver side.

Signed-off-by: Raj Victor <victor.raj@intel.com>
Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  23 ++
 drivers/net/ethernet/intel/ice/ice_common.c   |   3 +
 .../net/ethernet/intel/ice/ice_flex_pipe.c    | 202 ++++++++++++++++++
 .../net/ethernet/intel/ice/ice_flex_type.h    |  17 +-
 drivers/net/ethernet/intel/ice/ice_sched.h    |   3 +
 drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
 6 files changed, 247 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 05cb9dd7035a..2d1da00b4926 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -120,6 +120,7 @@ struct ice_aqc_list_caps_elem {
 #define ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE		0x0076
 #define ICE_AQC_CAPS_POST_UPDATE_RESET_RESTRICT		0x0077
 #define ICE_AQC_CAPS_NVM_MGMT				0x0080
+#define ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE		0x0085
 
 	u8 major_ver;
 	u8 minor_ver;
@@ -798,6 +799,24 @@ struct ice_aqc_get_topo {
 	__le32 addr_low;
 };
 
+/* Get/Set Tx Topology (indirect 0x0418/0x0417) */
+struct ice_aqc_get_set_tx_topo {
+	u8 set_flags;
+#define ICE_AQC_TX_TOPO_FLAGS_CORRER		BIT(0)
+#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM		BIT(1)
+#define ICE_AQC_TX_TOPO_FLAGS_SET_PSM		BIT(2)
+#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW		BIT(4)
+#define ICE_AQC_TX_TOPO_FLAGS_ISSUED		BIT(5)
+	u8 get_flags;
+#define ICE_AQC_TX_TOPO_GET_NO_UPDATE		0
+#define ICE_AQC_TX_TOPO_GET_PSM			1
+#define ICE_AQC_TX_TOPO_GET_RAM			2
+	__le16 reserved1;
+	__le32 reserved2;
+	__le32 addr_high;
+	__le32 addr_low;
+};
+
 /* Update TSE (indirect 0x0403)
  * Get TSE (indirect 0x0404)
  * Add TSE (indirect 0x0401)
@@ -2126,6 +2145,7 @@ struct ice_aq_desc {
 		struct ice_aqc_get_link_topo get_link_topo;
 		struct ice_aqc_i2c read_i2c;
 		struct ice_aqc_read_i2c_resp read_i2c_resp;
+		struct ice_aqc_get_set_tx_topo get_set_tx_topo;
 	} params;
 };
 
@@ -2231,6 +2251,9 @@ enum ice_adminq_opc {
 	ice_aqc_opc_query_sched_res			= 0x0412,
 	ice_aqc_opc_remove_rl_profiles			= 0x0415,
 
+	ice_aqc_opc_set_tx_topo				= 0x0417,
+	ice_aqc_opc_get_tx_topo				= 0x0418,
+
 	/* PHY commands */
 	ice_aqc_opc_get_phy_caps			= 0x0600,
 	ice_aqc_opc_set_phy_cfg				= 0x0601,
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 9619bdb9e49a..8b65e2bfb160 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -2091,6 +2091,9 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps,
 			  "%s: reset_restrict_support = %d\n", prefix,
 			  caps->reset_restrict_support);
 		break;
+	case ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE:
+		caps->tx_sched_topo_comp_mode_en = (number == 1);
+		break;
 	default:
 		/* Not one of the recognized common capabilities */
 		found = false;
diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
index c73cdab44f70..b3bd345ea0f3 100644
--- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
+++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
@@ -4,6 +4,7 @@
 #include "ice_common.h"
 #include "ice_flex_pipe.h"
 #include "ice_flow.h"
+#include "ice_sched.h"
 #include "ice.h"
 
 /* For supporting double VLAN mode, it is necessary to enable or disable certain
@@ -1783,6 +1784,207 @@ bool ice_is_init_pkg_successful(enum ice_ddp_state state)
 	}
 }
 
+/**
+ * ice_get_set_tx_topo - get or set tx topology
+ * @hw: pointer to the HW struct
+ * @buf: pointer to tx topology buffer
+ * @buf_size: buffer size
+ * @cd: pointer to command details structure or NULL
+ * @flags: pointer to descriptor flags
+ * @set: 0-get, 1-set topology
+ *
+ * The function will get or set tx topology
+ */
+static int
+ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size,
+		    struct ice_sq_cd *cd, u8 *flags, bool set)
+{
+	struct ice_aqc_get_set_tx_topo *cmd;
+	struct ice_aq_desc desc;
+	int status;
+
+	cmd = &desc.params.get_set_tx_topo;
+	if (set) {
+		cmd->set_flags = ICE_AQC_TX_TOPO_FLAGS_ISSUED;
+		/* requested to update a new topology, not a default topology */
+		if (buf)
+			cmd->set_flags |= ICE_AQC_TX_TOPO_FLAGS_SRC_RAM |
+					  ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW;
+	} else {
+		cmd->get_flags = ICE_AQC_TX_TOPO_GET_RAM;
+	}
+	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_tx_topo);
+	desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD);
+	status = ice_aq_send_cmd(hw, &desc, buf, buf_size, cd);
+	if (status)
+		return status;
+	/* read the return flag values (first byte) for get operation */
+	if (!set && flags)
+		*flags = desc.params.get_set_tx_topo.set_flags;
+
+	return 0;
+}
+
+/**
+ * ice_cfg_tx_topo - Initialize new tx topology if available
+ * @hw: pointer to the HW struct
+ * @buf: pointer to Tx topology buffer
+ * @len: buffer size
+ *
+ * The function will apply the new Tx topology from the package buffer
+ * if available.
+ */
+int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
+{
+	u8 *current_topo, *new_topo = NULL;
+	struct ice_run_time_cfg_seg *seg;
+	struct ice_buf_hdr *section;
+	struct ice_pkg_hdr *pkg_hdr;
+	enum ice_ddp_state state;
+	u16 i, size = 0, offset;
+	u32 reg = 0;
+	int status;
+	u8 flags;
+
+	if (!buf || !len)
+		return -EINVAL;
+
+	/* Does FW support new Tx topology mode ? */
+	if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) {
+		ice_debug(hw, ICE_DBG_INIT, "FW doesn't support compatibility mode\n");
+		return -EOPNOTSUPP;
+	}
+
+	current_topo = kzalloc(ICE_AQ_MAX_BUF_LEN, GFP_KERNEL);
+
+	if (!current_topo)
+		return -ENOMEM;
+
+	/* get the current Tx topology */
+	status = ice_get_set_tx_topo(hw, current_topo, ICE_AQ_MAX_BUF_LEN, NULL,
+				     &flags, false);
+
+	kfree(current_topo);
+
+	if (status) {
+		ice_debug(hw, ICE_DBG_INIT, "Get current topology is failed\n");
+		return status;
+	}
+
+	/* Is default topology already applied ? */
+	if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
+	    hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) {
+		ice_debug(hw, ICE_DBG_INIT, "Loaded default topology\n");
+		/* Already default topology is loaded */
+		return -EEXIST;
+	}
+
+	/* Is new topology already applied ? */
+	if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
+	    hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
+		ice_debug(hw, ICE_DBG_INIT, "Loaded new topology\n");
+		/* Already new topology is loaded */
+		return -EEXIST;
+	}
+
+	/* Is set topology issued already ? */
+	if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) {
+		ice_debug(hw, ICE_DBG_INIT, "Update tx topology was done by another PF\n");
+		/* add a small delay before exiting */
+		for (i = 0; i < 20; i++)
+			msleep(100);
+		return -EEXIST;
+	}
+
+	/* Change the topology from new to default (5 to 9) */
+	if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
+	    hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
+		ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 layers\n");
+		goto update_topo;
+	}
+
+	pkg_hdr = (struct ice_pkg_hdr *)buf;
+	state = ice_verify_pkg(pkg_hdr, len);
+	if (state) {
+		ice_debug(hw, ICE_DBG_INIT, "failed to verify pkg (err: %d)\n",
+			  state);
+		return -EIO;
+	}
+
+	/* find run time configuration segment */
+	seg = (struct ice_run_time_cfg_seg *)
+		ice_find_seg_in_pkg(hw, SEGMENT_TYPE_ICE_RUN_TIME_CFG, pkg_hdr);
+	if (!seg) {
+		ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment is missing\n");
+		return -EIO;
+	}
+
+	if (le32_to_cpu(seg->buf_table.buf_count) < ICE_MIN_S_COUNT) {
+		ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment count(%d) is wrong\n",
+			  seg->buf_table.buf_count);
+		return -EIO;
+	}
+
+	section = ice_pkg_val_buf(seg->buf_table.buf_array);
+
+	if (!section || le32_to_cpu(section->section_entry[0].type) !=
+		ICE_SID_TX_5_LAYER_TOPO) {
+		ice_debug(hw, ICE_DBG_INIT, "5 layer topology section type is wrong\n");
+		return -EIO;
+	}
+
+	size = le16_to_cpu(section->section_entry[0].size);
+	offset = le16_to_cpu(section->section_entry[0].offset);
+	if (size < ICE_MIN_S_SZ || size > ICE_MAX_S_SZ) {
+		ice_debug(hw, ICE_DBG_INIT, "5 layer topology section size is wrong\n");
+		return -EIO;
+	}
+
+	/* make sure the section fits in the buffer */
+	if (offset + size > ICE_PKG_BUF_SIZE) {
+		ice_debug(hw, ICE_DBG_INIT, "5 layer topology buffer > 4K\n");
+		return -EIO;
+	}
+
+	/* Get the new topology buffer */
+	new_topo = ((u8 *)section) + offset;
+
+update_topo:
+	/* acquire global lock to make sure that set topology issued
+	 * by one PF
+	 */
+	status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
+	if (status) {
+		ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
+		return status;
+	}
+
+	/* check reset was triggered already or not */
+	reg = rd32(hw, GLGEN_RSTAT);
+	if (reg & GLGEN_RSTAT_DEVSTATE_M) {
+		/* Reset is in progress, re-init the hw again */
+		ice_debug(hw, ICE_DBG_INIT, "Reset is in progress. layer topology might be applied already\n");
+		ice_check_reset(hw);
+		return 0;
+	}
+
+	/* set new topology */
+	status = ice_get_set_tx_topo(hw, new_topo, size, NULL, NULL, true);
+	if (status) {
+		ice_debug(hw, ICE_DBG_INIT, "Set tx topology is failed\n");
+		return status;
+	}
+
+	/* new topology is updated, delay 1 second before issuing the CORRER */
+	for (i = 0; i < 10; i++)
+		msleep(100);
+	ice_reset(hw, ICE_RESET_CORER);
+	/* CORER will clear the global lock, so no explicit call
+	 * required for release
+	 */
+	return 0;
+}
+
 /**
  * ice_pkg_buf_alloc
  * @hw: pointer to the HW structure
diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h b/drivers/net/ethernet/intel/ice/ice_flex_type.h
index 974d14a83b2e..ebbb5a1db8c7 100644
--- a/drivers/net/ethernet/intel/ice/ice_flex_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h
@@ -29,8 +29,9 @@ struct ice_pkg_hdr {
 
 /* generic segment */
 struct ice_generic_seg_hdr {
-#define SEGMENT_TYPE_METADATA	0x00000001
-#define SEGMENT_TYPE_ICE	0x00000010
+#define SEGMENT_TYPE_METADATA	      0x00000001
+#define SEGMENT_TYPE_ICE	      0x00000010
+#define SEGMENT_TYPE_ICE_RUN_TIME_CFG 0x00000020
 	__le32 seg_type;
 	struct ice_pkg_ver seg_format_ver;
 	__le32 seg_size;
@@ -73,6 +74,12 @@ struct ice_buf_table {
 	struct ice_buf buf_array[];
 };
 
+struct ice_run_time_cfg_seg {
+	struct ice_generic_seg_hdr hdr;
+	u8 rsvd[8];
+	struct ice_buf_table buf_table;
+};
+
 /* global metadata specific segment */
 struct ice_global_metadata_seg {
 	struct ice_generic_seg_hdr hdr;
@@ -181,6 +188,9 @@ struct ice_meta_sect {
 /* The following define MUST be updated to reflect the last label section ID */
 #define ICE_SID_LBL_LAST		0x80000038
 
+/* Label ICE runtime configuration section IDs */
+#define ICE_SID_TX_5_LAYER_TOPO		0x10
+
 enum ice_block {
 	ICE_BLK_SW = 0,
 	ICE_BLK_ACL,
@@ -706,4 +716,7 @@ struct ice_meta_init_section {
 	__le16 offset;
 	struct ice_meta_init_entry entry;
 };
+
+int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len);
+
 #endif /* _ICE_FLEX_TYPE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h b/drivers/net/ethernet/intel/ice/ice_sched.h
index 4f91577fed56..86dc0f1f4255 100644
--- a/drivers/net/ethernet/intel/ice/ice_sched.h
+++ b/drivers/net/ethernet/intel/ice/ice_sched.h
@@ -6,6 +6,9 @@
 
 #include "ice_common.h"
 
+#define ICE_SCHED_5_LAYERS	5
+#define ICE_SCHED_9_LAYERS	9
+
 #define ICE_QGRP_LAYER_OFFSET	2
 #define ICE_VSI_LAYER_OFFSET	4
 #define ICE_AGG_LAYER_OFFSET	6
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index f2a518a1fd94..ff793fe2a2e7 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -290,6 +290,7 @@ struct ice_hw_common_caps {
 	bool pcie_reset_avoidance;
 	/* Post update reset restriction */
 	bool reset_restrict_support;
+	bool tx_sched_topo_comp_mode_en;
 };
 
 /* IEEE 1588 TIME_SYNC specific info */
-- 
2.27.0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology
  2022-06-24 10:21 [Intel-wired-lan] [PATCH net-next v2 0/2] ice: Support 5 layer tx scheduler topology Michal Wilczynski
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology Michal Wilczynski
@ 2022-06-24 10:21 ` Michal Wilczynski
  2022-06-25  7:10   ` Paul Menzel
  2022-06-28 22:27   ` Tony Nguyen
  1 sibling, 2 replies; 13+ messages in thread
From: Michal Wilczynski @ 2022-06-24 10:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Michal Wilczynski

Introduce support for tx scheduler topology change, based on
user selection, from default 9-layer to 5-layer.
In order for switch to be successful there is a new NVM
and DDP package required.
This commit enables 5-layer topology in init path of
the driver, so before ice driver load, the user selection
should be changed in NVM using some external tools.

Title: Enable switching default tx scheduler topology
Change-type: ImplementationChange
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c   |   2 +
 .../net/ethernet/intel/ice/ice_flex_pipe.c    |   3 +-
 drivers/net/ethernet/intel/ice/ice_main.c     | 113 +++++++++++++++---
 drivers/net/ethernet/intel/ice/ice_sched.c    |  35 +++---
 4 files changed, 116 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index 8b65e2bfb160..167f9d5c345a 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1535,6 +1535,8 @@ ice_aq_send_cmd(struct ice_hw *hw, struct ice_aq_desc *desc, void *buf,
 	case ice_aqc_opc_set_port_params:
 	case ice_aqc_opc_get_vlan_mode_parameters:
 	case ice_aqc_opc_set_vlan_mode_parameters:
+	case ice_aqc_opc_set_tx_topo:
+	case ice_aqc_opc_get_tx_topo:
 	case ice_aqc_opc_add_recipe:
 	case ice_aqc_opc_recipe_to_profile:
 	case ice_aqc_opc_get_recipe:
diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
index b3bd345ea0f3..c2359366b48e 100644
--- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
+++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
@@ -1953,7 +1953,8 @@ int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
 	/* acquire global lock to make sure that set topology issued
 	 * by one PF
 	 */
-	status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
+	status = ice_acquire_res(hw, ICE_GLOBAL_CFG_LOCK_RES_ID, ICE_RES_WRITE,
+				 ICE_GLOBAL_CFG_LOCK_TIMEOUT);
 	if (status) {
 		ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
 		return status;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c1ac2f746714..5f827bea05d9 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -4453,11 +4453,11 @@ static char *ice_get_opt_fw_name(struct ice_pf *pf)
 /**
  * ice_request_fw - Device initialization routine
  * @pf: pointer to the PF instance
+ * @firmware: double pointer to firmware struct
  */
-static void ice_request_fw(struct ice_pf *pf)
+static int ice_request_fw(struct ice_pf *pf, const struct firmware **firmware)
 {
 	char *opt_fw_filename = ice_get_opt_fw_name(pf);
-	const struct firmware *firmware = NULL;
 	struct device *dev = ice_pf_to_dev(pf);
 	int err = 0;
 
@@ -4466,29 +4466,98 @@ static void ice_request_fw(struct ice_pf *pf)
 	 * and warning messages for other errors.
 	 */
 	if (opt_fw_filename) {
-		err = firmware_request_nowarn(&firmware, opt_fw_filename, dev);
-		if (err) {
-			kfree(opt_fw_filename);
-			goto dflt_pkg_load;
-		}
-
-		/* request for firmware was successful. Download to device */
-		ice_load_pkg(firmware, pf);
+		err = firmware_request_nowarn(firmware, opt_fw_filename, dev);
 		kfree(opt_fw_filename);
-		release_firmware(firmware);
-		return;
+		if (!err)
+			return err;
 	}
 
-dflt_pkg_load:
-	err = request_firmware(&firmware, ICE_DDP_PKG_FILE, dev);
-	if (err) {
+	err = request_firmware(firmware, ICE_DDP_PKG_FILE, dev);
+	if (err)
 		dev_err(dev, "The DDP package file was not found or could not be read. Entering Safe Mode\n");
-		return;
+
+	return err;
+}
+
+/**
+ * ice_init_tx_topology - performs Tx topology initialization
+ * @hw: pointer to the hardware structure
+ * @firmware: pointer to firmware structure
+ */
+static int ice_init_tx_topology(struct ice_hw *hw,
+				const struct firmware *firmware)
+{
+	u8 num_tx_sched_layers = hw->num_tx_sched_layers;
+	struct ice_pf *pf = hw->back;
+	struct device *dev;
+	u8 *buf_copy;
+	int err = 0;
+
+	dev = ice_pf_to_dev(pf);
+	/* ice_cfg_tx_topo buf argument is not a constant,
+	 * so we have to make a copy
+	 */
+	buf_copy = devm_kmemdup(ice_hw_to_dev(hw), firmware->data,
+				firmware->size, GFP_KERNEL);
+
+	err = ice_cfg_tx_topo(hw, buf_copy, firmware->size);
+	if (!err) {
+		if (hw->num_tx_sched_layers > num_tx_sched_layers)
+			dev_info(dev, "Transmit balancing feature disabled\n");
+		else
+			dev_info(dev, "Transmit balancing feature enabled\n");
+		/* if there was a change in topology ice_cfg_tx_topo triggered
+		 * a CORER and we need to re-init hw
+		 */
+		ice_deinit_hw(hw);
+		err = ice_init_hw(hw);
+
+		/* in this case we're not allowing safe mode */
+		devm_kfree(ice_hw_to_dev(hw), buf_copy);
+
+		return err;
+
+	} else if (err == -EIO) {
+		dev_info(dev, "DDP package does not support transmit balancing feature - please update to the latest DDP package and try again\n");
 	}
 
-	/* request for firmware was successful. Download to device */
+	devm_kfree(ice_hw_to_dev(hw), buf_copy);
+
+	return 0;
+}
+
+/**
+ * ice_init_ddp_config - DDP related configuration
+ * @hw: pointer to the hardware structure
+ * @pf: pointer to pf structure
+ *
+ * This function loads DDP file from the disk, then initializes tx
+ * topology. At the end DDP package is loaded on the card.
+ */
+static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf)
+{
+	struct device *dev = ice_pf_to_dev(pf);
+	const struct firmware *firmware = NULL;
+	int err = 0;
+
+	err = ice_request_fw(pf, &firmware);
+	if (err)
+	/* we can still operate in safe mode if DDP package load fails */
+		return 0;
+
+	err = ice_init_tx_topology(hw, firmware);
+	if (err) {
+		dev_err(dev, "ice_init_hw during change of tx topology failed: %d\n",
+			err);
+		release_firmware(firmware);
+		return err;
+	}
+
+	/* Download firmware to device */
 	ice_load_pkg(firmware, pf);
 	release_firmware(firmware);
+
+	return 0;
 }
 
 /**
@@ -4641,9 +4710,15 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
 
 	ice_init_feature_support(pf);
 
-	ice_request_fw(pf);
+	err = ice_init_ddp_config(hw, pf);
+
+	/* during topology change ice_init_hw may fail */
+	if (err) {
+		err = -EIO;
+		goto err_exit_unroll;
+	}
 
-	/* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
+	/* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit won't be
 	 * set in pf->state, which will cause ice_is_safe_mode to return
 	 * true
 	 */
diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c
index 7947223536e3..f18a7121ca55 100644
--- a/drivers/net/ethernet/intel/ice/ice_sched.c
+++ b/drivers/net/ethernet/intel/ice/ice_sched.c
@@ -1102,12 +1102,11 @@ static u8 ice_sched_get_vsi_layer(struct ice_hw *hw)
 	 *     5 or less       sw_entry_point_layer
 	 */
 	/* calculate the VSI layer based on number of layers. */
-	if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) {
-		u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
-
-		if (layer > hw->sw_entry_point_layer)
-			return layer;
-	}
+	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
+		return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
+	else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
+		/* qgroup and VSI layers are same */
+		return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
 	return hw->sw_entry_point_layer;
 }
 
@@ -1124,12 +1123,8 @@ static u8 ice_sched_get_agg_layer(struct ice_hw *hw)
 	 *     7 or less       sw_entry_point_layer
 	 */
 	/* calculate the aggregator layer based on number of layers. */
-	if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) {
-		u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
-
-		if (layer > hw->sw_entry_point_layer)
-			return layer;
-	}
+	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
+		return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
 	return hw->sw_entry_point_layer;
 }
 
@@ -1485,10 +1480,11 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
 {
 	struct ice_sched_node *vsi_node, *qgrp_node;
 	struct ice_vsi_ctx *vsi_ctx;
+	u8 qgrp_layer, vsi_layer;
 	u16 max_children;
-	u8 qgrp_layer;
 
 	qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
+	vsi_layer = ice_sched_get_vsi_layer(pi->hw);
 	max_children = pi->hw->max_children[qgrp_layer];
 
 	vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
@@ -1499,6 +1495,12 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
 	if (!vsi_node)
 		return NULL;
 
+	/* If the queue group and vsi layer are same then queues
+	 * are all attached directly to VSI
+	 */
+	if (qgrp_layer == vsi_layer)
+		return vsi_node;
+
 	/* get the first queue group node from VSI sub-tree */
 	qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer);
 	while (qgrp_node) {
@@ -3178,8 +3180,9 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
 	u8 profile_type;
 	int status;
 
-	if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
+	if (!pi || layer_num >= pi->hw->num_tx_sched_layers)
 		return NULL;
+
 	switch (rl_type) {
 	case ICE_MIN_BW:
 		profile_type = ICE_AQC_RL_PROFILE_TYPE_CIR;
@@ -3194,8 +3197,6 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
 		return NULL;
 	}
 
-	if (!pi)
-		return NULL;
 	hw = pi->hw;
 	list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
 			    list_entry)
@@ -3425,7 +3426,7 @@ ice_sched_rm_rl_profile(struct ice_port_info *pi, u8 layer_num, u8 profile_type,
 	struct ice_aqc_rl_profile_info *rl_prof_elem;
 	int status = 0;
 
-	if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
+	if (layer_num >= pi->hw->num_tx_sched_layers)
 		return -EINVAL;
 	/* Check the existing list for RL profile */
 	list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
-- 
2.27.0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology Michal Wilczynski
@ 2022-06-25  7:10   ` Paul Menzel
  2022-06-29  9:54     ` Wilczynski, Michal
  2022-06-28 22:27   ` Tony Nguyen
  1 sibling, 1 reply; 13+ messages in thread
From: Paul Menzel @ 2022-06-25  7:10 UTC (permalink / raw)
  To: Michal Wilczynski; +Cc: intel-wired-lan

Dear Michal,


Thank you for your patch.

Am 24.06.22 um 12:21 schrieb Michal Wilczynski:
> Introduce support for tx scheduler topology change, based on
> user selection, from default 9-layer to 5-layer.
> In order for switch to be successful there is a new NVM
> and DDP package required.

What versions are needed exactly?

> This commit enables 5-layer topology in init path of
> the driver, so before ice driver load, the user selection
> should be changed in NVM using some external tools.

Please add one example, how to change it.

Could you please reflow for 75 characters per line, and do not break 
lines just because a sentence ends, or add a blank line between paragraphs?

> Title: Enable switching default tx scheduler topology
> Change-type: ImplementationChange

These two tags are not used upstream I thinks.

> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
> ---
>   drivers/net/ethernet/intel/ice/ice_common.c   |   2 +
>   .../net/ethernet/intel/ice/ice_flex_pipe.c    |   3 +-
>   drivers/net/ethernet/intel/ice/ice_main.c     | 113 +++++++++++++++---
>   drivers/net/ethernet/intel/ice/ice_sched.c    |  35 +++---
>   4 files changed, 116 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> index 8b65e2bfb160..167f9d5c345a 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -1535,6 +1535,8 @@ ice_aq_send_cmd(struct ice_hw *hw, struct ice_aq_desc *desc, void *buf,
>   	case ice_aqc_opc_set_port_params:
>   	case ice_aqc_opc_get_vlan_mode_parameters:
>   	case ice_aqc_opc_set_vlan_mode_parameters:
> +	case ice_aqc_opc_set_tx_topo:
> +	case ice_aqc_opc_get_tx_topo:
>   	case ice_aqc_opc_add_recipe:
>   	case ice_aqc_opc_recipe_to_profile:
>   	case ice_aqc_opc_get_recipe:
> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> index b3bd345ea0f3..c2359366b48e 100644
> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> @@ -1953,7 +1953,8 @@ int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
>   	/* acquire global lock to make sure that set topology issued
>   	 * by one PF
>   	 */
> -	status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
> +	status = ice_acquire_res(hw, ICE_GLOBAL_CFG_LOCK_RES_ID, ICE_RES_WRITE,
> +				 ICE_GLOBAL_CFG_LOCK_TIMEOUT);
>   	if (status) {
>   		ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
>   		return status;
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index c1ac2f746714..5f827bea05d9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -4453,11 +4453,11 @@ static char *ice_get_opt_fw_name(struct ice_pf *pf)
>   /**
>    * ice_request_fw - Device initialization routine
>    * @pf: pointer to the PF instance
> + * @firmware: double pointer to firmware struct
>    */
> -static void ice_request_fw(struct ice_pf *pf)
> +static int ice_request_fw(struct ice_pf *pf, const struct firmware **firmware)
>   {
>   	char *opt_fw_filename = ice_get_opt_fw_name(pf);
> -	const struct firmware *firmware = NULL;
>   	struct device *dev = ice_pf_to_dev(pf);
>   	int err = 0;
>   
> @@ -4466,29 +4466,98 @@ static void ice_request_fw(struct ice_pf *pf)
>   	 * and warning messages for other errors.
>   	 */
>   	if (opt_fw_filename) {
> -		err = firmware_request_nowarn(&firmware, opt_fw_filename, dev);
> -		if (err) {
> -			kfree(opt_fw_filename);
> -			goto dflt_pkg_load;
> -		}
> -
> -		/* request for firmware was successful. Download to device */
> -		ice_load_pkg(firmware, pf);
> +		err = firmware_request_nowarn(firmware, opt_fw_filename, dev);
>   		kfree(opt_fw_filename);
> -		release_firmware(firmware);
> -		return;
> +		if (!err)
> +			return err;

Why is the code above removed?

>   	}
>   
> -dflt_pkg_load:
> -	err = request_firmware(&firmware, ICE_DDP_PKG_FILE, dev);
> -	if (err) {
> +	err = request_firmware(firmware, ICE_DDP_PKG_FILE, dev);
> +	if (err)
>   		dev_err(dev, "The DDP package file was not found or could not be read. Entering Safe Mode\n");
> -		return;
> +
> +	return err;
> +}
> +
> +/**
> + * ice_init_tx_topology - performs Tx topology initialization
> + * @hw: pointer to the hardware structure
> + * @firmware: pointer to firmware structure
> + */
> +static int ice_init_tx_topology(struct ice_hw *hw,
> +				const struct firmware *firmware)
> +{
> +	u8 num_tx_sched_layers = hw->num_tx_sched_layers;
> +	struct ice_pf *pf = hw->back;
> +	struct device *dev;
> +	u8 *buf_copy;
> +	int err = 0;
> +
> +	dev = ice_pf_to_dev(pf);
> +	/* ice_cfg_tx_topo buf argument is not a constant,
> +	 * so we have to make a copy
> +	 */
> +	buf_copy = devm_kmemdup(ice_hw_to_dev(hw), firmware->data,
> +				firmware->size, GFP_KERNEL);
> +
> +	err = ice_cfg_tx_topo(hw, buf_copy, firmware->size);
> +	if (!err) {
> +		if (hw->num_tx_sched_layers > num_tx_sched_layers)
> +			dev_info(dev, "Transmit balancing feature disabled\n");

Should this be an error?

> +		else
> +			dev_info(dev, "Transmit balancing feature enabled\n");

Add a blank line below?

> +		/* if there was a change in topology ice_cfg_tx_topo triggered
> +		 * a CORER and we need to re-init hw

What is CORER? Add a dot/period at the end?

> +		 */
> +		ice_deinit_hw(hw);
> +		err = ice_init_hw(hw);
> +
> +		/* in this case we're not allowing safe mode */
> +		devm_kfree(ice_hw_to_dev(hw), buf_copy);
> +
> +		return err;
> +
> +	} else if (err == -EIO) {
> +		dev_info(dev, "DDP package does not support transmit balancing feature - please update to the latest DDP package and try again\n");

Please mention a version, where it’s supposed to work? Can the DDP 
version be printed out too?

Should this be a warning?

>   	}
>   
> -	/* request for firmware was successful. Download to device */
> +	devm_kfree(ice_hw_to_dev(hw), buf_copy);
> +
> +	return 0;
> +}
> +
> +/**
> + * ice_init_ddp_config - DDP related configuration
> + * @hw: pointer to the hardware structure
> + * @pf: pointer to pf structure
> + *
> + * This function loads DDP file from the disk, then initializes tx
> + * topology. At the end DDP package is loaded on the card.
> + */
> +static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf)
> +{
> +	struct device *dev = ice_pf_to_dev(pf);
> +	const struct firmware *firmware = NULL;
> +	int err = 0;
> +
> +	err = ice_request_fw(pf, &firmware);
> +	if (err)
> +	/* we can still operate in safe mode if DDP package load fails */

Indent the comment?

> +		return 0;
> +
> +	err = ice_init_tx_topology(hw, firmware);
> +	if (err) {
> +		dev_err(dev, "ice_init_hw during change of tx topology failed: %d\n",
> +			err);
> +		release_firmware(firmware);
> +		return err;
> +	}
> +
> +	/* Download firmware to device */
>   	ice_load_pkg(firmware, pf);
>   	release_firmware(firmware);
> +
> +	return 0;
>   }
>   
>   /**
> @@ -4641,9 +4710,15 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>   
>   	ice_init_feature_support(pf);
>   
> -	ice_request_fw(pf);
> +	err = ice_init_ddp_config(hw, pf);
> +
> +	/* during topology change ice_init_hw may fail */
> +	if (err) {
> +		err = -EIO;
> +		goto err_exit_unroll;
> +	}
>   
> -	/* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
> +	/* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit won't be
>   	 * set in pf->state, which will cause ice_is_safe_mode to return
>   	 * true
>   	 */
> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c
> index 7947223536e3..f18a7121ca55 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sched.c
> +++ b/drivers/net/ethernet/intel/ice/ice_sched.c
> @@ -1102,12 +1102,11 @@ static u8 ice_sched_get_vsi_layer(struct ice_hw *hw)
>   	 *     5 or less       sw_entry_point_layer
>   	 */
>   	/* calculate the VSI layer based on number of layers. */
> -	if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) {
> -		u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
> -
> -		if (layer > hw->sw_entry_point_layer)
> -			return layer;
> -	}
> +	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
> +		return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
> +	else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
> +		/* qgroup and VSI layers are same */
> +		return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
>   	return hw->sw_entry_point_layer;
>   }
>   
> @@ -1124,12 +1123,8 @@ static u8 ice_sched_get_agg_layer(struct ice_hw *hw)
>   	 *     7 or less       sw_entry_point_layer
>   	 */
>   	/* calculate the aggregator layer based on number of layers. */
> -	if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) {
> -		u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
> -
> -		if (layer > hw->sw_entry_point_layer)
> -			return layer;
> -	}
> +	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
> +		return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>   	return hw->sw_entry_point_layer;
>   }
>   
> @@ -1485,10 +1480,11 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
>   {
>   	struct ice_sched_node *vsi_node, *qgrp_node;
>   	struct ice_vsi_ctx *vsi_ctx;
> +	u8 qgrp_layer, vsi_layer;
>   	u16 max_children;
> -	u8 qgrp_layer;
>   
>   	qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
> +	vsi_layer = ice_sched_get_vsi_layer(pi->hw);
>   	max_children = pi->hw->max_children[qgrp_layer];
>   
>   	vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
> @@ -1499,6 +1495,12 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
>   	if (!vsi_node)
>   		return NULL;
>   
> +	/* If the queue group and vsi layer are same then queues
> +	 * are all attached directly to VSI
> +	 */
> +	if (qgrp_layer == vsi_layer)
> +		return vsi_node;
> +
>   	/* get the first queue group node from VSI sub-tree */
>   	qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer);
>   	while (qgrp_node) {
> @@ -3178,8 +3180,9 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>   	u8 profile_type;
>   	int status;
>   
> -	if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
> +	if (!pi || layer_num >= pi->hw->num_tx_sched_layers)
>   		return NULL;
> +
>   	switch (rl_type) {
>   	case ICE_MIN_BW:
>   		profile_type = ICE_AQC_RL_PROFILE_TYPE_CIR;
> @@ -3194,8 +3197,6 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>   		return NULL;
>   	}
>   
> -	if (!pi)
> -		return NULL;

This hunk (and above) could be a separate commit?

>   	hw = pi->hw;
>   	list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>   			    list_entry)
> @@ -3425,7 +3426,7 @@ ice_sched_rm_rl_profile(struct ice_port_info *pi, u8 layer_num, u8 profile_type,
>   	struct ice_aqc_rl_profile_info *rl_prof_elem;
>   	int status = 0;
>   
> -	if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
> +	if (layer_num >= pi->hw->num_tx_sched_layers)
>   		return -EINVAL;
>   	/* Check the existing list for RL profile */
>   	list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],


Kind regards,

Paul
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology Michal Wilczynski
@ 2022-06-28 22:21   ` Tony Nguyen
  2022-06-29  8:54     ` Wilczynski, Michal
  2022-06-29 10:49   ` Paul Menzel
  1 sibling, 1 reply; 13+ messages in thread
From: Tony Nguyen @ 2022-06-28 22:21 UTC (permalink / raw)
  To: Michal Wilczynski, intel-wired-lan; +Cc: Raj Victor



On 6/24/2022 3:21 AM, Michal Wilczynski wrote:
> From: Raj Victor <victor.raj@intel.com>
> 
> There was a performance issue reported when the number of VSIs were
> not multiple of 8. This was caused due to the max children limitation
> per node(8) in 9 layer topology. The BW credits are shared evenly
> among the children by default. Assume one node has 8 children and
> the other has 1. The parent of these nodes share the BW credit equally
> among them. Apparently this casued a problem for the first node which
> has 8 children. The 9th VM got more BW credits than the first 8 VMs.
> 
> Example:
> 1) With 8 VM's:
> tx_queue_0_packets: 23283027
> tx_queue_1_packets: 23292289
> tx_queue_2_packets: 23276136
> tx_queue_3_packets: 23279828
> tx_queue_4_packets: 23279828
> tx_queue_5_packets: 23279333
> tx_queue_6_packets: 23277745
> tx_queue_7_packets: 23279950
> tx_queue_8_packets: 0
> 
> 2) With 9 VM's:
> tx_queue_0_packets: 24163396
> tx_queue_1_packets: 24164623
> tx_queue_2_packets: 24163188
> tx_queue_3_packets: 24163701
> tx_queue_4_packets: 24163683
> tx_queue_5_packets: 24164668
> tx_queue_6_packets: 23327200
> tx_queue_7_packets: 24163853
> tx_queue_8_packets: 91101417
> 
> So on average queue 8 statistics show that 3.7 times
> more packets were send there than from the other queues.
> 
> The FW has increased the max number of children per node by reducing
> the number of layers from 9 to 5. Reflect this on driver side.
> 
> Signed-off-by: Raj Victor <victor.raj@intel.com>
> Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>   .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  23 ++
>   drivers/net/ethernet/intel/ice/ice_common.c   |   3 +
>   .../net/ethernet/intel/ice/ice_flex_pipe.c    | 202 ++++++++++++++++++
>   .../net/ethernet/intel/ice/ice_flex_type.h    |  17 +-
>   drivers/net/ethernet/intel/ice/ice_sched.h    |   3 +
>   drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
>   6 files changed, 247 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> index 05cb9dd7035a..2d1da00b4926 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> @@ -120,6 +120,7 @@ struct ice_aqc_list_caps_elem {
>   #define ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE		0x0076
>   #define ICE_AQC_CAPS_POST_UPDATE_RESET_RESTRICT		0x0077
>   #define ICE_AQC_CAPS_NVM_MGMT				0x0080
> +#define ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE		0x0085
>   
>   	u8 major_ver;
>   	u8 minor_ver;
> @@ -798,6 +799,24 @@ struct ice_aqc_get_topo {
>   	__le32 addr_low;
>   };
>   
> +/* Get/Set Tx Topology (indirect 0x0418/0x0417) */
> +struct ice_aqc_get_set_tx_topo {
> +	u8 set_flags;
> +#define ICE_AQC_TX_TOPO_FLAGS_CORRER		BIT(0)
> +#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM		BIT(1)
> +#define ICE_AQC_TX_TOPO_FLAGS_SET_PSM		BIT(2)
> +#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW		BIT(4)
> +#define ICE_AQC_TX_TOPO_FLAGS_ISSUED		BIT(5)

Not all the defines are being used, please remove the unused ones.
Also, please add newline between the member & related defines from the 
next to make the associations a little more distinct.

> +	u8 get_flags;
> +#define ICE_AQC_TX_TOPO_GET_NO_UPDATE		0
> +#define ICE_AQC_TX_TOPO_GET_PSM			1
> +#define ICE_AQC_TX_TOPO_GET_RAM			2

Same comment here

> +	__le16 reserved1;
> +	__le32 reserved2;
> +	__le32 addr_high;
> +	__le32 addr_low;
> +};
> +
>   /* Update TSE (indirect 0x0403)
>    * Get TSE (indirect 0x0404)
>    * Add TSE (indirect 0x0401)
> @@ -2126,6 +2145,7 @@ struct ice_aq_desc {
>   		struct ice_aqc_get_link_topo get_link_topo;
>   		struct ice_aqc_i2c read_i2c;
>   		struct ice_aqc_read_i2c_resp read_i2c_resp;
> +		struct ice_aqc_get_set_tx_topo get_set_tx_topo;
>   	} params;
>   };
>   
> @@ -2231,6 +2251,9 @@ enum ice_adminq_opc {
>   	ice_aqc_opc_query_sched_res			= 0x0412,
>   	ice_aqc_opc_remove_rl_profiles			= 0x0415,
>   
> +	ice_aqc_opc_set_tx_topo				= 0x0417,
> +	ice_aqc_opc_get_tx_topo				= 0x0418,
> +
>   	/* PHY commands */
>   	ice_aqc_opc_get_phy_caps			= 0x0600,
>   	ice_aqc_opc_set_phy_cfg				= 0x0601,
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> index 9619bdb9e49a..8b65e2bfb160 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -2091,6 +2091,9 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps,
>   			  "%s: reset_restrict_support = %d\n", prefix,
>   			  caps->reset_restrict_support);
>   		break;
> +	case ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE:
> +		caps->tx_sched_topo_comp_mode_en = (number == 1);
> +		break;
>   	default:
>   		/* Not one of the recognized common capabilities */
>   		found = false;
> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> index c73cdab44f70..b3bd345ea0f3 100644
> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> @@ -4,6 +4,7 @@
>   #include "ice_common.h"
>   #include "ice_flex_pipe.h"
>   #include "ice_flow.h"
> +#include "ice_sched.h"
>   #include "ice.h"
>   
>   /* For supporting double VLAN mode, it is necessary to enable or disable certain
> @@ -1783,6 +1784,207 @@ bool ice_is_init_pkg_successful(enum ice_ddp_state state)
>   	}
>   }
>   
> +/**
> + * ice_get_set_tx_topo - get or set tx topology

nit: s/tx/Tx
a lot of other places within this patch as well

> + * @hw: pointer to the HW struct
> + * @buf: pointer to tx topology buffer
> + * @buf_size: buffer size
> + * @cd: pointer to command details structure or NULL
> + * @flags: pointer to descriptor flags
> + * @set: 0-get, 1-set topology
> + *
> + * The function will get or set tx topology
> + */
> +static int
> +ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size,
> +		    struct ice_sq_cd *cd, u8 *flags, bool set)
> +{
> +	struct ice_aqc_get_set_tx_topo *cmd;
> +	struct ice_aq_desc desc;
> +	int status;
> +
> +	cmd = &desc.params.get_set_tx_topo;
> +	if (set) {
> +		cmd->set_flags = ICE_AQC_TX_TOPO_FLAGS_ISSUED;
> +		/* requested to update a new topology, not a default topology */
> +		if (buf)
> +			cmd->set_flags |= ICE_AQC_TX_TOPO_FLAGS_SRC_RAM |
> +					  ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW;
> +	} else {
> +		cmd->get_flags = ICE_AQC_TX_TOPO_GET_RAM;
> +	}
> +	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_tx_topo);
> +	desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD);
> +	status = ice_aq_send_cmd(hw, &desc, buf, buf_size, cd);
> +	if (status)
> +		return status;
> +	/* read the return flag values (first byte) for get operation */
> +	if (!set && flags)
> +		*flags = desc.params.get_set_tx_topo.set_flags;
> +
> +	return 0;
> +}
> +
> +/**
> + * ice_cfg_tx_topo - Initialize new tx topology if available
> + * @hw: pointer to the HW struct
> + * @buf: pointer to Tx topology buffer
> + * @len: buffer size
> + *
> + * The function will apply the new Tx topology from the package buffer
> + * if available.
> + */
> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
> +{
> +	u8 *current_topo, *new_topo = NULL;
> +	struct ice_run_time_cfg_seg *seg;
> +	struct ice_buf_hdr *section;
> +	struct ice_pkg_hdr *pkg_hdr;
> +	enum ice_ddp_state state;
> +	u16 i, size = 0, offset;
> +	u32 reg = 0;
> +	int status;
> +	u8 flags;
> +
> +	if (!buf || !len)
> +		return -EINVAL;
> +
> +	/* Does FW support new Tx topology mode ? */
> +	if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) {
> +		ice_debug(hw, ICE_DBG_INIT, "FW doesn't support compatibility mode\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	current_topo = kzalloc(ICE_AQ_MAX_BUF_LEN, GFP_KERNEL);
> +

No newline between the error checks please.

> +	if (!current_topo)
> +		return -ENOMEM;
> +
> +	/* get the current Tx topology */
> +	status = ice_get_set_tx_topo(hw, current_topo, ICE_AQ_MAX_BUF_LEN, NULL,
> +				     &flags, false);
> +
> +	kfree(current_topo);
> +
> +	if (status) {
> +		ice_debug(hw, ICE_DBG_INIT, "Get current topology is failed\n");
> +		return status;
> +	}
> +
> +	/* Is default topology already applied ? */
> +	if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
> +	    hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) {
> +		ice_debug(hw, ICE_DBG_INIT, "Loaded default topology\n");
> +		/* Already default topology is loaded */
> +		return -EEXIST;
> +	}
> +
> +	/* Is new topology already applied ? */
> +	if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
> +	    hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
> +		ice_debug(hw, ICE_DBG_INIT, "Loaded new topology\n");
> +		/* Already new topology is loaded */
> +		return -EEXIST;
> +	}
> +
> +	/* Is set topology issued already ? */
> +	if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) {
> +		ice_debug(hw, ICE_DBG_INIT, "Update tx topology was done by another PF\n");
> +		/* add a small delay before exiting */
> +		for (i = 0; i < 20; i++)
> +			msleep(100);
> +		return -EEXIST;
> +	}
> +
> +	/* Change the topology from new to default (5 to 9) */
> +	if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
> +	    hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
> +		ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 layers\n");
> +		goto update_topo;
> +	}
> +
> +	pkg_hdr = (struct ice_pkg_hdr *)buf;
> +	state = ice_verify_pkg(pkg_hdr, len);
> +	if (state) {
> +		ice_debug(hw, ICE_DBG_INIT, "failed to verify pkg (err: %d)\n",
> +			  state);
> +		return -EIO;
> +	}
> +
> +	/* find run time configuration segment */
> +	seg = (struct ice_run_time_cfg_seg *)
> +		ice_find_seg_in_pkg(hw, SEGMENT_TYPE_ICE_RUN_TIME_CFG, pkg_hdr);
> +	if (!seg) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment is missing\n");
> +		return -EIO;
> +	}
> +
> +	if (le32_to_cpu(seg->buf_table.buf_count) < ICE_MIN_S_COUNT) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment count(%d) is wrong\n",
> +			  seg->buf_table.buf_count);
> +		return -EIO;
> +	}
> +
> +	section = ice_pkg_val_buf(seg->buf_table.buf_array);
> +
> +	if (!section || le32_to_cpu(section->section_entry[0].type) !=
> +		ICE_SID_TX_5_LAYER_TOPO) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology section type is wrong\n");
> +		return -EIO;
> +	}
> +
> +	size = le16_to_cpu(section->section_entry[0].size);
> +	offset = le16_to_cpu(section->section_entry[0].offset);
> +	if (size < ICE_MIN_S_SZ || size > ICE_MAX_S_SZ) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology section size is wrong\n");
> +		return -EIO;
> +	}
> +
> +	/* make sure the section fits in the buffer */
> +	if (offset + size > ICE_PKG_BUF_SIZE) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology buffer > 4K\n");
> +		return -EIO;
> +	}
> +
> +	/* Get the new topology buffer */
> +	new_topo = ((u8 *)section) + offset;
> +
> +update_topo:
> +	/* acquire global lock to make sure that set topology issued
> +	 * by one PF
> +	 */
> +	status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
> +	if (status) {
> +		ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
> +		return status;
> +	}
> +
> +	/* check reset was triggered already or not */
> +	reg = rd32(hw, GLGEN_RSTAT);
> +	if (reg & GLGEN_RSTAT_DEVSTATE_M) {
> +		/* Reset is in progress, re-init the hw again */
> +		ice_debug(hw, ICE_DBG_INIT, "Reset is in progress. layer topology might be applied already\n");
> +		ice_check_reset(hw);
> +		return 0;
> +	}
> +
> +	/* set new topology */
> +	status = ice_get_set_tx_topo(hw, new_topo, size, NULL, NULL, true);
> +	if (status) {
> +		ice_debug(hw, ICE_DBG_INIT, "Set tx topology is failed\n");
> +		return status;
> +	}
> +
> +	/* new topology is updated, delay 1 second before issuing the CORRER */
> +	for (i = 0; i < 10; i++)
> +		msleep(100);
> +	ice_reset(hw, ICE_RESET_CORER);
> +	/* CORER will clear the global lock, so no explicit call
> +	 * required for release
> +	 */
> +	return 0;
> +}
> +
>   /**
>    * ice_pkg_buf_alloc
>    * @hw: pointer to the HW structure
> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h b/drivers/net/ethernet/intel/ice/ice_flex_type.h
> index 974d14a83b2e..ebbb5a1db8c7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_flex_type.h
> +++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h
> @@ -29,8 +29,9 @@ struct ice_pkg_hdr {
>   
>   /* generic segment */
>   struct ice_generic_seg_hdr {
> -#define SEGMENT_TYPE_METADATA	0x00000001
> -#define SEGMENT_TYPE_ICE	0x00000010
> +#define SEGMENT_TYPE_METADATA	      0x00000001
> +#define SEGMENT_TYPE_ICE	      0x00000010
> +#define SEGMENT_TYPE_ICE_RUN_TIME_CFG 0x00000020
>   	__le32 seg_type;
>   	struct ice_pkg_ver seg_format_ver;
>   	__le32 seg_size;
> @@ -73,6 +74,12 @@ struct ice_buf_table {
>   	struct ice_buf buf_array[];
>   };
>   
> +struct ice_run_time_cfg_seg {
> +	struct ice_generic_seg_hdr hdr;
> +	u8 rsvd[8];
> +	struct ice_buf_table buf_table;
> +};
> +
>   /* global metadata specific segment */
>   struct ice_global_metadata_seg {
>   	struct ice_generic_seg_hdr hdr;
> @@ -181,6 +188,9 @@ struct ice_meta_sect {
>   /* The following define MUST be updated to reflect the last label section ID */
>   #define ICE_SID_LBL_LAST		0x80000038
>   
> +/* Label ICE runtime configuration section IDs */
> +#define ICE_SID_TX_5_LAYER_TOPO		0x10
> +
>   enum ice_block {
>   	ICE_BLK_SW = 0,
>   	ICE_BLK_ACL,
> @@ -706,4 +716,7 @@ struct ice_meta_init_section {
>   	__le16 offset;
>   	struct ice_meta_init_entry entry;
>   };
> +
> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len);

I don't see an external caller, should it be static?

> +
>   #endif /* _ICE_FLEX_TYPE_H_ */
> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h b/drivers/net/ethernet/intel/ice/ice_sched.h
> index 4f91577fed56..86dc0f1f4255 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sched.h
> +++ b/drivers/net/ethernet/intel/ice/ice_sched.h
> @@ -6,6 +6,9 @@
>   
>   #include "ice_common.h"
>   
> +#define ICE_SCHED_5_LAYERS	5
> +#define ICE_SCHED_9_LAYERS	9
> +
>   #define ICE_QGRP_LAYER_OFFSET	2
>   #define ICE_VSI_LAYER_OFFSET	4
>   #define ICE_AGG_LAYER_OFFSET	6
> diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
> index f2a518a1fd94..ff793fe2a2e7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_type.h
> +++ b/drivers/net/ethernet/intel/ice/ice_type.h
> @@ -290,6 +290,7 @@ struct ice_hw_common_caps {
>   	bool pcie_reset_avoidance;
>   	/* Post update reset restriction */
>   	bool reset_restrict_support;
> +	bool tx_sched_topo_comp_mode_en;
>   };
>   
>   /* IEEE 1588 TIME_SYNC specific info */
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology Michal Wilczynski
  2022-06-25  7:10   ` Paul Menzel
@ 2022-06-28 22:27   ` Tony Nguyen
  2022-07-01 15:19     ` Wilczynski, Michal
  1 sibling, 1 reply; 13+ messages in thread
From: Tony Nguyen @ 2022-06-28 22:27 UTC (permalink / raw)
  To: Michal Wilczynski, intel-wired-lan



On 6/24/2022 3:21 AM, Michal Wilczynski wrote:
> Introduce support for tx scheduler topology change, based on
> user selection, from default 9-layer to 5-layer.
> In order for switch to be successful there is a new NVM
> and DDP package required.
> This commit enables 5-layer topology in init path of
> the driver, so before ice driver load, the user selection
> should be changed in NVM using some external tools.
> 
> Title: Enable switching default tx scheduler topology
> Change-type: ImplementationChange
> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
> ---

...

> +/**
> + * ice_init_tx_topology - performs Tx topology initialization
> + * @hw: pointer to the hardware structure
> + * @firmware: pointer to firmware structure
> + */
> +static int ice_init_tx_topology(struct ice_hw *hw,
> +				const struct firmware *firmware)
> +{
> +	u8 num_tx_sched_layers = hw->num_tx_sched_layers;
> +	struct ice_pf *pf = hw->back;
> +	struct device *dev;
> +	u8 *buf_copy;
> +	int err = 0;

This initialization is unnecessary.

> +
> +	dev = ice_pf_to_dev(pf);
> +	/* ice_cfg_tx_topo buf argument is not a constant,
> +	 * so we have to make a copy
> +	 */
> +	buf_copy = devm_kmemdup(ice_hw_to_dev(hw), firmware->data,
> +				firmware->size, GFP_KERNEL);

It looks like the devm variant is not needed

> +
> +	err = ice_cfg_tx_topo(hw, buf_copy, firmware->size);
> +	if (!err) {
> +		if (hw->num_tx_sched_layers > num_tx_sched_layers)
> +			dev_info(dev, "Transmit balancing feature disabled\n");
> +		else
> +			dev_info(dev, "Transmit balancing feature enabled\n");
> +		/* if there was a change in topology ice_cfg_tx_topo triggered
> +		 * a CORER and we need to re-init hw
> +		 */
> +		ice_deinit_hw(hw);
> +		err = ice_init_hw(hw);
> +
> +		/* in this case we're not allowing safe mode */
> +		devm_kfree(ice_hw_to_dev(hw), buf_copy);
> +
> +		return err;
> +
> +	} else if (err == -EIO) {
> +		dev_info(dev, "DDP package does not support transmit balancing feature - please update to the latest DDP package and try again\n");
>   	}
>   
> -	/* request for firmware was successful. Download to device */
> +	devm_kfree(ice_hw_to_dev(hw), buf_copy);
> +
> +	return 0;
> +}
> +
> +/**
> + * ice_init_ddp_config - DDP related configuration
> + * @hw: pointer to the hardware structure
> + * @pf: pointer to pf structure
> + *
> + * This function loads DDP file from the disk, then initializes tx
> + * topology. At the end DDP package is loaded on the card.
> + */
> +static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf)
> +{
> +	struct device *dev = ice_pf_to_dev(pf);
> +	const struct firmware *firmware = NULL;
> +	int err = 0;

Initialization not needed.

> +
> +	err = ice_request_fw(pf, &firmware);
> +	if (err)
> +	/* we can still operate in safe mode if DDP package load fails */
> +		return 0;
> +
> +	err = ice_init_tx_topology(hw, firmware);
> +	if (err) {
> +		dev_err(dev, "ice_init_hw during change of tx topology failed: %d\n",
> +			err);
> +		release_firmware(firmware);
> +		return err;
> +	}
> +
> +	/* Download firmware to device */
>   	ice_load_pkg(firmware, pf);
>   	release_firmware(firmware);
> +
> +	return 0;
>   }
>   
>   /**
> @@ -4641,9 +4710,15 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
>   
>   	ice_init_feature_support(pf);
>   
> -	ice_request_fw(pf);
> +	err = ice_init_ddp_config(hw, pf);
> +
> +	/* during topology change ice_init_hw may fail */
> +	if (err) {
> +		err = -EIO;
> +		goto err_exit_unroll;
> +	}
>   
> -	/* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
> +	/* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit won't be
>   	 * set in pf->state, which will cause ice_is_safe_mode to return
>   	 * true
>   	 */
> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c
> index 7947223536e3..f18a7121ca55 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sched.c
> +++ b/drivers/net/ethernet/intel/ice/ice_sched.c
> @@ -1102,12 +1102,11 @@ static u8 ice_sched_get_vsi_layer(struct ice_hw *hw)
>   	 *     5 or less       sw_entry_point_layer
>   	 */
>   	/* calculate the VSI layer based on number of layers. */
> -	if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) {
> -		u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
> -
> -		if (layer > hw->sw_entry_point_layer)
> -			return layer;
> -	}
> +	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
> +		return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
> +	else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
> +		/* qgroup and VSI layers are same */
> +		return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
>   	return hw->sw_entry_point_layer;

These can all be if's since they all return:

	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
		return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
	if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
		/* qgroup and VSI layers are same */
		return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
  	return hw->sw_entry_point_layer;

>   }
>   
> @@ -1124,12 +1123,8 @@ static u8 ice_sched_get_agg_layer(struct ice_hw *hw)
>   	 *     7 or less       sw_entry_point_layer
>   	 */
>   	/* calculate the aggregator layer based on number of layers. */
> -	if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) {
> -		u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
> -
> -		if (layer > hw->sw_entry_point_layer)
> -			return layer;
> -	}
> +	if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
> +		return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>   	return hw->sw_entry_point_layer;
>   }
>   
> @@ -1485,10 +1480,11 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
>   {
>   	struct ice_sched_node *vsi_node, *qgrp_node;
>   	struct ice_vsi_ctx *vsi_ctx;
> +	u8 qgrp_layer, vsi_layer;
>   	u16 max_children;
> -	u8 qgrp_layer;
>   
>   	qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
> +	vsi_layer = ice_sched_get_vsi_layer(pi->hw);
>   	max_children = pi->hw->max_children[qgrp_layer];
>   
>   	vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
> @@ -1499,6 +1495,12 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
>   	if (!vsi_node)
>   		return NULL;
>   
> +	/* If the queue group and vsi layer are same then queues
> +	 * are all attached directly to VSI
> +	 */
> +	if (qgrp_layer == vsi_layer)
> +		return vsi_node;
> +
>   	/* get the first queue group node from VSI sub-tree */
>   	qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer);
>   	while (qgrp_node) {
> @@ -3178,8 +3180,9 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>   	u8 profile_type;
>   	int status;
>   
> -	if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
> +	if (!pi || layer_num >= pi->hw->num_tx_sched_layers)
>   		return NULL;
> +
>   	switch (rl_type) {
>   	case ICE_MIN_BW:
>   		profile_type = ICE_AQC_RL_PROFILE_TYPE_CIR;
> @@ -3194,8 +3197,6 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>   		return NULL;
>   	}
>   
> -	if (!pi)
> -		return NULL;
>   	hw = pi->hw;
>   	list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>   			    list_entry)
> @@ -3425,7 +3426,7 @@ ice_sched_rm_rl_profile(struct ice_port_info *pi, u8 layer_num, u8 profile_type,
>   	struct ice_aqc_rl_profile_info *rl_prof_elem;
>   	int status = 0;
>   
> -	if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
> +	if (layer_num >= pi->hw->num_tx_sched_layers)
>   		return -EINVAL;
>   	/* Check the existing list for RL profile */
>   	list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology
  2022-06-28 22:21   ` Tony Nguyen
@ 2022-06-29  8:54     ` Wilczynski, Michal
  2022-06-29 15:57       ` Tony Nguyen
  0 siblings, 1 reply; 13+ messages in thread
From: Wilczynski, Michal @ 2022-06-29  8:54 UTC (permalink / raw)
  To: Tony Nguyen, intel-wired-lan; +Cc: Raj Victor

Hi, thanks for your review

On 6/29/2022 12:21 AM, Tony Nguyen wrote:
>
>
> On 6/24/2022 3:21 AM, Michal Wilczynski wrote:
>> From: Raj Victor <victor.raj@intel.com>
>>
>> There was a performance issue reported when the number of VSIs were
>> not multiple of 8. This was caused due to the max children limitation
>> per node(8) in 9 layer topology. The BW credits are shared evenly
>> among the children by default. Assume one node has 8 children and
>> the other has 1. The parent of these nodes share the BW credit equally
>> among them. Apparently this casued a problem for the first node which
>> has 8 children. The 9th VM got more BW credits than the first 8 VMs.
>>
>> Example:
>> 1) With 8 VM's:
>> tx_queue_0_packets: 23283027
>> tx_queue_1_packets: 23292289
>> tx_queue_2_packets: 23276136
>> tx_queue_3_packets: 23279828
>> tx_queue_4_packets: 23279828
>> tx_queue_5_packets: 23279333
>> tx_queue_6_packets: 23277745
>> tx_queue_7_packets: 23279950
>> tx_queue_8_packets: 0
>>
>> 2) With 9 VM's:
>> tx_queue_0_packets: 24163396
>> tx_queue_1_packets: 24164623
>> tx_queue_2_packets: 24163188
>> tx_queue_3_packets: 24163701
>> tx_queue_4_packets: 24163683
>> tx_queue_5_packets: 24164668
>> tx_queue_6_packets: 23327200
>> tx_queue_7_packets: 24163853
>> tx_queue_8_packets: 91101417
>>
>> So on average queue 8 statistics show that 3.7 times
>> more packets were send there than from the other queues.
>>
>> The FW has increased the max number of children per node by reducing
>> the number of layers from 9 to 5. Reflect this on driver side.
>>
>> Signed-off-by: Raj Victor <victor.raj@intel.com>
>> Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
>> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
>> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>> ---
>>   .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  23 ++
>>   drivers/net/ethernet/intel/ice/ice_common.c   |   3 +
>>   .../net/ethernet/intel/ice/ice_flex_pipe.c    | 202 ++++++++++++++++++
>>   .../net/ethernet/intel/ice/ice_flex_type.h    |  17 +-
>>   drivers/net/ethernet/intel/ice/ice_sched.h    |   3 +
>>   drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
>>   6 files changed, 247 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h 
>> b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
>> index 05cb9dd7035a..2d1da00b4926 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
>> @@ -120,6 +120,7 @@ struct ice_aqc_list_caps_elem {
>>   #define ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE        0x0076
>>   #define ICE_AQC_CAPS_POST_UPDATE_RESET_RESTRICT        0x0077
>>   #define ICE_AQC_CAPS_NVM_MGMT                0x0080
>> +#define ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE        0x0085
>>         u8 major_ver;
>>       u8 minor_ver;
>> @@ -798,6 +799,24 @@ struct ice_aqc_get_topo {
>>       __le32 addr_low;
>>   };
>>   +/* Get/Set Tx Topology (indirect 0x0418/0x0417) */
>> +struct ice_aqc_get_set_tx_topo {
>> +    u8 set_flags;
>> +#define ICE_AQC_TX_TOPO_FLAGS_CORRER        BIT(0)
>> +#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM        BIT(1)
>> +#define ICE_AQC_TX_TOPO_FLAGS_SET_PSM        BIT(2)
>> +#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW        BIT(4)
>> +#define ICE_AQC_TX_TOPO_FLAGS_ISSUED        BIT(5)
>
> Not all the defines are being used, please remove the unused ones.
> Also, please add newline between the member & related defines from the 
> next to make the associations a little more distinct.


Hi, good point with the unused ones, but the newline idea doesn't seem 
to follow the convention used in the file.


>
>> +    u8 get_flags;
>> +#define ICE_AQC_TX_TOPO_GET_NO_UPDATE        0
>> +#define ICE_AQC_TX_TOPO_GET_PSM            1
>> +#define ICE_AQC_TX_TOPO_GET_RAM            2
>
> Same comment here
>
>> +    __le16 reserved1;
>> +    __le32 reserved2;
>> +    __le32 addr_high;
>> +    __le32 addr_low;
>> +};
>> +
>>   /* Update TSE (indirect 0x0403)
>>    * Get TSE (indirect 0x0404)
>>    * Add TSE (indirect 0x0401)
>> @@ -2126,6 +2145,7 @@ struct ice_aq_desc {
>>           struct ice_aqc_get_link_topo get_link_topo;
>>           struct ice_aqc_i2c read_i2c;
>>           struct ice_aqc_read_i2c_resp read_i2c_resp;
>> +        struct ice_aqc_get_set_tx_topo get_set_tx_topo;
>>       } params;
>>   };
>>   @@ -2231,6 +2251,9 @@ enum ice_adminq_opc {
>>       ice_aqc_opc_query_sched_res            = 0x0412,
>>       ice_aqc_opc_remove_rl_profiles            = 0x0415,
>>   +    ice_aqc_opc_set_tx_topo                = 0x0417,
>> +    ice_aqc_opc_get_tx_topo                = 0x0418,
>> +
>>       /* PHY commands */
>>       ice_aqc_opc_get_phy_caps            = 0x0600,
>>       ice_aqc_opc_set_phy_cfg                = 0x0601,
>> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c 
>> b/drivers/net/ethernet/intel/ice/ice_common.c
>> index 9619bdb9e49a..8b65e2bfb160 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_common.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
>> @@ -2091,6 +2091,9 @@ ice_parse_common_caps(struct ice_hw *hw, struct 
>> ice_hw_common_caps *caps,
>>                 "%s: reset_restrict_support = %d\n", prefix,
>>                 caps->reset_restrict_support);
>>           break;
>> +    case ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE:
>> +        caps->tx_sched_topo_comp_mode_en = (number == 1);
>> +        break;
>>       default:
>>           /* Not one of the recognized common capabilities */
>>           found = false;
>> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c 
>> b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> index c73cdab44f70..b3bd345ea0f3 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> @@ -4,6 +4,7 @@
>>   #include "ice_common.h"
>>   #include "ice_flex_pipe.h"
>>   #include "ice_flow.h"
>> +#include "ice_sched.h"
>>   #include "ice.h"
>>     /* For supporting double VLAN mode, it is necessary to enable or 
>> disable certain
>> @@ -1783,6 +1784,207 @@ bool ice_is_init_pkg_successful(enum 
>> ice_ddp_state state)
>>       }
>>   }
>>   +/**
>> + * ice_get_set_tx_topo - get or set tx topology
>
> nit: s/tx/Tx
> a lot of other places within this patch as well


OK I'll try to find all that and fix


>
>> + * @hw: pointer to the HW struct
>> + * @buf: pointer to tx topology buffer
>> + * @buf_size: buffer size
>> + * @cd: pointer to command details structure or NULL
>> + * @flags: pointer to descriptor flags
>> + * @set: 0-get, 1-set topology
>> + *
>> + * The function will get or set tx topology
>> + */
>> +static int
>> +ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size,
>> +            struct ice_sq_cd *cd, u8 *flags, bool set)
>> +{
>> +    struct ice_aqc_get_set_tx_topo *cmd;
>> +    struct ice_aq_desc desc;
>> +    int status;
>> +
>> +    cmd = &desc.params.get_set_tx_topo;
>> +    if (set) {
>> +        cmd->set_flags = ICE_AQC_TX_TOPO_FLAGS_ISSUED;
>> +        /* requested to update a new topology, not a default 
>> topology */
>> +        if (buf)
>> +            cmd->set_flags |= ICE_AQC_TX_TOPO_FLAGS_SRC_RAM |
>> +                      ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW;
>> +    } else {
>> +        cmd->get_flags = ICE_AQC_TX_TOPO_GET_RAM;
>> +    }
>> +    ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_tx_topo);
>> +    desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD);
>> +    status = ice_aq_send_cmd(hw, &desc, buf, buf_size, cd);
>> +    if (status)
>> +        return status;
>> +    /* read the return flag values (first byte) for get operation */
>> +    if (!set && flags)
>> +        *flags = desc.params.get_set_tx_topo.set_flags;
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * ice_cfg_tx_topo - Initialize new tx topology if available
>> + * @hw: pointer to the HW struct
>> + * @buf: pointer to Tx topology buffer
>> + * @len: buffer size
>> + *
>> + * The function will apply the new Tx topology from the package buffer
>> + * if available.
>> + */
>> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
>> +{
>> +    u8 *current_topo, *new_topo = NULL;
>> +    struct ice_run_time_cfg_seg *seg;
>> +    struct ice_buf_hdr *section;
>> +    struct ice_pkg_hdr *pkg_hdr;
>> +    enum ice_ddp_state state;
>> +    u16 i, size = 0, offset;
>> +    u32 reg = 0;
>> +    int status;
>> +    u8 flags;
>> +
>> +    if (!buf || !len)
>> +        return -EINVAL;
>> +
>> +    /* Does FW support new Tx topology mode ? */
>> +    if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) {
>> +        ice_debug(hw, ICE_DBG_INIT, "FW doesn't support 
>> compatibility mode\n");
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    current_topo = kzalloc(ICE_AQ_MAX_BUF_LEN, GFP_KERNEL);
>> +
>
> No newline between the error checks please.


OK


>
>> +    if (!current_topo)
>> +        return -ENOMEM;
>> +
>> +    /* get the current Tx topology */
>> +    status = ice_get_set_tx_topo(hw, current_topo, 
>> ICE_AQ_MAX_BUF_LEN, NULL,
>> +                     &flags, false);
>> +
>> +    kfree(current_topo);
>> +
>> +    if (status) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Get current topology is 
>> failed\n");
>> +        return status;
>> +    }
>> +
>> +    /* Is default topology already applied ? */
>> +    if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
>> +        hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Loaded default topology\n");
>> +        /* Already default topology is loaded */
>> +        return -EEXIST;
>> +    }
>> +
>> +    /* Is new topology already applied ? */
>> +    if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
>> +        hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Loaded new topology\n");
>> +        /* Already new topology is loaded */
>> +        return -EEXIST;
>> +    }
>> +
>> +    /* Is set topology issued already ? */
>> +    if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Update tx topology was done by 
>> another PF\n");
>> +        /* add a small delay before exiting */
>> +        for (i = 0; i < 20; i++)
>> +            msleep(100);
>> +        return -EEXIST;
>> +    }
>> +
>> +    /* Change the topology from new to default (5 to 9) */
>> +    if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
>> +        hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 
>> layers\n");
>> +        goto update_topo;
>> +    }
>> +
>> +    pkg_hdr = (struct ice_pkg_hdr *)buf;
>> +    state = ice_verify_pkg(pkg_hdr, len);
>> +    if (state) {
>> +        ice_debug(hw, ICE_DBG_INIT, "failed to verify pkg (err: %d)\n",
>> +              state);
>> +        return -EIO;
>> +    }
>> +
>> +    /* find run time configuration segment */
>> +    seg = (struct ice_run_time_cfg_seg *)
>> +        ice_find_seg_in_pkg(hw, SEGMENT_TYPE_ICE_RUN_TIME_CFG, 
>> pkg_hdr);
>> +    if (!seg) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment is 
>> missing\n");
>> +        return -EIO;
>> +    }
>> +
>> +    if (le32_to_cpu(seg->buf_table.buf_count) < ICE_MIN_S_COUNT) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment 
>> count(%d) is wrong\n",
>> +              seg->buf_table.buf_count);
>> +        return -EIO;
>> +    }
>> +
>> +    section = ice_pkg_val_buf(seg->buf_table.buf_array);
>> +
>> +    if (!section || le32_to_cpu(section->section_entry[0].type) !=
>> +        ICE_SID_TX_5_LAYER_TOPO) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology section type 
>> is wrong\n");
>> +        return -EIO;
>> +    }
>> +
>> +    size = le16_to_cpu(section->section_entry[0].size);
>> +    offset = le16_to_cpu(section->section_entry[0].offset);
>> +    if (size < ICE_MIN_S_SZ || size > ICE_MAX_S_SZ) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology section size 
>> is wrong\n");
>> +        return -EIO;
>> +    }
>> +
>> +    /* make sure the section fits in the buffer */
>> +    if (offset + size > ICE_PKG_BUF_SIZE) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology buffer > 4K\n");
>> +        return -EIO;
>> +    }
>> +
>> +    /* Get the new topology buffer */
>> +    new_topo = ((u8 *)section) + offset;
>> +
>> +update_topo:
>> +    /* acquire global lock to make sure that set topology issued
>> +     * by one PF
>> +     */
>> +    status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
>> +    if (status) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
>> +        return status;
>> +    }
>> +
>> +    /* check reset was triggered already or not */
>> +    reg = rd32(hw, GLGEN_RSTAT);
>> +    if (reg & GLGEN_RSTAT_DEVSTATE_M) {
>> +        /* Reset is in progress, re-init the hw again */
>> +        ice_debug(hw, ICE_DBG_INIT, "Reset is in progress. layer 
>> topology might be applied already\n");
>> +        ice_check_reset(hw);
>> +        return 0;
>> +    }
>> +
>> +    /* set new topology */
>> +    status = ice_get_set_tx_topo(hw, new_topo, size, NULL, NULL, true);
>> +    if (status) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Set tx topology is failed\n");
>> +        return status;
>> +    }
>> +
>> +    /* new topology is updated, delay 1 second before issuing the 
>> CORRER */
>> +    for (i = 0; i < 10; i++)
>> +        msleep(100);
>> +    ice_reset(hw, ICE_RESET_CORER);
>> +    /* CORER will clear the global lock, so no explicit call
>> +     * required for release
>> +     */
>> +    return 0;
>> +}
>> +
>>   /**
>>    * ice_pkg_buf_alloc
>>    * @hw: pointer to the HW structure
>> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h 
>> b/drivers/net/ethernet/intel/ice/ice_flex_type.h
>> index 974d14a83b2e..ebbb5a1db8c7 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_flex_type.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h
>> @@ -29,8 +29,9 @@ struct ice_pkg_hdr {
>>     /* generic segment */
>>   struct ice_generic_seg_hdr {
>> -#define SEGMENT_TYPE_METADATA    0x00000001
>> -#define SEGMENT_TYPE_ICE    0x00000010
>> +#define SEGMENT_TYPE_METADATA          0x00000001
>> +#define SEGMENT_TYPE_ICE          0x00000010
>> +#define SEGMENT_TYPE_ICE_RUN_TIME_CFG 0x00000020
>>       __le32 seg_type;
>>       struct ice_pkg_ver seg_format_ver;
>>       __le32 seg_size;
>> @@ -73,6 +74,12 @@ struct ice_buf_table {
>>       struct ice_buf buf_array[];
>>   };
>>   +struct ice_run_time_cfg_seg {
>> +    struct ice_generic_seg_hdr hdr;
>> +    u8 rsvd[8];
>> +    struct ice_buf_table buf_table;
>> +};
>> +
>>   /* global metadata specific segment */
>>   struct ice_global_metadata_seg {
>>       struct ice_generic_seg_hdr hdr;
>> @@ -181,6 +188,9 @@ struct ice_meta_sect {
>>   /* The following define MUST be updated to reflect the last label 
>> section ID */
>>   #define ICE_SID_LBL_LAST        0x80000038
>>   +/* Label ICE runtime configuration section IDs */
>> +#define ICE_SID_TX_5_LAYER_TOPO        0x10
>> +
>>   enum ice_block {
>>       ICE_BLK_SW = 0,
>>       ICE_BLK_ACL,
>> @@ -706,4 +716,7 @@ struct ice_meta_init_section {
>>       __le16 offset;
>>       struct ice_meta_init_entry entry;
>>   };
>> +
>> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len);
>
> I don't see an external caller, should it be static?


It's called in the next commit in this patch series


>
>> +
>>   #endif /* _ICE_FLEX_TYPE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h 
>> b/drivers/net/ethernet/intel/ice/ice_sched.h
>> index 4f91577fed56..86dc0f1f4255 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_sched.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_sched.h
>> @@ -6,6 +6,9 @@
>>     #include "ice_common.h"
>>   +#define ICE_SCHED_5_LAYERS    5
>> +#define ICE_SCHED_9_LAYERS    9
>> +
>>   #define ICE_QGRP_LAYER_OFFSET    2
>>   #define ICE_VSI_LAYER_OFFSET    4
>>   #define ICE_AGG_LAYER_OFFSET    6
>> diff --git a/drivers/net/ethernet/intel/ice/ice_type.h 
>> b/drivers/net/ethernet/intel/ice/ice_type.h
>> index f2a518a1fd94..ff793fe2a2e7 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_type.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_type.h
>> @@ -290,6 +290,7 @@ struct ice_hw_common_caps {
>>       bool pcie_reset_avoidance;
>>       /* Post update reset restriction */
>>       bool reset_restrict_support;
>> +    bool tx_sched_topo_comp_mode_en;
>>   };
>>     /* IEEE 1588 TIME_SYNC specific info */
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology
  2022-06-25  7:10   ` Paul Menzel
@ 2022-06-29  9:54     ` Wilczynski, Michal
  2022-07-01 13:12       ` Wilczynski, Michal
  0 siblings, 1 reply; 13+ messages in thread
From: Wilczynski, Michal @ 2022-06-29  9:54 UTC (permalink / raw)
  To: Paul Menzel; +Cc: intel-wired-lan

Hi, Thanks for your review

On 6/25/2022 9:10 AM, Paul Menzel wrote:
> Dear Michal,
>
>
> Thank you for your patch.
>
> Am 24.06.22 um 12:21 schrieb Michal Wilczynski:
>> Introduce support for tx scheduler topology change, based on
>> user selection, from default 9-layer to 5-layer.
>> In order for switch to be successful there is a new NVM
>> and DDP package required.
>
> What versions are needed exactly?


Will try to find exact numbers.


>
>> This commit enables 5-layer topology in init path of
>> the driver, so before ice driver load, the user selection
>> should be changed in NVM using some external tools.
>
> Please add one example, how to change it.


The reason why it's a bit cryptic is that the devlink mechanism is not 
upstreamed yet, and epct way of doing

that doesn't seem to work with upstream driver.


Will post an example of devlink way of doing that, it will be upstreamed 
very soon.



>
> Could you please reflow for 75 characters per line, and do not break 
> lines just because a sentence ends, or add a blank line between 
> paragraphs?
>
>> Title: Enable switching default tx scheduler topology
>> Change-type: ImplementationChange
>
> These two tags are not used upstream I thinks.


Yep, will fix that


>
>> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
>> ---
>>   drivers/net/ethernet/intel/ice/ice_common.c   |   2 +
>>   .../net/ethernet/intel/ice/ice_flex_pipe.c    |   3 +-
>>   drivers/net/ethernet/intel/ice/ice_main.c     | 113 +++++++++++++++---
>>   drivers/net/ethernet/intel/ice/ice_sched.c    |  35 +++---
>>   4 files changed, 116 insertions(+), 37 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c 
>> b/drivers/net/ethernet/intel/ice/ice_common.c
>> index 8b65e2bfb160..167f9d5c345a 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_common.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
>> @@ -1535,6 +1535,8 @@ ice_aq_send_cmd(struct ice_hw *hw, struct 
>> ice_aq_desc *desc, void *buf,
>>       case ice_aqc_opc_set_port_params:
>>       case ice_aqc_opc_get_vlan_mode_parameters:
>>       case ice_aqc_opc_set_vlan_mode_parameters:
>> +    case ice_aqc_opc_set_tx_topo:
>> +    case ice_aqc_opc_get_tx_topo:
>>       case ice_aqc_opc_add_recipe:
>>       case ice_aqc_opc_recipe_to_profile:
>>       case ice_aqc_opc_get_recipe:
>> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c 
>> b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> index b3bd345ea0f3..c2359366b48e 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> @@ -1953,7 +1953,8 @@ int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, 
>> u32 len)
>>       /* acquire global lock to make sure that set topology issued
>>        * by one PF
>>        */
>> -    status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
>> +    status = ice_acquire_res(hw, ICE_GLOBAL_CFG_LOCK_RES_ID, 
>> ICE_RES_WRITE,
>> +                 ICE_GLOBAL_CFG_LOCK_TIMEOUT);
>>       if (status) {
>>           ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global 
>> lock\n");
>>           return status;
>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c 
>> b/drivers/net/ethernet/intel/ice/ice_main.c
>> index c1ac2f746714..5f827bea05d9 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>> @@ -4453,11 +4453,11 @@ static char *ice_get_opt_fw_name(struct 
>> ice_pf *pf)
>>   /**
>>    * ice_request_fw - Device initialization routine
>>    * @pf: pointer to the PF instance
>> + * @firmware: double pointer to firmware struct
>>    */
>> -static void ice_request_fw(struct ice_pf *pf)
>> +static int ice_request_fw(struct ice_pf *pf, const struct firmware 
>> **firmware)
>>   {
>>       char *opt_fw_filename = ice_get_opt_fw_name(pf);
>> -    const struct firmware *firmware = NULL;
>>       struct device *dev = ice_pf_to_dev(pf);
>>       int err = 0;
>>   @@ -4466,29 +4466,98 @@ static void ice_request_fw(struct ice_pf *pf)
>>        * and warning messages for other errors.
>>        */
>>       if (opt_fw_filename) {
>> -        err = firmware_request_nowarn(&firmware, opt_fw_filename, dev);
>> -        if (err) {
>> -            kfree(opt_fw_filename);
>> -            goto dflt_pkg_load;
>> -        }
>> -
>> -        /* request for firmware was successful. Download to device */
>> -        ice_load_pkg(firmware, pf);
>> +        err = firmware_request_nowarn(firmware, opt_fw_filename, dev);
>>           kfree(opt_fw_filename);
>> -        release_firmware(firmware);
>> -        return;
>> +        if (!err)
>> +            return err;
>
> Why is the code above removed?


The code received a bit of refactor, logical flow is a bit different, 
now we don't upload firmware to the device

immediately after loading it from the disk. We change the topology using 
AQ comand, then trigger CORER.

And only after that we upload firmware to the device.


>
>>       }
>>   -dflt_pkg_load:
>> -    err = request_firmware(&firmware, ICE_DDP_PKG_FILE, dev);
>> -    if (err) {
>> +    err = request_firmware(firmware, ICE_DDP_PKG_FILE, dev);
>> +    if (err)
>>           dev_err(dev, "The DDP package file was not found or could 
>> not be read. Entering Safe Mode\n");
>> -        return;
>> +
>> +    return err;
>> +}
>> +
>> +/**
>> + * ice_init_tx_topology - performs Tx topology initialization
>> + * @hw: pointer to the hardware structure
>> + * @firmware: pointer to firmware structure
>> + */
>> +static int ice_init_tx_topology(struct ice_hw *hw,
>> +                const struct firmware *firmware)
>> +{
>> +    u8 num_tx_sched_layers = hw->num_tx_sched_layers;
>> +    struct ice_pf *pf = hw->back;
>> +    struct device *dev;
>> +    u8 *buf_copy;
>> +    int err = 0;
>> +
>> +    dev = ice_pf_to_dev(pf);
>> +    /* ice_cfg_tx_topo buf argument is not a constant,
>> +     * so we have to make a copy
>> +     */
>> +    buf_copy = devm_kmemdup(ice_hw_to_dev(hw), firmware->data,
>> +                firmware->size, GFP_KERNEL);
>> +
>> +    err = ice_cfg_tx_topo(hw, buf_copy, firmware->size);
>> +    if (!err) {
>> +        if (hw->num_tx_sched_layers > num_tx_sched_layers)
>> +            dev_info(dev, "Transmit balancing feature disabled\n");
>
> Should this be an error?


This is an opt-in feature, so we don't treat failure as an error.


>
>> +        else
>> +            dev_info(dev, "Transmit balancing feature enabled\n");
>
> Add a blank line below?

Sure


>
>> +        /* if there was a change in topology ice_cfg_tx_topo triggered
>> +         * a CORER and we need to re-init hw
>
> What is CORER? Add a dot/period at the end?


CORER is a type of reset of our hardware. This concept seems to be used 
extensively in our driver, not sure

if this requires an explanation in the comment.


>
>> +         */
>> +        ice_deinit_hw(hw);
>> +        err = ice_init_hw(hw);
>> +
>> +        /* in this case we're not allowing safe mode */
>> +        devm_kfree(ice_hw_to_dev(hw), buf_copy);
>> +
>> +        return err;
>> +
>> +    } else if (err == -EIO) {
>> +        dev_info(dev, "DDP package does not support transmit 
>> balancing feature - please update to the latest DDP package and try 
>> again\n");
>
> Please mention a version, where it’s supposed to work? Can the DDP 
> version be printed out too?
>
> Should this be a warning?


I will try to find out exact versions required.

This could maybe be a warning, it wasn't cause this is an opt-in feature.


>
>>       }
>>   -    /* request for firmware was successful. Download to device */
>> +    devm_kfree(ice_hw_to_dev(hw), buf_copy);
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * ice_init_ddp_config - DDP related configuration
>> + * @hw: pointer to the hardware structure
>> + * @pf: pointer to pf structure
>> + *
>> + * This function loads DDP file from the disk, then initializes tx
>> + * topology. At the end DDP package is loaded on the card.
>> + */
>> +static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf)
>> +{
>> +    struct device *dev = ice_pf_to_dev(pf);
>> +    const struct firmware *firmware = NULL;
>> +    int err = 0;
>> +
>> +    err = ice_request_fw(pf, &firmware);
>> +    if (err)
>> +    /* we can still operate in safe mode if DDP package load fails */
>
> Indent the comment?


Sure


>
>> +        return 0;
>> +
>> +    err = ice_init_tx_topology(hw, firmware);
>> +    if (err) {
>> +        dev_err(dev, "ice_init_hw during change of tx topology 
>> failed: %d\n",
>> +            err);
>> +        release_firmware(firmware);
>> +        return err;
>> +    }
>> +
>> +    /* Download firmware to device */
>>       ice_load_pkg(firmware, pf);
>>       release_firmware(firmware);
>> +
>> +    return 0;
>>   }
>>     /**
>> @@ -4641,9 +4710,15 @@ ice_probe(struct pci_dev *pdev, const struct 
>> pci_device_id __always_unused *ent)
>>         ice_init_feature_support(pf);
>>   -    ice_request_fw(pf);
>> +    err = ice_init_ddp_config(hw, pf);
>> +
>> +    /* during topology change ice_init_hw may fail */
>> +    if (err) {
>> +        err = -EIO;
>> +        goto err_exit_unroll;
>> +    }
>>   -    /* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
>> +    /* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit won't be
>>        * set in pf->state, which will cause ice_is_safe_mode to return
>>        * true
>>        */
>> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c 
>> b/drivers/net/ethernet/intel/ice/ice_sched.c
>> index 7947223536e3..f18a7121ca55 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_sched.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_sched.c
>> @@ -1102,12 +1102,11 @@ static u8 ice_sched_get_vsi_layer(struct 
>> ice_hw *hw)
>>        *     5 or less       sw_entry_point_layer
>>        */
>>       /* calculate the VSI layer based on number of layers. */
>> -    if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) {
>> -        u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>> -
>> -        if (layer > hw->sw_entry_point_layer)
>> -            return layer;
>> -    }
>> +    if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>> +        return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>> +    else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
>> +        /* qgroup and VSI layers are same */
>> +        return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
>>       return hw->sw_entry_point_layer;
>>   }
>>   @@ -1124,12 +1123,8 @@ static u8 ice_sched_get_agg_layer(struct 
>> ice_hw *hw)
>>        *     7 or less       sw_entry_point_layer
>>        */
>>       /* calculate the aggregator layer based on number of layers. */
>> -    if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) {
>> -        u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>> -
>> -        if (layer > hw->sw_entry_point_layer)
>> -            return layer;
>> -    }
>> +    if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>> +        return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>>       return hw->sw_entry_point_layer;
>>   }
>>   @@ -1485,10 +1480,11 @@ ice_sched_get_free_qparent(struct 
>> ice_port_info *pi, u16 vsi_handle, u8 tc,
>>   {
>>       struct ice_sched_node *vsi_node, *qgrp_node;
>>       struct ice_vsi_ctx *vsi_ctx;
>> +    u8 qgrp_layer, vsi_layer;
>>       u16 max_children;
>> -    u8 qgrp_layer;
>>         qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
>> +    vsi_layer = ice_sched_get_vsi_layer(pi->hw);
>>       max_children = pi->hw->max_children[qgrp_layer];
>>         vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
>> @@ -1499,6 +1495,12 @@ ice_sched_get_free_qparent(struct 
>> ice_port_info *pi, u16 vsi_handle, u8 tc,
>>       if (!vsi_node)
>>           return NULL;
>>   +    /* If the queue group and vsi layer are same then queues
>> +     * are all attached directly to VSI
>> +     */
>> +    if (qgrp_layer == vsi_layer)
>> +        return vsi_node;
>> +
>>       /* get the first queue group node from VSI sub-tree */
>>       qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer);
>>       while (qgrp_node) {
>> @@ -3178,8 +3180,9 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>>       u8 profile_type;
>>       int status;
>>   -    if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
>> +    if (!pi || layer_num >= pi->hw->num_tx_sched_layers)
>>           return NULL;
>> +
>>       switch (rl_type) {
>>       case ICE_MIN_BW:
>>           profile_type = ICE_AQC_RL_PROFILE_TYPE_CIR;
>> @@ -3194,8 +3197,6 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>>           return NULL;
>>       }
>>   -    if (!pi)
>> -        return NULL;
>
> This hunk (and above) could be a separate commit?


Seems like a good idea


>
>>       hw = pi->hw;
>>       list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>>                   list_entry)
>> @@ -3425,7 +3426,7 @@ ice_sched_rm_rl_profile(struct ice_port_info 
>> *pi, u8 layer_num, u8 profile_type,
>>       struct ice_aqc_rl_profile_info *rl_prof_elem;
>>       int status = 0;
>>   -    if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
>> +    if (layer_num >= pi->hw->num_tx_sched_layers)
>>           return -EINVAL;
>>       /* Check the existing list for RL profile */
>>       list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>
>
> Kind regards,
>
> Paul
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology
  2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology Michal Wilczynski
  2022-06-28 22:21   ` Tony Nguyen
@ 2022-06-29 10:49   ` Paul Menzel
  2022-07-01 13:22     ` Wilczynski, Michal
  1 sibling, 1 reply; 13+ messages in thread
From: Paul Menzel @ 2022-06-29 10:49 UTC (permalink / raw)
  To: Michal Wilczynski, Raj Victor; +Cc: intel-wired-lan

Dear Raj, dear Michal,


Thank you for the patch.

Am 24.06.22 um 12:21 schrieb Michal Wilczynski:
> From: Raj Victor <victor.raj@intel.com>

I recommend to make the git commit message summary a statement by adding 
a verb in imperative mood.

Add code to support 5 layer topoloy

Maybe even short it to:

Support 5 layer topology

> There was a performance issue reported when the number of VSIs were
> not multiple of 8. This was caused due to the max children limitation

Maybe even describe the test setup.

> per node(8) in 9 layer topology. The BW credits are shared evenly
> among the children by default. Assume one node has 8 children and
> the other has 1. The parent of these nodes share the BW credit equally
> among them. Apparently this casued a problem for the first node which

caused

> has 8 children. The 9th VM got more BW credits than the first 8 VMs.

I’d use present tense to describe the problem in the commit message.

> Example:
> 1) With 8 VM's:
> tx_queue_0_packets: 23283027
> tx_queue_1_packets: 23292289
> tx_queue_2_packets: 23276136
> tx_queue_3_packets: 23279828
> tx_queue_4_packets: 23279828
> tx_queue_5_packets: 23279333
> tx_queue_6_packets: 23277745
> tx_queue_7_packets: 23279950
> tx_queue_8_packets: 0
> 
> 2) With 9 VM's:
> tx_queue_0_packets: 24163396
> tx_queue_1_packets: 24164623
> tx_queue_2_packets: 24163188
> tx_queue_3_packets: 24163701
> tx_queue_4_packets: 24163683
> tx_queue_5_packets: 24164668
> tx_queue_6_packets: 23327200
> tx_queue_7_packets: 24163853
> tx_queue_8_packets: 91101417
> 
> So on average queue 8 statistics show that 3.7 times
> more packets were send there than from the other queues.

Maybe reflow the paragraph to use the full 75 characters per line length.

> The FW has increased the max number of children per node by reducing
> the number of layers from 9 to 5. Reflect this on driver side.

In what firmware version did that happen?

In the code you check for some compatibility mode. In what datasheet is 
that documented?

> Signed-off-by: Raj Victor <victor.raj@intel.com>
> Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>   .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  23 ++
>   drivers/net/ethernet/intel/ice/ice_common.c   |   3 +
>   .../net/ethernet/intel/ice/ice_flex_pipe.c    | 202 ++++++++++++++++++
>   .../net/ethernet/intel/ice/ice_flex_type.h    |  17 +-
>   drivers/net/ethernet/intel/ice/ice_sched.h    |   3 +
>   drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
>   6 files changed, 247 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> index 05cb9dd7035a..2d1da00b4926 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> @@ -120,6 +120,7 @@ struct ice_aqc_list_caps_elem {
>   #define ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE		0x0076
>   #define ICE_AQC_CAPS_POST_UPDATE_RESET_RESTRICT		0x0077
>   #define ICE_AQC_CAPS_NVM_MGMT				0x0080
> +#define ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE		0x0085
>   
>   	u8 major_ver;
>   	u8 minor_ver;
> @@ -798,6 +799,24 @@ struct ice_aqc_get_topo {
>   	__le32 addr_low;
>   };
>   
> +/* Get/Set Tx Topology (indirect 0x0418/0x0417) */
> +struct ice_aqc_get_set_tx_topo {
> +	u8 set_flags;
> +#define ICE_AQC_TX_TOPO_FLAGS_CORRER		BIT(0)
> +#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM		BIT(1)
> +#define ICE_AQC_TX_TOPO_FLAGS_SET_PSM		BIT(2)
> +#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW		BIT(4)
> +#define ICE_AQC_TX_TOPO_FLAGS_ISSUED		BIT(5)
> +	u8 get_flags;
> +#define ICE_AQC_TX_TOPO_GET_NO_UPDATE		0
> +#define ICE_AQC_TX_TOPO_GET_PSM			1
> +#define ICE_AQC_TX_TOPO_GET_RAM			2
> +	__le16 reserved1;
> +	__le32 reserved2;
> +	__le32 addr_high;
> +	__le32 addr_low;
> +};
> +
>   /* Update TSE (indirect 0x0403)
>    * Get TSE (indirect 0x0404)
>    * Add TSE (indirect 0x0401)
> @@ -2126,6 +2145,7 @@ struct ice_aq_desc {
>   		struct ice_aqc_get_link_topo get_link_topo;
>   		struct ice_aqc_i2c read_i2c;
>   		struct ice_aqc_read_i2c_resp read_i2c_resp;
> +		struct ice_aqc_get_set_tx_topo get_set_tx_topo;
>   	} params;
>   };
>   
> @@ -2231,6 +2251,9 @@ enum ice_adminq_opc {
>   	ice_aqc_opc_query_sched_res			= 0x0412,
>   	ice_aqc_opc_remove_rl_profiles			= 0x0415,
>   
> +	ice_aqc_opc_set_tx_topo				= 0x0417,
> +	ice_aqc_opc_get_tx_topo				= 0x0418,
> +
>   	/* PHY commands */
>   	ice_aqc_opc_get_phy_caps			= 0x0600,
>   	ice_aqc_opc_set_phy_cfg				= 0x0601,
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> index 9619bdb9e49a..8b65e2bfb160 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -2091,6 +2091,9 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps,
>   			  "%s: reset_restrict_support = %d\n", prefix,
>   			  caps->reset_restrict_support);
>   		break;
> +	case ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE:
> +		caps->tx_sched_topo_comp_mode_en = (number == 1);
> +		break;
>   	default:
>   		/* Not one of the recognized common capabilities */
>   		found = false;
> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> index c73cdab44f70..b3bd345ea0f3 100644
> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
> @@ -4,6 +4,7 @@
>   #include "ice_common.h"
>   #include "ice_flex_pipe.h"
>   #include "ice_flow.h"
> +#include "ice_sched.h"
>   #include "ice.h"
>   
>   /* For supporting double VLAN mode, it is necessary to enable or disable certain
> @@ -1783,6 +1784,207 @@ bool ice_is_init_pkg_successful(enum ice_ddp_state state)
>   	}
>   }
>   
> +/**
> + * ice_get_set_tx_topo - get or set tx topology
> + * @hw: pointer to the HW struct
> + * @buf: pointer to tx topology buffer
> + * @buf_size: buffer size
> + * @cd: pointer to command details structure or NULL
> + * @flags: pointer to descriptor flags
> + * @set: 0-get, 1-set topology
> + *
> + * The function will get or set tx topology
> + */
> +static int
> +ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size,
> +		    struct ice_sq_cd *cd, u8 *flags, bool set)
> +{
> +	struct ice_aqc_get_set_tx_topo *cmd;
> +	struct ice_aq_desc desc;
> +	int status;
> +
> +	cmd = &desc.params.get_set_tx_topo;
> +	if (set) {
> +		cmd->set_flags = ICE_AQC_TX_TOPO_FLAGS_ISSUED;
> +		/* requested to update a new topology, not a default topology */
> +		if (buf)
> +			cmd->set_flags |= ICE_AQC_TX_TOPO_FLAGS_SRC_RAM |
> +					  ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW;
> +	} else {
> +		cmd->get_flags = ICE_AQC_TX_TOPO_GET_RAM;
> +	}
> +	ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_tx_topo);
> +	desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD);
> +	status = ice_aq_send_cmd(hw, &desc, buf, buf_size, cd);
> +	if (status)
> +		return status;
> +	/* read the return flag values (first byte) for get operation */
> +	if (!set && flags)
> +		*flags = desc.params.get_set_tx_topo.set_flags;
> +
> +	return 0;
> +}
> +
> +/**
> + * ice_cfg_tx_topo - Initialize new tx topology if available
> + * @hw: pointer to the HW struct
> + * @buf: pointer to Tx topology buffer
> + * @len: buffer size
> + *
> + * The function will apply the new Tx topology from the package buffer
> + * if available.
> + */
> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
> +{
> +	u8 *current_topo, *new_topo = NULL;
> +	struct ice_run_time_cfg_seg *seg;
> +	struct ice_buf_hdr *section;
> +	struct ice_pkg_hdr *pkg_hdr;
> +	enum ice_ddp_state state;
> +	u16 i, size = 0, offset;

Is limiting the length needed?

> +	u32 reg = 0;
> +	int status;
> +	u8 flags;
> +
> +	if (!buf || !len)
> +		return -EINVAL;
> +
> +	/* Does FW support new Tx topology mode ? */
> +	if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) {
> +		ice_debug(hw, ICE_DBG_INIT, "FW doesn't support compatibility mode\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	current_topo = kzalloc(ICE_AQ_MAX_BUF_LEN, GFP_KERNEL);
> +
> +	if (!current_topo)
> +		return -ENOMEM;
> +
> +	/* get the current Tx topology */
> +	status = ice_get_set_tx_topo(hw, current_topo, ICE_AQ_MAX_BUF_LEN, NULL,
> +				     &flags, false);
> +
> +	kfree(current_topo);
> +
> +	if (status) {
> +		ice_debug(hw, ICE_DBG_INIT, "Get current topology is failed\n");
> +		return status;
> +	}
> +
> +	/* Is default topology already applied ? */
> +	if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
> +	    hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) {
> +		ice_debug(hw, ICE_DBG_INIT, "Loaded default topology\n");
> +		/* Already default topology is loaded */
> +		return -EEXIST;
> +	}
> +
> +	/* Is new topology already applied ? */
> +	if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
> +	    hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
> +		ice_debug(hw, ICE_DBG_INIT, "Loaded new topology\n");
> +		/* Already new topology is loaded */
> +		return -EEXIST;
> +	}
> +
> +	/* Is set topology issued already ? */
> +	if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) {
> +		ice_debug(hw, ICE_DBG_INIT, "Update tx topology was done by another PF\n");
> +		/* add a small delay before exiting */
> +		for (i = 0; i < 20; i++)
> +			msleep(100);
> +		return -EEXIST;
> +	}
> +
> +	/* Change the topology from new to default (5 to 9) */
> +	if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
> +	    hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
> +		ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 layers\n");
> +		goto update_topo;
> +	}
> +
> +	pkg_hdr = (struct ice_pkg_hdr *)buf;
> +	state = ice_verify_pkg(pkg_hdr, len);
> +	if (state) {
> +		ice_debug(hw, ICE_DBG_INIT, "failed to verify pkg (err: %d)\n",
> +			  state);
> +		return -EIO;
> +	}
> +
> +	/* find run time configuration segment */
> +	seg = (struct ice_run_time_cfg_seg *)
> +		ice_find_seg_in_pkg(hw, SEGMENT_TYPE_ICE_RUN_TIME_CFG, pkg_hdr);
> +	if (!seg) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment is missing\n");
> +		return -EIO;
> +	}
> +
> +	if (le32_to_cpu(seg->buf_table.buf_count) < ICE_MIN_S_COUNT) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment count(%d) is wrong\n",
> +			  seg->buf_table.buf_count);
> +		return -EIO;
> +	}
> +
> +	section = ice_pkg_val_buf(seg->buf_table.buf_array);
> +
> +	if (!section || le32_to_cpu(section->section_entry[0].type) !=
> +		ICE_SID_TX_5_LAYER_TOPO) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology section type is wrong\n");
> +		return -EIO;
> +	}
> +
> +	size = le16_to_cpu(section->section_entry[0].size);
> +	offset = le16_to_cpu(section->section_entry[0].offset);
> +	if (size < ICE_MIN_S_SZ || size > ICE_MAX_S_SZ) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology section size is wrong\n");
> +		return -EIO;
> +	}
> +
> +	/* make sure the section fits in the buffer */
> +	if (offset + size > ICE_PKG_BUF_SIZE) {
> +		ice_debug(hw, ICE_DBG_INIT, "5 layer topology buffer > 4K\n");
> +		return -EIO;
> +	}
> +
> +	/* Get the new topology buffer */
> +	new_topo = ((u8 *)section) + offset;
> +
> +update_topo:
> +	/* acquire global lock to make sure that set topology issued
> +	 * by one PF
> +	 */
> +	status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
> +	if (status) {
> +		ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
> +		return status;
> +	}
> +
> +	/* check reset was triggered already or not */
> +	reg = rd32(hw, GLGEN_RSTAT);
> +	if (reg & GLGEN_RSTAT_DEVSTATE_M) {
> +		/* Reset is in progress, re-init the hw again */
> +		ice_debug(hw, ICE_DBG_INIT, "Reset is in progress. layer topology might be applied already\n");
> +		ice_check_reset(hw);
> +		return 0;
> +	}
> +
> +	/* set new topology */
> +	status = ice_get_set_tx_topo(hw, new_topo, size, NULL, NULL, true);
> +	if (status) {
> +		ice_debug(hw, ICE_DBG_INIT, "Set tx topology is failed\n");
> +		return status;
> +	}
> +
> +	/* new topology is updated, delay 1 second before issuing the CORRER */
> +	for (i = 0; i < 10; i++)
> +		msleep(100);

Why make it in a loop?

> +	ice_reset(hw, ICE_RESET_CORER);
> +	/* CORER will clear the global lock, so no explicit call
> +	 * required for release
> +	 */
> +	return 0;
> +}
> +
>   /**
>    * ice_pkg_buf_alloc
>    * @hw: pointer to the HW structure
> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h b/drivers/net/ethernet/intel/ice/ice_flex_type.h
> index 974d14a83b2e..ebbb5a1db8c7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_flex_type.h
> +++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h
> @@ -29,8 +29,9 @@ struct ice_pkg_hdr {
>   
>   /* generic segment */
>   struct ice_generic_seg_hdr {
> -#define SEGMENT_TYPE_METADATA	0x00000001
> -#define SEGMENT_TYPE_ICE	0x00000010
> +#define SEGMENT_TYPE_METADATA	      0x00000001
> +#define SEGMENT_TYPE_ICE	      0x00000010
> +#define SEGMENT_TYPE_ICE_RUN_TIME_CFG 0x00000020
>   	__le32 seg_type;
>   	struct ice_pkg_ver seg_format_ver;
>   	__le32 seg_size;
> @@ -73,6 +74,12 @@ struct ice_buf_table {
>   	struct ice_buf buf_array[];
>   };
>   
> +struct ice_run_time_cfg_seg {
> +	struct ice_generic_seg_hdr hdr;
> +	u8 rsvd[8];
> +	struct ice_buf_table buf_table;
> +};
> +
>   /* global metadata specific segment */
>   struct ice_global_metadata_seg {
>   	struct ice_generic_seg_hdr hdr;
> @@ -181,6 +188,9 @@ struct ice_meta_sect {
>   /* The following define MUST be updated to reflect the last label section ID */
>   #define ICE_SID_LBL_LAST		0x80000038
>   
> +/* Label ICE runtime configuration section IDs */
> +#define ICE_SID_TX_5_LAYER_TOPO		0x10
> +
>   enum ice_block {
>   	ICE_BLK_SW = 0,
>   	ICE_BLK_ACL,
> @@ -706,4 +716,7 @@ struct ice_meta_init_section {
>   	__le16 offset;
>   	struct ice_meta_init_entry entry;
>   };
> +
> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len);
> +
>   #endif /* _ICE_FLEX_TYPE_H_ */
> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h b/drivers/net/ethernet/intel/ice/ice_sched.h
> index 4f91577fed56..86dc0f1f4255 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sched.h
> +++ b/drivers/net/ethernet/intel/ice/ice_sched.h
> @@ -6,6 +6,9 @@
>   
>   #include "ice_common.h"
>   
> +#define ICE_SCHED_5_LAYERS	5
> +#define ICE_SCHED_9_LAYERS	9
> +
>   #define ICE_QGRP_LAYER_OFFSET	2
>   #define ICE_VSI_LAYER_OFFSET	4
>   #define ICE_AGG_LAYER_OFFSET	6
> diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
> index f2a518a1fd94..ff793fe2a2e7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_type.h
> +++ b/drivers/net/ethernet/intel/ice/ice_type.h
> @@ -290,6 +290,7 @@ struct ice_hw_common_caps {
>   	bool pcie_reset_avoidance;
>   	/* Post update reset restriction */
>   	bool reset_restrict_support;
> +	bool tx_sched_topo_comp_mode_en;
>   };
>   
>   /* IEEE 1588 TIME_SYNC specific info */


Kind regards,

Paul
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology
  2022-06-29  8:54     ` Wilczynski, Michal
@ 2022-06-29 15:57       ` Tony Nguyen
  0 siblings, 0 replies; 13+ messages in thread
From: Tony Nguyen @ 2022-06-29 15:57 UTC (permalink / raw)
  To: Wilczynski, Michal, intel-wired-lan; +Cc: Raj Victor



On 6/29/2022 1:54 AM, Wilczynski, Michal wrote:
> Hi, thanks for your review
> 
> On 6/29/2022 12:21 AM, Tony Nguyen wrote:
>>
>>
>> On 6/24/2022 3:21 AM, Michal Wilczynski wrote:
>>> +struct ice_aqc_get_set_tx_topo {
>>> +    u8 set_flags;
>>> +#define ICE_AQC_TX_TOPO_FLAGS_CORRER        BIT(0)
>>> +#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM        BIT(1)
>>> +#define ICE_AQC_TX_TOPO_FLAGS_SET_PSM        BIT(2)
>>> +#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW        BIT(4)
>>> +#define ICE_AQC_TX_TOPO_FLAGS_ISSUED        BIT(5)
>>
>> Not all the defines are being used, please remove the unused ones.
>> Also, please add newline between the member & related defines from the 
>> next to make the associations a little more distinct.
> 
> 
> Hi, good point with the unused ones, but the newline idea doesn't seem 
> to follow the convention used in the file.

This was a more recent comment/change due to maintainer feedback [1]. 
Though we haven't changed existing usages, new ones should comply to it. 
The other option is to pull it out of the structure completely.


https://lore.kernel.org/netdev/1f87f6c4173c239d2c082500509736cf5f18357d.camel@intel.com/
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology
  2022-06-29  9:54     ` Wilczynski, Michal
@ 2022-07-01 13:12       ` Wilczynski, Michal
  0 siblings, 0 replies; 13+ messages in thread
From: Wilczynski, Michal @ 2022-07-01 13:12 UTC (permalink / raw)
  To: Paul Menzel; +Cc: intel-wired-lan

Created a v3, need to clarify one of mine answers

On 6/29/2022 11:54 AM, Wilczynski, Michal wrote:
> Hi, Thanks for your review
>
> On 6/25/2022 9:10 AM, Paul Menzel wrote:
>> Dear Michal,
>>
>>
>> Thank you for your patch.
>>
>> Am 24.06.22 um 12:21 schrieb Michal Wilczynski:
>>> Introduce support for tx scheduler topology change, based on
>>> user selection, from default 9-layer to 5-layer.
>>> In order for switch to be successful there is a new NVM
>>> and DDP package required.
>>
>> What versions are needed exactly?
>
>
> Will try to find exact numbers.
>
>
>>
>>> This commit enables 5-layer topology in init path of
>>> the driver, so before ice driver load, the user selection
>>> should be changed in NVM using some external tools.
>>
>> Please add one example, how to change it.
>
>
> The reason why it's a bit cryptic is that the devlink mechanism is not 
> upstreamed yet, and epct way of doing
>
> that doesn't seem to work with upstream driver.
>
>
> Will post an example of devlink way of doing that, it will be 
> upstreamed very soon.
>
>
>
>>
>> Could you please reflow for 75 characters per line, and do not break 
>> lines just because a sentence ends, or add a blank line between 
>> paragraphs?
>>
>>> Title: Enable switching default tx scheduler topology
>>> Change-type: ImplementationChange
>>
>> These two tags are not used upstream I thinks.
>
>
> Yep, will fix that
>
>
>>
>>> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
>>> ---
>>>   drivers/net/ethernet/intel/ice/ice_common.c   |   2 +
>>>   .../net/ethernet/intel/ice/ice_flex_pipe.c    |   3 +-
>>>   drivers/net/ethernet/intel/ice/ice_main.c     | 113 
>>> +++++++++++++++---
>>>   drivers/net/ethernet/intel/ice/ice_sched.c    |  35 +++---
>>>   4 files changed, 116 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c 
>>> b/drivers/net/ethernet/intel/ice/ice_common.c
>>> index 8b65e2bfb160..167f9d5c345a 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_common.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
>>> @@ -1535,6 +1535,8 @@ ice_aq_send_cmd(struct ice_hw *hw, struct 
>>> ice_aq_desc *desc, void *buf,
>>>       case ice_aqc_opc_set_port_params:
>>>       case ice_aqc_opc_get_vlan_mode_parameters:
>>>       case ice_aqc_opc_set_vlan_mode_parameters:
>>> +    case ice_aqc_opc_set_tx_topo:
>>> +    case ice_aqc_opc_get_tx_topo:
>>>       case ice_aqc_opc_add_recipe:
>>>       case ice_aqc_opc_recipe_to_profile:
>>>       case ice_aqc_opc_get_recipe:
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c 
>>> b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>>> index b3bd345ea0f3..c2359366b48e 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>>> @@ -1953,7 +1953,8 @@ int ice_cfg_tx_topo(struct ice_hw *hw, u8 
>>> *buf, u32 len)
>>>       /* acquire global lock to make sure that set topology issued
>>>        * by one PF
>>>        */
>>> -    status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
>>> +    status = ice_acquire_res(hw, ICE_GLOBAL_CFG_LOCK_RES_ID, 
>>> ICE_RES_WRITE,
>>> +                 ICE_GLOBAL_CFG_LOCK_TIMEOUT);
>>>       if (status) {
>>>           ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global 
>>> lock\n");
>>>           return status;
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c 
>>> b/drivers/net/ethernet/intel/ice/ice_main.c
>>> index c1ac2f746714..5f827bea05d9 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_main.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
>>> @@ -4453,11 +4453,11 @@ static char *ice_get_opt_fw_name(struct 
>>> ice_pf *pf)
>>>   /**
>>>    * ice_request_fw - Device initialization routine
>>>    * @pf: pointer to the PF instance
>>> + * @firmware: double pointer to firmware struct
>>>    */
>>> -static void ice_request_fw(struct ice_pf *pf)
>>> +static int ice_request_fw(struct ice_pf *pf, const struct firmware 
>>> **firmware)
>>>   {
>>>       char *opt_fw_filename = ice_get_opt_fw_name(pf);
>>> -    const struct firmware *firmware = NULL;
>>>       struct device *dev = ice_pf_to_dev(pf);
>>>       int err = 0;
>>>   @@ -4466,29 +4466,98 @@ static void ice_request_fw(struct ice_pf *pf)
>>>        * and warning messages for other errors.
>>>        */
>>>       if (opt_fw_filename) {
>>> -        err = firmware_request_nowarn(&firmware, opt_fw_filename, 
>>> dev);
>>> -        if (err) {
>>> -            kfree(opt_fw_filename);
>>> -            goto dflt_pkg_load;
>>> -        }
>>> -
>>> -        /* request for firmware was successful. Download to device */
>>> -        ice_load_pkg(firmware, pf);
>>> +        err = firmware_request_nowarn(firmware, opt_fw_filename, dev);
>>>           kfree(opt_fw_filename);
>>> -        release_firmware(firmware);
>>> -        return;
>>> +        if (!err)
>>> +            return err;
>>
>> Why is the code above removed?
>
>
> The code received a bit of refactor, logical flow is a bit different, 
> now we don't upload firmware to the device
>
> immediately after loading it from the disk. We change the topology 
> using AQ comand, then trigger CORER.
>
> And only after that we upload firmware to the device.
>
>
>>
>>>       }
>>>   -dflt_pkg_load:
>>> -    err = request_firmware(&firmware, ICE_DDP_PKG_FILE, dev);
>>> -    if (err) {
>>> +    err = request_firmware(firmware, ICE_DDP_PKG_FILE, dev);
>>> +    if (err)
>>>           dev_err(dev, "The DDP package file was not found or could 
>>> not be read. Entering Safe Mode\n");
>>> -        return;
>>> +
>>> +    return err;
>>> +}
>>> +
>>> +/**
>>> + * ice_init_tx_topology - performs Tx topology initialization
>>> + * @hw: pointer to the hardware structure
>>> + * @firmware: pointer to firmware structure
>>> + */
>>> +static int ice_init_tx_topology(struct ice_hw *hw,
>>> +                const struct firmware *firmware)
>>> +{
>>> +    u8 num_tx_sched_layers = hw->num_tx_sched_layers;
>>> +    struct ice_pf *pf = hw->back;
>>> +    struct device *dev;
>>> +    u8 *buf_copy;
>>> +    int err = 0;
>>> +
>>> +    dev = ice_pf_to_dev(pf);
>>> +    /* ice_cfg_tx_topo buf argument is not a constant,
>>> +     * so we have to make a copy
>>> +     */
>>> +    buf_copy = devm_kmemdup(ice_hw_to_dev(hw), firmware->data,
>>> +                firmware->size, GFP_KERNEL);
>>> +
>>> +    err = ice_cfg_tx_topo(hw, buf_copy, firmware->size);
>>> +    if (!err) {
>>> +        if (hw->num_tx_sched_layers > num_tx_sched_layers)
>>> +            dev_info(dev, "Transmit balancing feature disabled\n");
>>
>> Should this be an error?
>
>
> This is an opt-in feature, so we don't treat failure as an error.
>
>
>>
>>> +        else
>>> +            dev_info(dev, "Transmit balancing feature enabled\n");
>>
>> Add a blank line below?
>
> Sure
>
>
>>
>>> +        /* if there was a change in topology ice_cfg_tx_topo triggered
>>> +         * a CORER and we need to re-init hw
>>
>> What is CORER? Add a dot/period at the end?
>
>
> CORER is a type of reset of our hardware. This concept seems to be 
> used extensively in our driver, not sure
>
> if this requires an explanation in the comment.
>
>
>>
>>> +         */
>>> +        ice_deinit_hw(hw);
>>> +        err = ice_init_hw(hw);
>>> +
>>> +        /* in this case we're not allowing safe mode */
>>> +        devm_kfree(ice_hw_to_dev(hw), buf_copy);
>>> +
>>> +        return err;
>>> +
>>> +    } else if (err == -EIO) {
>>> +        dev_info(dev, "DDP package does not support transmit 
>>> balancing feature - please update to the latest DDP package and try 
>>> again\n");
>>
>> Please mention a version, where it’s supposed to work? Can the DDP 
>> version be printed out too?


This works like that by design, cause we have a number of different DDP 
packages with different versions.

So hinting the version user would need upgrade to is not convenient here.


>>
>> Should this be a warning?
>
>
> I will try to find out exact versions required.
>
> This could maybe be a warning, it wasn't cause this is an opt-in feature.
>
>
>>
>>>       }
>>>   -    /* request for firmware was successful. Download to device */
>>> +    devm_kfree(ice_hw_to_dev(hw), buf_copy);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +/**
>>> + * ice_init_ddp_config - DDP related configuration
>>> + * @hw: pointer to the hardware structure
>>> + * @pf: pointer to pf structure
>>> + *
>>> + * This function loads DDP file from the disk, then initializes tx
>>> + * topology. At the end DDP package is loaded on the card.
>>> + */
>>> +static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf)
>>> +{
>>> +    struct device *dev = ice_pf_to_dev(pf);
>>> +    const struct firmware *firmware = NULL;
>>> +    int err = 0;
>>> +
>>> +    err = ice_request_fw(pf, &firmware);
>>> +    if (err)
>>> +    /* we can still operate in safe mode if DDP package load fails */
>>
>> Indent the comment?
>
>
> Sure
>
>
>>
>>> +        return 0;
>>> +
>>> +    err = ice_init_tx_topology(hw, firmware);
>>> +    if (err) {
>>> +        dev_err(dev, "ice_init_hw during change of tx topology 
>>> failed: %d\n",
>>> +            err);
>>> +        release_firmware(firmware);
>>> +        return err;
>>> +    }
>>> +
>>> +    /* Download firmware to device */
>>>       ice_load_pkg(firmware, pf);
>>>       release_firmware(firmware);
>>> +
>>> +    return 0;
>>>   }
>>>     /**
>>> @@ -4641,9 +4710,15 @@ ice_probe(struct pci_dev *pdev, const struct 
>>> pci_device_id __always_unused *ent)
>>>         ice_init_feature_support(pf);
>>>   -    ice_request_fw(pf);
>>> +    err = ice_init_ddp_config(hw, pf);
>>> +
>>> +    /* during topology change ice_init_hw may fail */
>>> +    if (err) {
>>> +        err = -EIO;
>>> +        goto err_exit_unroll;
>>> +    }
>>>   -    /* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
>>> +    /* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit 
>>> won't be
>>>        * set in pf->state, which will cause ice_is_safe_mode to return
>>>        * true
>>>        */
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c 
>>> b/drivers/net/ethernet/intel/ice/ice_sched.c
>>> index 7947223536e3..f18a7121ca55 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_sched.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_sched.c
>>> @@ -1102,12 +1102,11 @@ static u8 ice_sched_get_vsi_layer(struct 
>>> ice_hw *hw)
>>>        *     5 or less       sw_entry_point_layer
>>>        */
>>>       /* calculate the VSI layer based on number of layers. */
>>> -    if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) {
>>> -        u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>>> -
>>> -        if (layer > hw->sw_entry_point_layer)
>>> -            return layer;
>>> -    }
>>> +    if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>>> +        return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>>> +    else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
>>> +        /* qgroup and VSI layers are same */
>>> +        return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
>>>       return hw->sw_entry_point_layer;
>>>   }
>>>   @@ -1124,12 +1123,8 @@ static u8 ice_sched_get_agg_layer(struct 
>>> ice_hw *hw)
>>>        *     7 or less       sw_entry_point_layer
>>>        */
>>>       /* calculate the aggregator layer based on number of layers. */
>>> -    if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) {
>>> -        u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>>> -
>>> -        if (layer > hw->sw_entry_point_layer)
>>> -            return layer;
>>> -    }
>>> +    if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>>> +        return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>>>       return hw->sw_entry_point_layer;
>>>   }
>>>   @@ -1485,10 +1480,11 @@ ice_sched_get_free_qparent(struct 
>>> ice_port_info *pi, u16 vsi_handle, u8 tc,
>>>   {
>>>       struct ice_sched_node *vsi_node, *qgrp_node;
>>>       struct ice_vsi_ctx *vsi_ctx;
>>> +    u8 qgrp_layer, vsi_layer;
>>>       u16 max_children;
>>> -    u8 qgrp_layer;
>>>         qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
>>> +    vsi_layer = ice_sched_get_vsi_layer(pi->hw);
>>>       max_children = pi->hw->max_children[qgrp_layer];
>>>         vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
>>> @@ -1499,6 +1495,12 @@ ice_sched_get_free_qparent(struct 
>>> ice_port_info *pi, u16 vsi_handle, u8 tc,
>>>       if (!vsi_node)
>>>           return NULL;
>>>   +    /* If the queue group and vsi layer are same then queues
>>> +     * are all attached directly to VSI
>>> +     */
>>> +    if (qgrp_layer == vsi_layer)
>>> +        return vsi_node;
>>> +
>>>       /* get the first queue group node from VSI sub-tree */
>>>       qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer);
>>>       while (qgrp_node) {
>>> @@ -3178,8 +3180,9 @@ ice_sched_add_rl_profile(struct ice_port_info 
>>> *pi,
>>>       u8 profile_type;
>>>       int status;
>>>   -    if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
>>> +    if (!pi || layer_num >= pi->hw->num_tx_sched_layers)
>>>           return NULL;
>>> +
>>>       switch (rl_type) {
>>>       case ICE_MIN_BW:
>>>           profile_type = ICE_AQC_RL_PROFILE_TYPE_CIR;
>>> @@ -3194,8 +3197,6 @@ ice_sched_add_rl_profile(struct ice_port_info 
>>> *pi,
>>>           return NULL;
>>>       }
>>>   -    if (!pi)
>>> -        return NULL;
>>
>> This hunk (and above) could be a separate commit?
>
>
> Seems like a good idea
>
>
>>
>>>       hw = pi->hw;
>>>       list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>>>                   list_entry)
>>> @@ -3425,7 +3426,7 @@ ice_sched_rm_rl_profile(struct ice_port_info 
>>> *pi, u8 layer_num, u8 profile_type,
>>>       struct ice_aqc_rl_profile_info *rl_prof_elem;
>>>       int status = 0;
>>>   -    if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
>>> +    if (layer_num >= pi->hw->num_tx_sched_layers)
>>>           return -EINVAL;
>>>       /* Check the existing list for RL profile */
>>>       list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>>
>>
>> Kind regards,
>>
>> Paul
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology
  2022-06-29 10:49   ` Paul Menzel
@ 2022-07-01 13:22     ` Wilczynski, Michal
  0 siblings, 0 replies; 13+ messages in thread
From: Wilczynski, Michal @ 2022-07-01 13:22 UTC (permalink / raw)
  To: Paul Menzel, Raj Victor; +Cc: intel-wired-lan

Hi, Paul

Thanks for the review

On 6/29/2022 12:49 PM, Paul Menzel wrote:
> Dear Raj, dear Michal,
>
>
> Thank you for the patch.
>
> Am 24.06.22 um 12:21 schrieb Michal Wilczynski:
>> From: Raj Victor <victor.raj@intel.com>
>
> I recommend to make the git commit message summary a statement by 
> adding a verb in imperative mood.
>
> Add code to support 5 layer topoloy
>
> Maybe even short it to:
>
> Support 5 layer topology


Changed that in v3


>
>> There was a performance issue reported when the number of VSIs were
>> not multiple of 8. This was caused due to the max children limitation
>
> Maybe even describe the test setup.


Described a test methodology not the full setup, should be detailed 
enough though


>
>> per node(8) in 9 layer topology. The BW credits are shared evenly
>> among the children by default. Assume one node has 8 children and
>> the other has 1. The parent of these nodes share the BW credit equally
>> among them. Apparently this casued a problem for the first node which
>
> caused


sure


>
>> has 8 children. The 9th VM got more BW credits than the first 8 VMs.
>
> I’d use present tense to describe the problem in the commit message.


Yep


>
>> Example:
>> 1) With 8 VM's:
>> tx_queue_0_packets: 23283027
>> tx_queue_1_packets: 23292289
>> tx_queue_2_packets: 23276136
>> tx_queue_3_packets: 23279828
>> tx_queue_4_packets: 23279828
>> tx_queue_5_packets: 23279333
>> tx_queue_6_packets: 23277745
>> tx_queue_7_packets: 23279950
>> tx_queue_8_packets: 0
>>
>> 2) With 9 VM's:
>> tx_queue_0_packets: 24163396
>> tx_queue_1_packets: 24164623
>> tx_queue_2_packets: 24163188
>> tx_queue_3_packets: 24163701
>> tx_queue_4_packets: 24163683
>> tx_queue_5_packets: 24164668
>> tx_queue_6_packets: 23327200
>> tx_queue_7_packets: 24163853
>> tx_queue_8_packets: 91101417
>>
>> So on average queue 8 statistics show that 3.7 times
>> more packets were send there than from the other queues.
>
> Maybe reflow the paragraph to use the full 75 characters per line length.


Sure


>
>> The FW has increased the max number of children per node by reducing
>> the number of layers from 9 to 5. Reflect this on driver side.
>
> In what firmware version did that happen?
>
> In the code you check for some compatibility mode. In what datasheet 
> is that documented?


I filled firmware version, I don't think that the datasheet you're 
referring to is publicly available


>
>> Signed-off-by: Raj Victor <victor.raj@intel.com>
>> Co-developed-by: Michal Wilczynski <michal.wilczynski@intel.com>
>> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
>> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>> ---
>>   .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  23 ++
>>   drivers/net/ethernet/intel/ice/ice_common.c   |   3 +
>>   .../net/ethernet/intel/ice/ice_flex_pipe.c    | 202 ++++++++++++++++++
>>   .../net/ethernet/intel/ice/ice_flex_type.h    |  17 +-
>>   drivers/net/ethernet/intel/ice/ice_sched.h    |   3 +
>>   drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
>>   6 files changed, 247 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h 
>> b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
>> index 05cb9dd7035a..2d1da00b4926 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
>> @@ -120,6 +120,7 @@ struct ice_aqc_list_caps_elem {
>>   #define ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE        0x0076
>>   #define ICE_AQC_CAPS_POST_UPDATE_RESET_RESTRICT        0x0077
>>   #define ICE_AQC_CAPS_NVM_MGMT                0x0080
>> +#define ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE        0x0085
>>         u8 major_ver;
>>       u8 minor_ver;
>> @@ -798,6 +799,24 @@ struct ice_aqc_get_topo {
>>       __le32 addr_low;
>>   };
>>   +/* Get/Set Tx Topology (indirect 0x0418/0x0417) */
>> +struct ice_aqc_get_set_tx_topo {
>> +    u8 set_flags;
>> +#define ICE_AQC_TX_TOPO_FLAGS_CORRER        BIT(0)
>> +#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM        BIT(1)
>> +#define ICE_AQC_TX_TOPO_FLAGS_SET_PSM        BIT(2)
>> +#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW        BIT(4)
>> +#define ICE_AQC_TX_TOPO_FLAGS_ISSUED        BIT(5)
>> +    u8 get_flags;
>> +#define ICE_AQC_TX_TOPO_GET_NO_UPDATE        0
>> +#define ICE_AQC_TX_TOPO_GET_PSM            1
>> +#define ICE_AQC_TX_TOPO_GET_RAM            2
>> +    __le16 reserved1;
>> +    __le32 reserved2;
>> +    __le32 addr_high;
>> +    __le32 addr_low;
>> +};
>> +
>>   /* Update TSE (indirect 0x0403)
>>    * Get TSE (indirect 0x0404)
>>    * Add TSE (indirect 0x0401)
>> @@ -2126,6 +2145,7 @@ struct ice_aq_desc {
>>           struct ice_aqc_get_link_topo get_link_topo;
>>           struct ice_aqc_i2c read_i2c;
>>           struct ice_aqc_read_i2c_resp read_i2c_resp;
>> +        struct ice_aqc_get_set_tx_topo get_set_tx_topo;
>>       } params;
>>   };
>>   @@ -2231,6 +2251,9 @@ enum ice_adminq_opc {
>>       ice_aqc_opc_query_sched_res            = 0x0412,
>>       ice_aqc_opc_remove_rl_profiles            = 0x0415,
>>   +    ice_aqc_opc_set_tx_topo                = 0x0417,
>> +    ice_aqc_opc_get_tx_topo                = 0x0418,
>> +
>>       /* PHY commands */
>>       ice_aqc_opc_get_phy_caps            = 0x0600,
>>       ice_aqc_opc_set_phy_cfg                = 0x0601,
>> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c 
>> b/drivers/net/ethernet/intel/ice/ice_common.c
>> index 9619bdb9e49a..8b65e2bfb160 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_common.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
>> @@ -2091,6 +2091,9 @@ ice_parse_common_caps(struct ice_hw *hw, struct 
>> ice_hw_common_caps *caps,
>>                 "%s: reset_restrict_support = %d\n", prefix,
>>                 caps->reset_restrict_support);
>>           break;
>> +    case ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE:
>> +        caps->tx_sched_topo_comp_mode_en = (number == 1);
>> +        break;
>>       default:
>>           /* Not one of the recognized common capabilities */
>>           found = false;
>> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c 
>> b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> index c73cdab44f70..b3bd345ea0f3 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c
>> @@ -4,6 +4,7 @@
>>   #include "ice_common.h"
>>   #include "ice_flex_pipe.h"
>>   #include "ice_flow.h"
>> +#include "ice_sched.h"
>>   #include "ice.h"
>>     /* For supporting double VLAN mode, it is necessary to enable or 
>> disable certain
>> @@ -1783,6 +1784,207 @@ bool ice_is_init_pkg_successful(enum 
>> ice_ddp_state state)
>>       }
>>   }
>>   +/**
>> + * ice_get_set_tx_topo - get or set tx topology
>> + * @hw: pointer to the HW struct
>> + * @buf: pointer to tx topology buffer
>> + * @buf_size: buffer size
>> + * @cd: pointer to command details structure or NULL
>> + * @flags: pointer to descriptor flags
>> + * @set: 0-get, 1-set topology
>> + *
>> + * The function will get or set tx topology
>> + */
>> +static int
>> +ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size,
>> +            struct ice_sq_cd *cd, u8 *flags, bool set)
>> +{
>> +    struct ice_aqc_get_set_tx_topo *cmd;
>> +    struct ice_aq_desc desc;
>> +    int status;
>> +
>> +    cmd = &desc.params.get_set_tx_topo;
>> +    if (set) {
>> +        cmd->set_flags = ICE_AQC_TX_TOPO_FLAGS_ISSUED;
>> +        /* requested to update a new topology, not a default 
>> topology */
>> +        if (buf)
>> +            cmd->set_flags |= ICE_AQC_TX_TOPO_FLAGS_SRC_RAM |
>> +                      ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW;
>> +    } else {
>> +        cmd->get_flags = ICE_AQC_TX_TOPO_GET_RAM;
>> +    }
>> +    ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_tx_topo);
>> +    desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD);
>> +    status = ice_aq_send_cmd(hw, &desc, buf, buf_size, cd);
>> +    if (status)
>> +        return status;
>> +    /* read the return flag values (first byte) for get operation */
>> +    if (!set && flags)
>> +        *flags = desc.params.get_set_tx_topo.set_flags;
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * ice_cfg_tx_topo - Initialize new tx topology if available
>> + * @hw: pointer to the HW struct
>> + * @buf: pointer to Tx topology buffer
>> + * @len: buffer size
>> + *
>> + * The function will apply the new Tx topology from the package buffer
>> + * if available.
>> + */
>> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
>> +{
>> +    u8 *current_topo, *new_topo = NULL;
>> +    struct ice_run_time_cfg_seg *seg;
>> +    struct ice_buf_hdr *section;
>> +    struct ice_pkg_hdr *pkg_hdr;
>> +    enum ice_ddp_state state;
>> +    u16 i, size = 0, offset;
>
> Is limiting the length needed?


You mean, like setting maximum possible size?

Probably not, cause we're checking later if the size doesn't overflow 
the buffer.


>
>> +    u32 reg = 0;
>> +    int status;
>> +    u8 flags;
>> +
>> +    if (!buf || !len)
>> +        return -EINVAL;
>> +
>> +    /* Does FW support new Tx topology mode ? */
>> +    if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) {
>> +        ice_debug(hw, ICE_DBG_INIT, "FW doesn't support 
>> compatibility mode\n");
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    current_topo = kzalloc(ICE_AQ_MAX_BUF_LEN, GFP_KERNEL);
>> +
>> +    if (!current_topo)
>> +        return -ENOMEM;
>> +
>> +    /* get the current Tx topology */
>> +    status = ice_get_set_tx_topo(hw, current_topo, 
>> ICE_AQ_MAX_BUF_LEN, NULL,
>> +                     &flags, false);
>> +
>> +    kfree(current_topo);
>> +
>> +    if (status) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Get current topology is 
>> failed\n");
>> +        return status;
>> +    }
>> +
>> +    /* Is default topology already applied ? */
>> +    if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
>> +        hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Loaded default topology\n");
>> +        /* Already default topology is loaded */
>> +        return -EEXIST;
>> +    }
>> +
>> +    /* Is new topology already applied ? */
>> +    if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
>> +        hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Loaded new topology\n");
>> +        /* Already new topology is loaded */
>> +        return -EEXIST;
>> +    }
>> +
>> +    /* Is set topology issued already ? */
>> +    if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Update tx topology was done by 
>> another PF\n");
>> +        /* add a small delay before exiting */
>> +        for (i = 0; i < 20; i++)
>> +            msleep(100);
>> +        return -EEXIST;
>> +    }
>> +
>> +    /* Change the topology from new to default (5 to 9) */
>> +    if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
>> +        hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 
>> layers\n");
>> +        goto update_topo;
>> +    }
>> +
>> +    pkg_hdr = (struct ice_pkg_hdr *)buf;
>> +    state = ice_verify_pkg(pkg_hdr, len);
>> +    if (state) {
>> +        ice_debug(hw, ICE_DBG_INIT, "failed to verify pkg (err: %d)\n",
>> +              state);
>> +        return -EIO;
>> +    }
>> +
>> +    /* find run time configuration segment */
>> +    seg = (struct ice_run_time_cfg_seg *)
>> +        ice_find_seg_in_pkg(hw, SEGMENT_TYPE_ICE_RUN_TIME_CFG, 
>> pkg_hdr);
>> +    if (!seg) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment is 
>> missing\n");
>> +        return -EIO;
>> +    }
>> +
>> +    if (le32_to_cpu(seg->buf_table.buf_count) < ICE_MIN_S_COUNT) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment 
>> count(%d) is wrong\n",
>> +              seg->buf_table.buf_count);
>> +        return -EIO;
>> +    }
>> +
>> +    section = ice_pkg_val_buf(seg->buf_table.buf_array);
>> +
>> +    if (!section || le32_to_cpu(section->section_entry[0].type) !=
>> +        ICE_SID_TX_5_LAYER_TOPO) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology section type 
>> is wrong\n");
>> +        return -EIO;
>> +    }
>> +
>> +    size = le16_to_cpu(section->section_entry[0].size);
>> +    offset = le16_to_cpu(section->section_entry[0].offset);
>> +    if (size < ICE_MIN_S_SZ || size > ICE_MAX_S_SZ) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology section size 
>> is wrong\n");
>> +        return -EIO;
>> +    }
>> +
>> +    /* make sure the section fits in the buffer */
>> +    if (offset + size > ICE_PKG_BUF_SIZE) {
>> +        ice_debug(hw, ICE_DBG_INIT, "5 layer topology buffer > 4K\n");
>> +        return -EIO;
>> +    }
>> +
>> +    /* Get the new topology buffer */
>> +    new_topo = ((u8 *)section) + offset;
>> +
>> +update_topo:
>> +    /* acquire global lock to make sure that set topology issued
>> +     * by one PF
>> +     */
>> +    status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
>> +    if (status) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n");
>> +        return status;
>> +    }
>> +
>> +    /* check reset was triggered already or not */
>> +    reg = rd32(hw, GLGEN_RSTAT);
>> +    if (reg & GLGEN_RSTAT_DEVSTATE_M) {
>> +        /* Reset is in progress, re-init the hw again */
>> +        ice_debug(hw, ICE_DBG_INIT, "Reset is in progress. layer 
>> topology might be applied already\n");
>> +        ice_check_reset(hw);
>> +        return 0;
>> +    }
>> +
>> +    /* set new topology */
>> +    status = ice_get_set_tx_topo(hw, new_topo, size, NULL, NULL, true);
>> +    if (status) {
>> +        ice_debug(hw, ICE_DBG_INIT, "Set tx topology is failed\n");
>> +        return status;
>> +    }
>> +
>> +    /* new topology is updated, delay 1 second before issuing the 
>> CORRER */
>> +    for (i = 0; i < 10; i++)
>> +        msleep(100);
>
> Why make it in a loop?


This is tricky to answer, Raj would probably know that, but I think he's 
on vacation.


>
>> +    ice_reset(hw, ICE_RESET_CORER);
>> +    /* CORER will clear the global lock, so no explicit call
>> +     * required for release
>> +     */
>> +    return 0;
>> +}
>> +
>>   /**
>>    * ice_pkg_buf_alloc
>>    * @hw: pointer to the HW structure
>> diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h 
>> b/drivers/net/ethernet/intel/ice/ice_flex_type.h
>> index 974d14a83b2e..ebbb5a1db8c7 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_flex_type.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h
>> @@ -29,8 +29,9 @@ struct ice_pkg_hdr {
>>     /* generic segment */
>>   struct ice_generic_seg_hdr {
>> -#define SEGMENT_TYPE_METADATA    0x00000001
>> -#define SEGMENT_TYPE_ICE    0x00000010
>> +#define SEGMENT_TYPE_METADATA          0x00000001
>> +#define SEGMENT_TYPE_ICE          0x00000010
>> +#define SEGMENT_TYPE_ICE_RUN_TIME_CFG 0x00000020
>>       __le32 seg_type;
>>       struct ice_pkg_ver seg_format_ver;
>>       __le32 seg_size;
>> @@ -73,6 +74,12 @@ struct ice_buf_table {
>>       struct ice_buf buf_array[];
>>   };
>>   +struct ice_run_time_cfg_seg {
>> +    struct ice_generic_seg_hdr hdr;
>> +    u8 rsvd[8];
>> +    struct ice_buf_table buf_table;
>> +};
>> +
>>   /* global metadata specific segment */
>>   struct ice_global_metadata_seg {
>>       struct ice_generic_seg_hdr hdr;
>> @@ -181,6 +188,9 @@ struct ice_meta_sect {
>>   /* The following define MUST be updated to reflect the last label 
>> section ID */
>>   #define ICE_SID_LBL_LAST        0x80000038
>>   +/* Label ICE runtime configuration section IDs */
>> +#define ICE_SID_TX_5_LAYER_TOPO        0x10
>> +
>>   enum ice_block {
>>       ICE_BLK_SW = 0,
>>       ICE_BLK_ACL,
>> @@ -706,4 +716,7 @@ struct ice_meta_init_section {
>>       __le16 offset;
>>       struct ice_meta_init_entry entry;
>>   };
>> +
>> +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len);
>> +
>>   #endif /* _ICE_FLEX_TYPE_H_ */
>> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h 
>> b/drivers/net/ethernet/intel/ice/ice_sched.h
>> index 4f91577fed56..86dc0f1f4255 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_sched.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_sched.h
>> @@ -6,6 +6,9 @@
>>     #include "ice_common.h"
>>   +#define ICE_SCHED_5_LAYERS    5
>> +#define ICE_SCHED_9_LAYERS    9
>> +
>>   #define ICE_QGRP_LAYER_OFFSET    2
>>   #define ICE_VSI_LAYER_OFFSET    4
>>   #define ICE_AGG_LAYER_OFFSET    6
>> diff --git a/drivers/net/ethernet/intel/ice/ice_type.h 
>> b/drivers/net/ethernet/intel/ice/ice_type.h
>> index f2a518a1fd94..ff793fe2a2e7 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_type.h
>> +++ b/drivers/net/ethernet/intel/ice/ice_type.h
>> @@ -290,6 +290,7 @@ struct ice_hw_common_caps {
>>       bool pcie_reset_avoidance;
>>       /* Post update reset restriction */
>>       bool reset_restrict_support;
>> +    bool tx_sched_topo_comp_mode_en;
>>   };
>>     /* IEEE 1588 TIME_SYNC specific info */
>
>
> Kind regards,
>
> Paul
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology
  2022-06-28 22:27   ` Tony Nguyen
@ 2022-07-01 15:19     ` Wilczynski, Michal
  0 siblings, 0 replies; 13+ messages in thread
From: Wilczynski, Michal @ 2022-07-01 15:19 UTC (permalink / raw)
  To: Tony Nguyen, intel-wired-lan

Thanks for your review,

Based on your comments posted v3.

On 6/29/2022 12:27 AM, Tony Nguyen wrote:
>
>
> On 6/24/2022 3:21 AM, Michal Wilczynski wrote:
>> Introduce support for tx scheduler topology change, based on
>> user selection, from default 9-layer to 5-layer.
>> In order for switch to be successful there is a new NVM
>> and DDP package required.
>> This commit enables 5-layer topology in init path of
>> the driver, so before ice driver load, the user selection
>> should be changed in NVM using some external tools.
>>
>> Title: Enable switching default tx scheduler topology
>> Change-type: ImplementationChange
>> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
>> ---
>
> ...
>
>> +/**
>> + * ice_init_tx_topology - performs Tx topology initialization
>> + * @hw: pointer to the hardware structure
>> + * @firmware: pointer to firmware structure
>> + */
>> +static int ice_init_tx_topology(struct ice_hw *hw,
>> +                const struct firmware *firmware)
>> +{
>> +    u8 num_tx_sched_layers = hw->num_tx_sched_layers;
>> +    struct ice_pf *pf = hw->back;
>> +    struct device *dev;
>> +    u8 *buf_copy;
>> +    int err = 0;
>
> This initialization is unnecessary.
>
>> +
>> +    dev = ice_pf_to_dev(pf);
>> +    /* ice_cfg_tx_topo buf argument is not a constant,
>> +     * so we have to make a copy
>> +     */
>> +    buf_copy = devm_kmemdup(ice_hw_to_dev(hw), firmware->data,
>> +                firmware->size, GFP_KERNEL);
>
> It looks like the devm variant is not needed
>
>> +
>> +    err = ice_cfg_tx_topo(hw, buf_copy, firmware->size);
>> +    if (!err) {
>> +        if (hw->num_tx_sched_layers > num_tx_sched_layers)
>> +            dev_info(dev, "Transmit balancing feature disabled\n");
>> +        else
>> +            dev_info(dev, "Transmit balancing feature enabled\n");
>> +        /* if there was a change in topology ice_cfg_tx_topo triggered
>> +         * a CORER and we need to re-init hw
>> +         */
>> +        ice_deinit_hw(hw);
>> +        err = ice_init_hw(hw);
>> +
>> +        /* in this case we're not allowing safe mode */
>> +        devm_kfree(ice_hw_to_dev(hw), buf_copy);
>> +
>> +        return err;
>> +
>> +    } else if (err == -EIO) {
>> +        dev_info(dev, "DDP package does not support transmit 
>> balancing feature - please update to the latest DDP package and try 
>> again\n");
>>       }
>>   -    /* request for firmware was successful. Download to device */
>> +    devm_kfree(ice_hw_to_dev(hw), buf_copy);
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * ice_init_ddp_config - DDP related configuration
>> + * @hw: pointer to the hardware structure
>> + * @pf: pointer to pf structure
>> + *
>> + * This function loads DDP file from the disk, then initializes tx
>> + * topology. At the end DDP package is loaded on the card.
>> + */
>> +static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf)
>> +{
>> +    struct device *dev = ice_pf_to_dev(pf);
>> +    const struct firmware *firmware = NULL;
>> +    int err = 0;
>
> Initialization not needed.
>
>> +
>> +    err = ice_request_fw(pf, &firmware);
>> +    if (err)
>> +    /* we can still operate in safe mode if DDP package load fails */
>> +        return 0;
>> +
>> +    err = ice_init_tx_topology(hw, firmware);
>> +    if (err) {
>> +        dev_err(dev, "ice_init_hw during change of tx topology 
>> failed: %d\n",
>> +            err);
>> +        release_firmware(firmware);
>> +        return err;
>> +    }
>> +
>> +    /* Download firmware to device */
>>       ice_load_pkg(firmware, pf);
>>       release_firmware(firmware);
>> +
>> +    return 0;
>>   }
>>     /**
>> @@ -4641,9 +4710,15 @@ ice_probe(struct pci_dev *pdev, const struct 
>> pci_device_id __always_unused *ent)
>>         ice_init_feature_support(pf);
>>   -    ice_request_fw(pf);
>> +    err = ice_init_ddp_config(hw, pf);
>> +
>> +    /* during topology change ice_init_hw may fail */
>> +    if (err) {
>> +        err = -EIO;
>> +        goto err_exit_unroll;
>> +    }
>>   -    /* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
>> +    /* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit won't be
>>        * set in pf->state, which will cause ice_is_safe_mode to return
>>        * true
>>        */
>> diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c 
>> b/drivers/net/ethernet/intel/ice/ice_sched.c
>> index 7947223536e3..f18a7121ca55 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_sched.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_sched.c
>> @@ -1102,12 +1102,11 @@ static u8 ice_sched_get_vsi_layer(struct 
>> ice_hw *hw)
>>        *     5 or less       sw_entry_point_layer
>>        */
>>       /* calculate the VSI layer based on number of layers. */
>> -    if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) {
>> -        u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>> -
>> -        if (layer > hw->sw_entry_point_layer)
>> -            return layer;
>> -    }
>> +    if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>> +        return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>> +    else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
>> +        /* qgroup and VSI layers are same */
>> +        return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
>>       return hw->sw_entry_point_layer;
>
> These can all be if's since they all return:
>
>     if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>         return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET;
>     if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS)
>         /* qgroup and VSI layers are same */
>         return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET;
>      return hw->sw_entry_point_layer;
>
>>   }
>>   @@ -1124,12 +1123,8 @@ static u8 ice_sched_get_agg_layer(struct 
>> ice_hw *hw)
>>        *     7 or less       sw_entry_point_layer
>>        */
>>       /* calculate the aggregator layer based on number of layers. */
>> -    if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) {
>> -        u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>> -
>> -        if (layer > hw->sw_entry_point_layer)
>> -            return layer;
>> -    }
>> +    if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS)
>> +        return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET;
>>       return hw->sw_entry_point_layer;
>>   }
>>   @@ -1485,10 +1480,11 @@ ice_sched_get_free_qparent(struct 
>> ice_port_info *pi, u16 vsi_handle, u8 tc,
>>   {
>>       struct ice_sched_node *vsi_node, *qgrp_node;
>>       struct ice_vsi_ctx *vsi_ctx;
>> +    u8 qgrp_layer, vsi_layer;
>>       u16 max_children;
>> -    u8 qgrp_layer;
>>         qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
>> +    vsi_layer = ice_sched_get_vsi_layer(pi->hw);
>>       max_children = pi->hw->max_children[qgrp_layer];
>>         vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
>> @@ -1499,6 +1495,12 @@ ice_sched_get_free_qparent(struct 
>> ice_port_info *pi, u16 vsi_handle, u8 tc,
>>       if (!vsi_node)
>>           return NULL;
>>   +    /* If the queue group and vsi layer are same then queues
>> +     * are all attached directly to VSI
>> +     */
>> +    if (qgrp_layer == vsi_layer)
>> +        return vsi_node;
>> +
>>       /* get the first queue group node from VSI sub-tree */
>>       qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer);
>>       while (qgrp_node) {
>> @@ -3178,8 +3180,9 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>>       u8 profile_type;
>>       int status;
>>   -    if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
>> +    if (!pi || layer_num >= pi->hw->num_tx_sched_layers)
>>           return NULL;
>> +
>>       switch (rl_type) {
>>       case ICE_MIN_BW:
>>           profile_type = ICE_AQC_RL_PROFILE_TYPE_CIR;
>> @@ -3194,8 +3197,6 @@ ice_sched_add_rl_profile(struct ice_port_info *pi,
>>           return NULL;
>>       }
>>   -    if (!pi)
>> -        return NULL;
>>       hw = pi->hw;
>>       list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
>>                   list_entry)
>> @@ -3425,7 +3426,7 @@ ice_sched_rm_rl_profile(struct ice_port_info 
>> *pi, u8 layer_num, u8 profile_type,
>>       struct ice_aqc_rl_profile_info *rl_prof_elem;
>>       int status = 0;
>>   -    if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM)
>> +    if (layer_num >= pi->hw->num_tx_sched_layers)
>>           return -EINVAL;
>>       /* Check the existing list for RL profile */
>>       list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num],
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-07-01 15:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-24 10:21 [Intel-wired-lan] [PATCH net-next v2 0/2] ice: Support 5 layer tx scheduler topology Michal Wilczynski
2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 1/2] ice: Code added to support 5 layer topology Michal Wilczynski
2022-06-28 22:21   ` Tony Nguyen
2022-06-29  8:54     ` Wilczynski, Michal
2022-06-29 15:57       ` Tony Nguyen
2022-06-29 10:49   ` Paul Menzel
2022-07-01 13:22     ` Wilczynski, Michal
2022-06-24 10:21 ` [Intel-wired-lan] [PATCH net-next v2 2/2] ice: Enable switching default tx scheduler topology Michal Wilczynski
2022-06-25  7:10   ` Paul Menzel
2022-06-29  9:54     ` Wilczynski, Michal
2022-07-01 13:12       ` Wilczynski, Michal
2022-06-28 22:27   ` Tony Nguyen
2022-07-01 15:19     ` Wilczynski, Michal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.