* [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice
@ 2021-12-22  6:21 Michal Swiatkowski
  2021-12-22  6:21 ` [Intel-wired-lan] [PATCH net-next 1/3] ice: add check for eswitch support Michal Swiatkowski
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Michal Swiatkowski @ 2021-12-22  6:21 UTC (permalink / raw)
  To: intel-wired-lan

The ice driver uses the old PCI IRQ reservation API. Change the ice
driver to use the current API.

Implement a fallback mechanism where, if the driver can't reserve the
maximum number of interrupts, it will limit the number of queues or
disable capabilities.

The first two patches add the ability to turn eswitch offload on and
off. This is needed when the driver can't reserve the maximum number of
interrupts; in that case the driver turns off eswitch offload, as it can
work without it. Additionally, the eswitch can be supported only if
SR-IOV is available, so set the eswitch capability only if SR-IOV is
supported.

Michal Swiatkowski (3):
  ice: add check for eswitch support
  ice: change mode only if eswitch is supported
  ice: use new alloc irqs API

 drivers/net/ethernet/intel/ice/Makefile      |   3 +-
 drivers/net/ethernet/intel/ice/ice.h         |   4 +-
 drivers/net/ethernet/intel/ice/ice_arfs.c    |   3 +-
 drivers/net/ethernet/intel/ice/ice_eswitch.c |  46 +++-
 drivers/net/ethernet/intel/ice/ice_eswitch.h |  12 +
 drivers/net/ethernet/intel/ice/ice_irq.c     | 213 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_irq.h     |  12 +
 drivers/net/ethernet/intel/ice/ice_lib.c     |   5 +-
 drivers/net/ethernet/intel/ice/ice_main.c    | 220 +------------------
 drivers/net/ethernet/intel/ice/ice_xsk.c     |   3 +-
 10 files changed, 300 insertions(+), 221 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Intel-wired-lan] [PATCH net-next 1/3] ice: add check for eswitch support
  2021-12-22  6:21 [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Michal Swiatkowski
@ 2021-12-22  6:21 ` Michal Swiatkowski
  2021-12-22  6:22 ` [Intel-wired-lan] [PATCH net-next 2/3] ice: change mode only if eswitch is supported Michal Swiatkowski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Michal Swiatkowski @ 2021-12-22  6:21 UTC (permalink / raw)
  To: intel-wired-lan

The driver supports eswitch mode if the hardware has SR-IOV
capabilities and CONFIG_ICE_SWITCHDEV is enabled.

Create a function to check whether the eswitch is supported. Use it in
MSI-X reservation to avoid reserving an additional vector when the
eswitch isn't supported.

Introduce a new capability flag to allow the driver to disable eswitch
support when MSI-X reservation fails.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h         |  1 +
 drivers/net/ethernet/intel/ice/ice_eswitch.c | 29 ++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_eswitch.h | 12 ++++++++
 drivers/net/ethernet/intel/ice/ice_main.c    | 15 ++++++----
 4 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 0b9b5b9c24b6..1ca309feabbf 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -488,6 +488,7 @@ enum ice_pf_flags {
 	ICE_FLAG_MDD_AUTO_RESET_VF,
 	ICE_FLAG_VF_VLAN_PRUNING,
 	ICE_FLAG_LINK_LENIENT_MODE_ENA,
+	ICE_FLAG_ESWITCH_CAPABLE,
 	ICE_FLAG_GNSS,			/* GNSS successfully initialized */
 	ICE_PF_FLAGS_NBITS		/* must be last */
 };
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
index 30a00fe59c52..fbe640d501c6 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
@@ -678,3 +678,32 @@ int ice_eswitch_rebuild(struct ice_pf *pf)
 
 	return 0;
 }
+
+/**
+ * ice_is_eswitch_supported - check if eswitch can be supported
+ * @pf: pointer to PF structure
+ */
+bool ice_is_eswitch_supported(struct ice_pf *pf)
+{
+	return test_bit(ICE_FLAG_ESWITCH_CAPABLE, pf->flags);
+}
+
+/**
+ * ice_eswitch_set_cap - set eswitch cap based on SRIOV cap
+ * @pf: pointer to PF structure
+ */
+void ice_eswitch_set_cap(struct ice_pf *pf)
+{
+	clear_bit(ICE_FLAG_ESWITCH_CAPABLE, pf->flags);
+	if (test_bit(ICE_FLAG_SRIOV_CAPABLE, pf->flags))
+		set_bit(ICE_FLAG_ESWITCH_CAPABLE, pf->flags);
+}
+
+/**
+ * ice_eswitch_clear_cap - clear switchdev cap when driver can't support it
+ * @pf: pointer to PF structure
+ */
+void ice_eswitch_clear_cap(struct ice_pf *pf)
+{
+	clear_bit(ICE_FLAG_ESWITCH_CAPABLE, pf->flags);
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.h b/drivers/net/ethernet/intel/ice/ice_eswitch.h
index 0d0fadaf2ba5..b405e1c9c2bc 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.h
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.h
@@ -29,6 +29,9 @@ void ice_eswitch_set_target_vsi(struct sk_buff *skb,
 				struct ice_tx_offload_params *off);
 netdev_tx_t
 ice_eswitch_port_start_xmit(struct sk_buff *skb, struct net_device *netdev);
+bool ice_is_eswitch_supported(struct ice_pf *pf);
+void ice_eswitch_set_cap(struct ice_pf *pf);
+void ice_eswitch_clear_cap(struct ice_pf *pf);
 #else /* CONFIG_ICE_SWITCHDEV */
 static inline void ice_eswitch_release(struct ice_pf *pf) { }
 
@@ -36,6 +39,9 @@ static inline void ice_eswitch_stop_all_tx_queues(struct ice_pf *pf) { }
 static inline void ice_eswitch_replay_vf_mac_rule(struct ice_vf *vf) { }
 static inline void ice_eswitch_del_vf_mac_rule(struct ice_vf *vf) { }
 
+static inline void ice_eswitch_set_cap(struct ice_pf *pf) { }
+static inline void ice_eswitch_clear_cap(struct ice_pf *pf) { }
+
 static inline int
 ice_eswitch_add_vf_mac_rule(struct ice_pf *pf, struct ice_vf *vf,
 			    const u8 *mac)
@@ -81,5 +87,11 @@ ice_eswitch_port_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 {
 	return NETDEV_TX_BUSY;
 }
+
+static inline bool
+ice_is_eswitch_supported(struct ice_pf *pf)
+{
+	return false;
+}
 #endif /* CONFIG_ICE_SWITCHDEV */
 #endif /* _ICE_ESWITCH_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 296c4dd90e26..e31c01673d3a 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3735,6 +3735,8 @@ static void ice_set_pf_caps(struct ice_pf *pf)
 	if (func_caps->common_cap.ieee_1588)
 		set_bit(ICE_FLAG_PTP_SUPPORTED, pf->flags);
 
+	ice_eswitch_set_cap(pf);
+
 	pf->max_pf_txqs = func_caps->common_cap.num_txq;
 	pf->max_pf_rxqs = func_caps->common_cap.num_rxq;
 }
@@ -3810,11 +3812,13 @@ static int ice_ena_msix_range(struct ice_pf *pf)
 	}
 
 	/* reserve for switchdev */
-	needed = ICE_ESWITCH_MSIX;
-	if (v_left < needed)
-		goto no_hw_vecs_left_err;
-	v_budget += needed;
-	v_left -= needed;
+	if (ice_is_eswitch_supported(pf)) {
+		needed = ICE_ESWITCH_MSIX;
+		if (v_left < needed)
+			goto no_hw_vecs_left_err;
+		v_budget += needed;
+		v_left -= needed;
+	}
 
 	/* total used for non-traffic vectors */
 	v_other = v_budget;
@@ -3920,6 +3924,7 @@ static int ice_ena_msix_range(struct ice_pf *pf)
 		needed, v_left);
 	err = -ERANGE;
 exit_err:
+	ice_eswitch_clear_cap(pf);
 	pf->num_rdma_msix = 0;
 	pf->num_lan_msix = 0;
 	return err;
-- 
2.31.1



* [Intel-wired-lan] [PATCH net-next 2/3] ice: change mode only if eswitch is supported
  2021-12-22  6:21 [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Michal Swiatkowski
  2021-12-22  6:21 ` [Intel-wired-lan] [PATCH net-next 1/3] ice: add check for eswitch support Michal Swiatkowski
@ 2021-12-22  6:22 ` Michal Swiatkowski
  2021-12-22  6:22 ` [Intel-wired-lan] [PATCH net-next 3/3] ice: use new alloc irqs API Michal Swiatkowski
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Michal Swiatkowski @ 2021-12-22  6:22 UTC (permalink / raw)
  To: intel-wired-lan

Eswitch support can be turned off when MSI-X reservation fails. Handle
this situation by returning an error from the function that changes the
eswitch mode.

Also add missing trailing newlines in the dev_info calls.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/ice_eswitch.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
index fbe640d501c6..2b6c7d27ac96 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
@@ -518,25 +518,34 @@ ice_eswitch_mode_set(struct devlink *devlink, u16 mode,
 		     struct netlink_ext_ack *extack)
 {
 	struct ice_pf *pf = devlink_priv(devlink);
+	struct device *dev = ice_pf_to_dev(pf);
 
 	if (pf->eswitch_mode == mode)
 		return 0;
 
 	if (pf->num_alloc_vfs) {
-		dev_info(ice_pf_to_dev(pf), "Changing eswitch mode is allowed only if there is no VFs created");
-		NL_SET_ERR_MSG_MOD(extack, "Changing eswitch mode is allowed only if there is no VFs created");
+		dev_info(dev, "Changing eswitch mode is allowed only if there is no VFs created\n");
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Changing eswitch mode is allowed only if there is no VFs created");
+		return -EOPNOTSUPP;
+	}
+
+	if (!ice_is_eswitch_supported(pf)) {
+		dev_info(dev, "There is no eswitch support or eswitch resource allocation failed\n");
+		NL_SET_ERR_MSG_MOD(extack,
+				   "There is no eswitch support or eswitch resource allocation failed");
 		return -EOPNOTSUPP;
 	}
 
 	switch (mode) {
 	case DEVLINK_ESWITCH_MODE_LEGACY:
-		dev_info(ice_pf_to_dev(pf), "PF %d changed eswitch mode to legacy",
+		dev_info(ice_pf_to_dev(pf), "PF %d changed eswitch mode to legacy\n",
 			 pf->hw.pf_id);
 		NL_SET_ERR_MSG_MOD(extack, "Changed eswitch mode to legacy");
 		break;
 	case DEVLINK_ESWITCH_MODE_SWITCHDEV:
 	{
-		dev_info(ice_pf_to_dev(pf), "PF %d changed eswitch mode to switchdev",
+		dev_info(ice_pf_to_dev(pf), "PF %d changed eswitch mode to switchdev\n",
 			 pf->hw.pf_id);
 		NL_SET_ERR_MSG_MOD(extack, "Changed eswitch mode to switchdev");
 		break;
-- 
2.31.1



* [Intel-wired-lan] [PATCH net-next 3/3] ice: use new alloc irqs API
  2021-12-22  6:21 [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Michal Swiatkowski
  2021-12-22  6:21 ` [Intel-wired-lan] [PATCH net-next 1/3] ice: add check for eswitch support Michal Swiatkowski
  2021-12-22  6:22 ` [Intel-wired-lan] [PATCH net-next 2/3] ice: change mode only if eswitch is supported Michal Swiatkowski
@ 2021-12-22  6:22 ` Michal Swiatkowski
  2021-12-22 16:33 ` [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Jonathan Toppins
  2022-04-27  0:24 ` Tony Nguyen
  4 siblings, 0 replies; 6+ messages in thread
From: Michal Swiatkowski @ 2021-12-22  6:22 UTC (permalink / raw)
  To: intel-wired-lan

Move the code related to allocating and managing IRQs to a new file to
separate it from ice_main.c.

With the new API, system vector numbers are tracked by the kernel and
the driver can look them up with pci_irq_vector(). There is no need to
track these values on the driver side.

As there is no function to get the exact number of granted IRQs, rewrite
the IRQ enabling function to adjust the number of IRQs based on the
value returned by the kernel.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
---
 drivers/net/ethernet/intel/ice/Makefile   |   3 +-
 drivers/net/ethernet/intel/ice/ice.h      |   3 +-
 drivers/net/ethernet/intel/ice/ice_arfs.c |   3 +-
 drivers/net/ethernet/intel/ice/ice_irq.c  | 213 +++++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_irq.h  |  12 ++
 drivers/net/ethernet/intel/ice/ice_lib.c  |   5 +-
 drivers/net/ethernet/intel/ice/ice_main.c | 221 +---------------------
 drivers/net/ethernet/intel/ice/ice_xsk.c  |   3 +-
 8 files changed, 243 insertions(+), 220 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.c
 create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.h

diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile
index 8771ac8460a7..c69da65d6f31 100644
--- a/drivers/net/ethernet/intel/ice/Makefile
+++ b/drivers/net/ethernet/intel/ice/Makefile
@@ -32,7 +32,8 @@ ice-y := ice_main.o	\
 	 ice_lag.o	\
 	 ice_ethtool.o  \
 	 ice_repr.o	\
-	 ice_tc_lib.o
+	 ice_tc_lib.o	\
+	 ice_irq.o
 ice-$(CONFIG_PCI_IOV) += ice_virtchnl_allowlist.o
 ice-$(CONFIG_PCI_IOV) += ice_virtchnl_pf.o ice_sriov.o ice_virtchnl_fdir.o ice_vf_vsi_vlan_ops.o
 ice-$(CONFIG_PTP_1588_CLOCK) += ice_ptp.o ice_ptp_hw.o
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 1ca309feabbf..1c6437b4dea2 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -92,6 +92,8 @@
 #define ICE_MIN_LAN_OICR_MSIX	1
 #define ICE_MIN_MSIX		(ICE_MIN_LAN_TXRX_MSIX + ICE_MIN_LAN_OICR_MSIX)
 #define ICE_FDIR_MSIX		2
+#define ICE_MIN_LAN_MSIX        1
+#define ICE_OICR_MSIX           1
 #define ICE_RDMA_NUM_AEQ_MSIX	4
 #define ICE_MIN_RDMA_MSIX	2
 #define ICE_ESWITCH_MSIX	1
@@ -517,7 +519,6 @@ struct ice_pf {
 	struct devlink_port devlink_port;
 
 	/* OS reserved IRQ details */
-	struct msix_entry *msix_entries;
 	struct ice_res_tracker *irq_tracker;
 	/* First MSIX vector used by SR-IOV VFs. Calculated by subtracting the
 	 * number of MSIX vectors needed for all SR-IOV VFs from the number of
diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
index 5daade32ea62..5bdfabd14c83 100644
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c
+++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
@@ -2,6 +2,7 @@
 /* Copyright (C) 2018-2020, Intel Corporation. */
 
 #include "ice.h"
+#include "ice_irq.h"
 
 /**
  * ice_is_arfs_active - helper to check is aRFS is active
@@ -616,7 +617,7 @@ int ice_set_cpu_rx_rmap(struct ice_vsi *vsi)
 	base_idx = vsi->base_vector;
 	ice_for_each_q_vector(vsi, i)
 		if (irq_cpu_rmap_add(netdev->rx_cpu_rmap,
-				     pf->msix_entries[base_idx + i].vector)) {
+				     ice_get_irq_num(pf, base_idx + i))) {
 			ice_free_cpu_rx_rmap(vsi);
 			return -EINVAL;
 		}
diff --git a/drivers/net/ethernet/intel/ice/ice_irq.c b/drivers/net/ethernet/intel/ice/ice_irq.c
new file mode 100644
index 000000000000..26a894911a8d
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_irq.c
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2021, Intel Corporation. */
+
+#include "ice.h"
+#include "ice_lib.h"
+#include "ice_irq.h"
+
+static void ice_dis_msix(struct ice_pf *pf)
+{
+	pci_free_irq_vectors(pf->pdev);
+}
+
+static int ice_ena_msix(struct ice_pf *pf, int nvec)
+{
+	return pci_alloc_irq_vectors(pf->pdev, ICE_MIN_MSIX, nvec,
+				     PCI_IRQ_MSIX);
+}
+
+#define ICE_ADJ_VEC_STEPS 5
+static void ice_adj_vec_sum(int *dst, int *src)
+{
+	int i;
+
+	for (i = 0; i < ICE_ADJ_VEC_STEPS; i++)
+		dst[i] += src[i];
+}
+
+/**
+ * ice_ena_msix_range - request a range of MSI-X vectors from the OS
+ * @pf: board private structure
+ *
+ * The driver tries to enable the best-case number of MSI-X vectors. If that
+ * doesn't succeed, it adjusts to the number of IRQs returned by the kernel.
+ *
+ * The fallback logic is described below, with each [#] representing the number
+ * of IRQs needed for that step. The chosen step is the highest one whose
+ * requirement does not exceed the received vector count. If even the lowest
+ * step needs more vectors than were received, return an error.
+ *
+ * Step [4]: Enable the best-case scenario MSI-X vectors.
+ *
+ * Step [3]: Enable MSI-X vectors with eswitch support disabled.
+ *
+ * Step [2]: Enable MSI-X vectors with the number of pf->num_lan_msix reduced
+ * by a factor of 2 from the previous step (i.e. num_possible_cpus() / 2).
+ * Also, with the number of pf->num_rdma_msix reduced by a factor of ~2 from
+ * the previous step (i.e. num_possible_cpus() / 2 + ICE_RDMA_NUM_AEQ_MSIX).
+ *
+ * Step [1]: Same as step [2], except reduce both by a factor of 4.
+ *
+ * Step [0]: Enable the bare-minimum MSI-X vectors.
+ *
+ * Each feature has a separate table with the IRQs it needs at each step. The
+ * sum of these tables is tracked in adj_vec to show the total IRQs needed at
+ * each step. The per-feature tables are later used to set the correct number
+ * of IRQs for each feature based on the chosen step.
+ */
+static int ice_ena_msix_range(struct ice_pf *pf)
+{
+	enum {
+		ICE_ADJ_VEC_WORST_CASE	= 0,
+		ICE_ADJ_VEC_STEP_1	= 1,
+		ICE_ADJ_VEC_STEP_2	= 2,
+		ICE_ADJ_VEC_STEP_3	= 3,
+		ICE_ADJ_VEC_BEST_CASE	= ICE_ADJ_VEC_STEPS - 1,
+	};
+	int num_cpus = num_possible_cpus();
+	int rdma_adj_vec[ICE_ADJ_VEC_STEPS] = {
+		[ICE_ADJ_VEC_WORST_CASE] = ICE_MIN_RDMA_MSIX,
+		[ICE_ADJ_VEC_STEP_1] = num_cpus / 4 > ICE_MIN_RDMA_MSIX ?
+			num_cpus / 4 + ICE_RDMA_NUM_AEQ_MSIX :
+			ICE_MIN_RDMA_MSIX,
+		[ICE_ADJ_VEC_STEP_2] = num_cpus / 2 > ICE_MIN_RDMA_MSIX ?
+			num_cpus / 2 + ICE_RDMA_NUM_AEQ_MSIX :
+			ICE_MIN_RDMA_MSIX,
+		[ICE_ADJ_VEC_STEP_3] = num_cpus > ICE_MIN_RDMA_MSIX ?
+			num_cpus + ICE_RDMA_NUM_AEQ_MSIX : ICE_MIN_RDMA_MSIX,
+		[ICE_ADJ_VEC_BEST_CASE] = num_cpus > ICE_MIN_RDMA_MSIX ?
+			num_cpus + ICE_RDMA_NUM_AEQ_MSIX : ICE_MIN_RDMA_MSIX,
+	};
+	int lan_adj_vec[ICE_ADJ_VEC_STEPS] = {
+		[ICE_ADJ_VEC_WORST_CASE] = ICE_MIN_LAN_MSIX,
+		[ICE_ADJ_VEC_STEP_1] =
+			max_t(int, num_cpus / 4, ICE_MIN_LAN_MSIX),
+		[ICE_ADJ_VEC_STEP_2] =
+			max_t(int, num_cpus / 2, ICE_MIN_LAN_MSIX),
+		[ICE_ADJ_VEC_STEP_3] =
+			max_t(int, num_cpus, ICE_MIN_LAN_MSIX),
+		[ICE_ADJ_VEC_BEST_CASE] =
+			max_t(int, num_cpus, ICE_MIN_LAN_MSIX),
+	};
+	int fdir_adj_vec[ICE_ADJ_VEC_STEPS] = {
+		ICE_FDIR_MSIX, ICE_FDIR_MSIX, ICE_FDIR_MSIX,
+		ICE_FDIR_MSIX, ICE_FDIR_MSIX,
+	};
+	int adj_vec[ICE_ADJ_VEC_STEPS] = {
+		ICE_OICR_MSIX, ICE_OICR_MSIX, ICE_OICR_MSIX,
+		ICE_OICR_MSIX, ICE_OICR_MSIX,
+	};
+	int eswitch_adj_vec[ICE_ADJ_VEC_STEPS] = {
+		0, 0, 0, 0,
+		[ICE_ADJ_VEC_BEST_CASE] = ICE_ESWITCH_MSIX,
+	};
+	struct device *dev = ice_pf_to_dev(pf);
+	int adj_step = ICE_ADJ_VEC_BEST_CASE;
+	int needed = ICE_OICR_MSIX;
+	int err = -ENOSPC;
+	int v_actual, i;
+
+	needed += lan_adj_vec[ICE_ADJ_VEC_BEST_CASE];
+	ice_adj_vec_sum(adj_vec, lan_adj_vec);
+
+	if (ice_is_eswitch_supported(pf)) {
+		needed += eswitch_adj_vec[ICE_ADJ_VEC_BEST_CASE];
+		ice_adj_vec_sum(adj_vec, eswitch_adj_vec);
+	} else {
+		memset(&eswitch_adj_vec, 0, sizeof(eswitch_adj_vec));
+	}
+
+	if (ice_is_rdma_ena(pf)) {
+		needed += rdma_adj_vec[ICE_ADJ_VEC_BEST_CASE];
+		ice_adj_vec_sum(adj_vec, rdma_adj_vec);
+	} else {
+		memset(&rdma_adj_vec, 0, sizeof(rdma_adj_vec));
+	}
+
+	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
+		needed += fdir_adj_vec[ICE_ADJ_VEC_BEST_CASE];
+		ice_adj_vec_sum(adj_vec, fdir_adj_vec);
+	} else {
+		memset(&fdir_adj_vec, 0, sizeof(fdir_adj_vec));
+	}
+
+	v_actual = ice_ena_msix(pf, needed);
+	if (v_actual < 0) {
+		err = v_actual;
+		goto err;
+	} else if (v_actual < adj_vec[ICE_ADJ_VEC_WORST_CASE]) {
+		ice_dis_msix(pf);
+		goto err;
+	}
+
+	for (i = ICE_ADJ_VEC_WORST_CASE + 1; i < ICE_ADJ_VEC_STEPS; i++) {
+		if (v_actual < adj_vec[i]) {
+			adj_step = i - 1;
+			break;
+		}
+	}
+
+	pf->num_lan_msix = lan_adj_vec[adj_step];
+	pf->num_rdma_msix = rdma_adj_vec[adj_step];
+
+	if (ice_is_eswitch_supported(pf) &&
+	    !eswitch_adj_vec[adj_step]) {
+		dev_warn(dev, "Not enough MSI-X for eswitch support, disabling feature\n");
+	}
+
+	return v_actual;
+
+err:
+	dev_err(dev, "Failed to enable MSI-X vectors\n");
+	return err;
+}
+
+/**
+ * ice_init_interrupt_scheme - Determine proper interrupt scheme
+ * @pf: board private structure to initialize
+ */
+int ice_init_interrupt_scheme(struct ice_pf *pf)
+{
+	int vectors = ice_ena_msix_range(pf);
+
+	if (vectors < 0)
+		return vectors;
+
+	/* set up vector assignment tracking */
+	pf->irq_tracker =
+		kzalloc(struct_size(pf->irq_tracker, list, vectors),
+			GFP_KERNEL);
+	if (!pf->irq_tracker) {
+		ice_dis_msix(pf);
+		return -ENOMEM;
+	}
+
+	/* populate SW interrupts pool with number of OS granted IRQs. */
+	pf->num_avail_sw_msix = (u16)vectors;
+	pf->irq_tracker->num_entries = (u16)vectors;
+	pf->irq_tracker->end = pf->irq_tracker->num_entries;
+
+	return 0;
+}
+
+/**
+ * ice_clear_interrupt_scheme - Undo things done by ice_init_interrupt_scheme
+ * @pf: board private structure
+ */
+void ice_clear_interrupt_scheme(struct ice_pf *pf)
+{
+	ice_dis_msix(pf);
+
+	kfree(pf->irq_tracker);
+	pf->irq_tracker = NULL;
+}
+
+/**
+ * ice_get_irq_num - get system irq number based on index from driver
+ * @pf: board private structure
+ * @idx: driver irq index
+ */
+int ice_get_irq_num(struct ice_pf *pf, int idx)
+{
+	return pci_irq_vector(pf->pdev, idx);
+}
diff --git a/drivers/net/ethernet/intel/ice/ice_irq.h b/drivers/net/ethernet/intel/ice/ice_irq.h
new file mode 100644
index 000000000000..f4db6964f9c7
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_irq.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2021, Intel Corporation. */
+
+#ifndef _ICE_IRQ_H_
+#define _ICE_IRQ_H_
+
+int ice_init_interrupt_scheme(struct ice_pf *pf);
+void ice_clear_interrupt_scheme(struct ice_pf *pf);
+
+int ice_get_irq_num(struct ice_pf *pf, int idx);
+
+#endif
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 4e9efd49c149..6e7d121b6746 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -9,6 +9,7 @@
 #include "ice_dcb_lib.h"
 #include "ice_devlink.h"
 #include "ice_vsi_vlan_ops.h"
+#include "ice_irq.h"
 
 /**
  * ice_vsi_type_str - maps VSI type enum to string equivalents
@@ -2659,7 +2660,7 @@ void ice_vsi_free_irq(struct ice_vsi *vsi)
 		u16 vector = i + base;
 		int irq_num;
 
-		irq_num = pf->msix_entries[vector].vector;
+		irq_num = ice_get_irq_num(pf, vector);
 
 		/* free only the irqs that were actually requested */
 		if (!vsi->q_vectors[i] ||
@@ -2837,7 +2838,7 @@ void ice_vsi_dis_irq(struct ice_vsi *vsi)
 		return;
 
 	ice_for_each_q_vector(vsi, i)
-		synchronize_irq(pf->msix_entries[i + base].vector);
+		synchronize_irq(ice_get_irq_num(pf, i + base));
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index e31c01673d3a..7ae5bf4f70d8 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -22,6 +22,7 @@
 #include "ice_eswitch.h"
 #include "ice_tc_lib.h"
 #include "ice_vsi_vlan_ops.h"
+#include "ice_irq.h"
 
 #define DRV_SUMMARY	"Intel(R) Ethernet Connection E800 Series Linux Driver"
 static const char ice_driver_string[] = DRV_SUMMARY;
@@ -2418,7 +2419,7 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
 	for (vector = 0; vector < q_vectors; vector++) {
 		struct ice_q_vector *q_vector = vsi->q_vectors[vector];
 
-		irq_num = pf->msix_entries[base + vector].vector;
+		irq_num = ice_get_irq_num(pf, base + vector);
 
 		if (q_vector->tx.tx_ring && q_vector->rx.rx_ring) {
 			snprintf(q_vector->name, sizeof(q_vector->name) - 1,
@@ -2467,7 +2468,7 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
 free_q_irqs:
 	while (vector) {
 		vector--;
-		irq_num = pf->msix_entries[base + vector].vector;
+		irq_num = ice_get_irq_num(pf, base + vector);
 		if (!IS_ENABLED(CONFIG_RFS_ACCEL))
 			irq_set_affinity_notifier(irq_num, NULL);
 		irq_set_affinity_hint(irq_num, NULL);
@@ -3085,6 +3086,7 @@ static void ice_dis_ctrlq_interrupts(struct ice_hw *hw)
  */
 static void ice_free_irq_msix_misc(struct ice_pf *pf)
 {
+	int irq_num = ice_get_irq_num(pf, pf->oicr_idx);
 	struct ice_hw *hw = &pf->hw;
 
 	ice_dis_ctrlq_interrupts(hw);
@@ -3093,11 +3095,8 @@ static void ice_free_irq_msix_misc(struct ice_pf *pf)
 	wr32(hw, PFINT_OICR_ENA, 0);
 	ice_flush(hw);
 
-	if (pf->msix_entries) {
-		synchronize_irq(pf->msix_entries[pf->oicr_idx].vector);
-		devm_free_irq(ice_pf_to_dev(pf),
-			      pf->msix_entries[pf->oicr_idx].vector, pf);
-	}
+	synchronize_irq(irq_num);
+	devm_free_irq(ice_pf_to_dev(pf), irq_num, pf);
 
 	pf->num_avail_sw_msix += 1;
 	ice_free_res(pf->irq_tracker, pf->oicr_idx, ICE_RES_MISC_VEC_ID);
@@ -3167,7 +3166,7 @@ static int ice_req_irq_msix_misc(struct ice_pf *pf)
 	pf->num_avail_sw_msix -= 1;
 	pf->oicr_idx = (u16)oicr_idx;
 
-	err = devm_request_irq(dev, pf->msix_entries[pf->oicr_idx].vector,
+	err = devm_request_irq(dev, ice_get_irq_num(pf, pf->oicr_idx),
 			       ice_misc_intr, 0, pf->int_name, pf);
 	if (err) {
 		dev_err(dev, "devm_request_irq for %s failed: %d\n",
@@ -3779,212 +3778,6 @@ static int ice_init_pf(struct ice_pf *pf)
 	return 0;
 }
 
-/**
- * ice_ena_msix_range - Request a range of MSIX vectors from the OS
- * @pf: board private structure
- *
- * compute the number of MSIX vectors required (v_budget) and request from
- * the OS. Return the number of vectors reserved or negative on failure
- */
-static int ice_ena_msix_range(struct ice_pf *pf)
-{
-	int num_cpus, v_left, v_actual, v_other, v_budget = 0;
-	struct device *dev = ice_pf_to_dev(pf);
-	int needed, err, i;
-
-	v_left = pf->hw.func_caps.common_cap.num_msix_vectors;
-	num_cpus = num_online_cpus();
-
-	/* reserve for LAN miscellaneous handler */
-	needed = ICE_MIN_LAN_OICR_MSIX;
-	if (v_left < needed)
-		goto no_hw_vecs_left_err;
-	v_budget += needed;
-	v_left -= needed;
-
-	/* reserve for flow director */
-	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
-		needed = ICE_FDIR_MSIX;
-		if (v_left < needed)
-			goto no_hw_vecs_left_err;
-		v_budget += needed;
-		v_left -= needed;
-	}
-
-	/* reserve for switchdev */
-	if (ice_is_eswitch_supported(pf)) {
-		needed = ICE_ESWITCH_MSIX;
-		if (v_left < needed)
-			goto no_hw_vecs_left_err;
-		v_budget += needed;
-		v_left -= needed;
-	}
-
-	/* total used for non-traffic vectors */
-	v_other = v_budget;
-
-	/* reserve vectors for LAN traffic */
-	needed = num_cpus;
-	if (v_left < needed)
-		goto no_hw_vecs_left_err;
-	pf->num_lan_msix = needed;
-	v_budget += needed;
-	v_left -= needed;
-
-	/* reserve vectors for RDMA auxiliary driver */
-	if (ice_is_rdma_ena(pf)) {
-		needed = num_cpus + ICE_RDMA_NUM_AEQ_MSIX;
-		if (v_left < needed)
-			goto no_hw_vecs_left_err;
-		pf->num_rdma_msix = needed;
-		v_budget += needed;
-		v_left -= needed;
-	}
-
-	pf->msix_entries = devm_kcalloc(dev, v_budget,
-					sizeof(*pf->msix_entries), GFP_KERNEL);
-	if (!pf->msix_entries) {
-		err = -ENOMEM;
-		goto exit_err;
-	}
-
-	for (i = 0; i < v_budget; i++)
-		pf->msix_entries[i].entry = i;
-
-	/* actually reserve the vectors */
-	v_actual = pci_enable_msix_range(pf->pdev, pf->msix_entries,
-					 ICE_MIN_MSIX, v_budget);
-	if (v_actual < 0) {
-		dev_err(dev, "unable to reserve MSI-X vectors\n");
-		err = v_actual;
-		goto msix_err;
-	}
-
-	if (v_actual < v_budget) {
-		dev_warn(dev, "not enough OS MSI-X vectors. requested = %d, obtained = %d\n",
-			 v_budget, v_actual);
-
-		if (v_actual < ICE_MIN_MSIX) {
-			/* error if we can't get minimum vectors */
-			pci_disable_msix(pf->pdev);
-			err = -ERANGE;
-			goto msix_err;
-		} else {
-			int v_remain = v_actual - v_other;
-			int v_rdma = 0, v_min_rdma = 0;
-
-			if (ice_is_rdma_ena(pf)) {
-				/* Need at least 1 interrupt in addition to
-				 * AEQ MSIX
-				 */
-				v_rdma = ICE_RDMA_NUM_AEQ_MSIX + 1;
-				v_min_rdma = ICE_MIN_RDMA_MSIX;
-			}
-
-			if (v_actual == ICE_MIN_MSIX ||
-			    v_remain < ICE_MIN_LAN_TXRX_MSIX + v_min_rdma) {
-				dev_warn(dev, "Not enough MSI-X vectors to support RDMA.\n");
-				clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
-
-				pf->num_rdma_msix = 0;
-				pf->num_lan_msix = ICE_MIN_LAN_TXRX_MSIX;
-			} else if ((v_remain < ICE_MIN_LAN_TXRX_MSIX + v_rdma) ||
-				   (v_remain - v_rdma < v_rdma)) {
-				/* Support minimum RDMA and give remaining
-				 * vectors to LAN MSIX
-				 */
-				pf->num_rdma_msix = v_min_rdma;
-				pf->num_lan_msix = v_remain - v_min_rdma;
-			} else {
-				/* Split remaining MSIX with RDMA after
-				 * accounting for AEQ MSIX
-				 */
-				pf->num_rdma_msix = (v_remain - ICE_RDMA_NUM_AEQ_MSIX) / 2 +
-						    ICE_RDMA_NUM_AEQ_MSIX;
-				pf->num_lan_msix = v_remain - pf->num_rdma_msix;
-			}
-
-			dev_notice(dev, "Enabled %d MSI-X vectors for LAN traffic.\n",
-				   pf->num_lan_msix);
-
-			if (ice_is_rdma_ena(pf))
-				dev_notice(dev, "Enabled %d MSI-X vectors for RDMA.\n",
-					   pf->num_rdma_msix);
-		}
-	}
-
-	return v_actual;
-
-msix_err:
-	devm_kfree(dev, pf->msix_entries);
-	goto exit_err;
-
-no_hw_vecs_left_err:
-	dev_err(dev, "not enough device MSI-X vectors. requested = %d, available = %d\n",
-		needed, v_left);
-	err = -ERANGE;
-exit_err:
-	ice_eswitch_clear_cap(pf);
-	pf->num_rdma_msix = 0;
-	pf->num_lan_msix = 0;
-	return err;
-}
-
-/**
- * ice_dis_msix - Disable MSI-X interrupt setup in OS
- * @pf: board private structure
- */
-static void ice_dis_msix(struct ice_pf *pf)
-{
-	pci_disable_msix(pf->pdev);
-	devm_kfree(ice_pf_to_dev(pf), pf->msix_entries);
-	pf->msix_entries = NULL;
-}
-
-/**
- * ice_clear_interrupt_scheme - Undo things done by ice_init_interrupt_scheme
- * @pf: board private structure
- */
-static void ice_clear_interrupt_scheme(struct ice_pf *pf)
-{
-	ice_dis_msix(pf);
-
-	if (pf->irq_tracker) {
-		devm_kfree(ice_pf_to_dev(pf), pf->irq_tracker);
-		pf->irq_tracker = NULL;
-	}
-}
-
-/**
- * ice_init_interrupt_scheme - Determine proper interrupt scheme
- * @pf: board private structure to initialize
- */
-static int ice_init_interrupt_scheme(struct ice_pf *pf)
-{
-	int vectors;
-
-	vectors = ice_ena_msix_range(pf);
-
-	if (vectors < 0)
-		return vectors;
-
-	/* set up vector assignment tracking */
-	pf->irq_tracker = devm_kzalloc(ice_pf_to_dev(pf),
-				       struct_size(pf->irq_tracker, list, vectors),
-				       GFP_KERNEL);
-	if (!pf->irq_tracker) {
-		ice_dis_msix(pf);
-		return -ENOMEM;
-	}
-
-	/* populate SW interrupts pool with number of OS granted IRQs. */
-	pf->num_avail_sw_msix = (u16)vectors;
-	pf->irq_tracker->num_entries = (u16)vectors;
-	pf->irq_tracker->end = pf->irq_tracker->num_entries;
-
-	return 0;
-}
-
 /**
  * ice_is_wol_supported - check if WoL is supported
  * @hw: pointer to hardware info
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 0749f0e7a11c..d32571e2abb4 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -11,6 +11,7 @@
 #include "ice_txrx.h"
 #include "ice_txrx_lib.h"
 #include "ice_lib.h"
+#include "ice_irq.h"
 
 static struct xdp_buff **ice_xdp_buf(struct ice_rx_ring *rx_ring, u32 idx)
 {
@@ -94,7 +95,7 @@ ice_qvec_dis_irq(struct ice_vsi *vsi, struct ice_rx_ring *rx_ring,
 
 		wr32(hw, GLINT_DYN_CTL(q_vector->reg_idx), 0);
 		ice_flush(hw);
-		synchronize_irq(pf->msix_entries[v_idx + base].vector);
+		synchronize_irq(ice_get_irq_num(pf, v_idx + base));
 	}
 }
 
-- 
2.31.1



* [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice
  2021-12-22  6:21 [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Michal Swiatkowski
                   ` (2 preceding siblings ...)
  2021-12-22  6:22 ` [Intel-wired-lan] [PATCH net-next 3/3] ice: use new alloc irqs API Michal Swiatkowski
@ 2021-12-22 16:33 ` Jonathan Toppins
  2022-04-27  0:24 ` Tony Nguyen
  4 siblings, 0 replies; 6+ messages in thread
From: Jonathan Toppins @ 2021-12-22 16:33 UTC (permalink / raw)
  To: intel-wired-lan

On 12/22/21 01:21, Michal Swiatkowski wrote:
> The ice driver uses the old PCI IRQ reservation API. Change the ice
> driver to use the current API.
> 
> Implement a fallback mechanism where, if the driver can't reserve the
> maximum number of interrupts, it will limit the number of queues or
> disable capabilities.
> 
> First two patches add ability to turn on and off eswitch offload. This
> is needed when driver can't reserve maximum number of interrupts. In
> this case driver turns off eswitch offload as driver can work
> without it. Additionally, the eswitch can be supported only if SRIOV is
> available, so set eswitch capabilities only if SRIOV is supported.
> 
> Michal Swiatkowski (3):
>    ice: add check for eswitch support
>    ice: change mode only if eswitch is supported
>    ice: use new alloc irqs API
> 
>   drivers/net/ethernet/intel/ice/Makefile      |   3 +-
>   drivers/net/ethernet/intel/ice/ice.h         |   4 +-
>   drivers/net/ethernet/intel/ice/ice_arfs.c    |   3 +-
>   drivers/net/ethernet/intel/ice/ice_eswitch.c |  46 +++-
>   drivers/net/ethernet/intel/ice/ice_eswitch.h |  12 +
>   drivers/net/ethernet/intel/ice/ice_irq.c     | 213 ++++++++++++++++++
>   drivers/net/ethernet/intel/ice/ice_irq.h     |  12 +
>   drivers/net/ethernet/intel/ice/ice_lib.c     |   5 +-
>   drivers/net/ethernet/intel/ice/ice_main.c    | 220 +------------------
>   drivers/net/ethernet/intel/ice/ice_xsk.c     |   3 +-
>   10 files changed, 300 insertions(+), 221 deletions(-)
>   create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.c
>   create mode 100644 drivers/net/ethernet/intel/ice/ice_irq.h
> 

Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>



* [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice
  2021-12-22  6:21 [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Michal Swiatkowski
                   ` (3 preceding siblings ...)
  2021-12-22 16:33 ` [Intel-wired-lan] [PATCH net-next 0/3] refactor irq allocation in ice Jonathan Toppins
@ 2022-04-27  0:24 ` Tony Nguyen
  4 siblings, 0 replies; 6+ messages in thread
From: Tony Nguyen @ 2022-04-27  0:24 UTC (permalink / raw)
  To: intel-wired-lan


On 12/21/2021 10:21 PM, Michal Swiatkowski wrote:
> The ice driver uses the old PCI IRQ reservation API. Change the ice
> driver to use the current API.
>
> Implement a fallback mechanism where, if the driver can't reserve the
> maximum number of interrupts, it will limit the number of queues or
> disable capabilities.

A very similar implementation was already rejected by netdev [1]. I 
believe there's kernel work being done for dynamic MSIX allocations 
which is the route we should look to take on this (when it's completed).

Thanks,

Tony

[1] https://lore.kernel.org/netdev/20210113234226.3638426-1-anthony.l.nguyen@intel.com/



