* [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core
@ 2023-04-11 18:26 edward.cree
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct edward.cree
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

Make the core responsible for tracking the set of custom RSS contexts,
 their IDs, indirection tables, hash keys, and hash functions; this
 lets us get rid of duplicative code in drivers, and will allow us to
 support netlink dumps later.

This series only moves the sfc EF10 & EF100 driver over to the new API; if
 the design is approved, I plan to post a follow-up series to convert the
 other drivers and remove the legacy API.  (However, I don't have hardware
 for the drivers besides sfc, so I won't be able to test those myself.)

Edward Cree (7):
  net: move ethtool-related netdev state into its own struct
  net: ethtool: attach an IDR of custom RSS contexts to a netdevice
  net: ethtool: record custom RSS contexts in the IDR
  net: ethtool: let the core choose RSS context IDs
  net: ethtool: add an extack parameter to new rxfh_context APIs
  net: ethtool: add a mutex protecting RSS contexts
  sfc: use new rxfh_context API

 drivers/net/ethernet/realtek/r8169_main.c |   4 +-
 drivers/net/ethernet/sfc/ef10.c           |   2 +-
 drivers/net/ethernet/sfc/ef100_ethtool.c  |   5 +-
 drivers/net/ethernet/sfc/efx.c            |   2 +-
 drivers/net/ethernet/sfc/efx.h            |   2 +-
 drivers/net/ethernet/sfc/efx_common.c     |  10 +-
 drivers/net/ethernet/sfc/ethtool.c        |   5 +-
 drivers/net/ethernet/sfc/ethtool_common.c | 147 +++++++++++++---------
 drivers/net/ethernet/sfc/ethtool_common.h |  18 ++-
 drivers/net/ethernet/sfc/mcdi_filters.c   | 133 ++++++++++----------
 drivers/net/ethernet/sfc/mcdi_filters.h   |   8 +-
 drivers/net/ethernet/sfc/net_driver.h     |  28 ++---
 drivers/net/ethernet/sfc/rx_common.c      |  64 ++--------
 drivers/net/ethernet/sfc/rx_common.h      |   8 +-
 drivers/net/phy/phy.c                     |   2 +-
 drivers/net/phy/phy_device.c              |   4 +-
 drivers/net/phy/phylink.c                 |   2 +-
 include/linux/ethtool.h                   | 109 +++++++++++++++-
 include/linux/netdevice.h                 |   7 +-
 net/core/dev.c                            |  38 ++++++
 net/ethtool/ioctl.c                       | 124 ++++++++++++++++--
 net/ethtool/wol.c                         |   2 +-
 22 files changed, 484 insertions(+), 240 deletions(-)



* [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
@ 2023-04-11 18:26 ` edward.cree
  2023-04-11 20:37   ` Andrew Lunn
  2023-04-13  1:36   ` Jakub Kicinski
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice edward.cree
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

net_dev->ethtool is a pointer to a new struct ethtool_netdev_state, which
 currently contains only the wol_enabled field.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
Changes in v2: New patch.
---
 drivers/net/ethernet/realtek/r8169_main.c | 4 ++--
 drivers/net/phy/phy.c                     | 2 +-
 drivers/net/phy/phy_device.c              | 4 ++--
 drivers/net/phy/phylink.c                 | 2 +-
 include/linux/ethtool.h                   | 8 ++++++++
 include/linux/netdevice.h                 | 7 ++++---
 net/core/dev.c                            | 4 ++++
 net/ethtool/ioctl.c                       | 2 +-
 net/ethtool/wol.c                         | 2 +-
 9 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 9f8357bbc8a4..356f43fac74f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -1451,7 +1451,7 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
 
 	if (tp->dash_type == RTL_DASH_NONE) {
 		rtl_set_d3_pll_down(tp, !wolopts);
-		tp->dev->wol_enabled = wolopts ? 1 : 0;
+		tp->dev->ethtool->wol_enabled = wolopts ? 1 : 0;
 	}
 }
 
@@ -5330,7 +5330,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		rtl_set_d3_pll_down(tp, true);
 	} else {
 		rtl_set_d3_pll_down(tp, false);
-		dev->wol_enabled = 1;
+		dev->ethtool->wol_enabled = 1;
 	}
 
 	jumbo_max = rtl_jumbo_max(tp);
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 0c0df38cd1ab..2d8307e9c351 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1245,7 +1245,7 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
 		if (netdev) {
 			struct device *parent = netdev->dev.parent;
 
-			if (netdev->wol_enabled)
+			if (netdev->ethtool->wol_enabled)
 				pm_system_wakeup();
 			else if (device_may_wakeup(&netdev->dev))
 				pm_wakeup_dev_event(&netdev->dev, 0, true);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 917ba84105fc..535002e75dc5 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -281,7 +281,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	if (!netdev)
 		goto out;
 
-	if (netdev->wol_enabled)
+	if (netdev->ethtool->wol_enabled)
 		return false;
 
 	/* As long as not all affected network drivers support the
@@ -1859,7 +1859,7 @@ int phy_suspend(struct phy_device *phydev)
 
 	/* If the device has WOL enabled, we cannot suspend the PHY */
 	phy_ethtool_get_wol(phydev, &wol);
-	if (wol.wolopts || (netdev && netdev->wol_enabled))
+	if (wol.wolopts || (netdev && netdev->ethtool->wol_enabled))
 		return -EBUSY;
 
 	if (!phydrv || !phydrv->suspend)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index f7da96f0c75b..c332d8950f01 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2005,7 +2005,7 @@ void phylink_suspend(struct phylink *pl, bool mac_wol)
 {
 	ASSERT_RTNL();
 
-	if (mac_wol && (!pl->netdev || pl->netdev->wol_enabled)) {
+	if (mac_wol && (!pl->netdev || pl->netdev->ethtool->wol_enabled)) {
 		/* Wake-on-Lan enabled, MAC handling */
 		mutex_lock(&pl->state_mutex);
 
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 798d35890118..c73b28df301c 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -934,6 +934,14 @@ int ethtool_virtdev_set_link_ksettings(struct net_device *dev,
 				       const struct ethtool_link_ksettings *cmd,
 				       u32 *dev_speed, u8 *dev_duplex);
 
+/**
+ * struct ethtool_netdev_state - per-netdevice state for ethtool features
+ * @wol_enabled:	Wake-on-LAN is enabled
+ */
+struct ethtool_netdev_state {
+	unsigned		wol_enabled:1;
+};
+
 struct phy_device;
 struct phy_tdr_config;
 struct phy_plca_cfg;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a740be3bb911..1915a6221096 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -77,6 +77,7 @@ struct udp_tunnel_nic;
 struct bpf_prog;
 struct xdp_buff;
 struct xdp_md;
+struct ethtool_netdev_state;
 
 void synchronize_net(void);
 void netdev_set_default_ethtool_ops(struct net_device *dev,
@@ -2015,8 +2016,6 @@ enum netdev_ml_priv_type {
  *			switch driver and used to set the phys state of the
  *			switch port.
  *
- *	@wol_enabled:	Wake-on-LAN is enabled
- *
  *	@threaded:	napi threaded mode is enabled
  *
  *	@net_notifier_list:	List of per-net netdev notifier block
@@ -2028,6 +2027,7 @@ enum netdev_ml_priv_type {
  *	@udp_tunnel_nic_info:	static structure describing the UDP tunnel
  *				offload capabilities of the device
  *	@udp_tunnel_nic:	UDP tunnel offload state
+ *	@ethtool:	ethtool related state
  *	@xdp_state:		stores info on attached XDP BPF programs
  *
  *	@nested_level:	Used as a parameter of spin_lock_nested() of
@@ -2385,7 +2385,6 @@ struct net_device {
 	struct sfp_bus		*sfp_bus;
 	struct lock_class_key	*qdisc_tx_busylock;
 	bool			proto_down;
-	unsigned		wol_enabled:1;
 	unsigned		threaded:1;
 
 	struct list_head	net_notifier_list;
@@ -2397,6 +2396,8 @@ struct net_device {
 	const struct udp_tunnel_nic_info	*udp_tunnel_nic_info;
 	struct udp_tunnel_nic	*udp_tunnel_nic;
 
+	struct ethtool_netdev_state *ethtool;
+
 	/* protected by rtnl_lock */
 	struct bpf_xdp_entity	xdp_state[__MAX_XDP_MODE];
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 7ce5985be84b..93960861a11f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10676,6 +10676,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	dev->real_num_rx_queues = rxqs;
 	if (netif_alloc_rx_queues(dev))
 		goto free_all;
+	dev->ethtool = kzalloc(sizeof(*dev->ethtool), GFP_KERNEL_ACCOUNT);
+	if (!dev->ethtool)
+		goto free_all;
 
 	strcpy(dev->name, name);
 	dev->name_assign_type = name_assign_type;
@@ -10726,6 +10729,7 @@ void free_netdev(struct net_device *dev)
 		return;
 	}
 
+	kfree(dev->ethtool);
 	netif_free_tx_queues(dev);
 	netif_free_rx_queues(dev);
 
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 59adc4e6e9ee..0effaca4ff9e 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1449,7 +1449,7 @@ static int ethtool_set_wol(struct net_device *dev, char __user *useraddr)
 	if (ret)
 		return ret;
 
-	dev->wol_enabled = !!wol.wolopts;
+	dev->ethtool->wol_enabled = !!wol.wolopts;
 	ethtool_notify(dev, ETHTOOL_MSG_WOL_NTF, NULL);
 
 	return 0;
diff --git a/net/ethtool/wol.c b/net/ethtool/wol.c
index a4a43d9e6e9d..820578b70073 100644
--- a/net/ethtool/wol.c
+++ b/net/ethtool/wol.c
@@ -136,7 +136,7 @@ ethnl_set_wol(struct ethnl_req_info *req_info, struct genl_info *info)
 	ret = dev->ethtool_ops->set_wol(dev, &wol);
 	if (ret)
 		return ret;
-	dev->wol_enabled = !!wol.wolopts;
+	dev->ethtool->wol_enabled = !!wol.wolopts;
 	return 1;
 }
 


* [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct edward.cree
@ 2023-04-11 18:26 ` edward.cree
  2023-04-11 20:36   ` Andrew Lunn
  2023-04-13  1:39   ` Jakub Kicinski
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR edward.cree
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

Each context stores the RXFH settings (indir, key, and hfunc) and,
 optionally, some driver private data.
Delete any still-existing contexts at netdev unregister time.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
Changes in v2: fix data area to ensure proper alignment (kuba)
---
 include/linux/ethtool.h | 41 +++++++++++++++++++++++++++++++++++++++++
 net/core/dev.c          | 23 +++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index c73b28df301c..7963b06da484 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -157,6 +157,43 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
 	return index % n_rx_rings;
 }
 
+/**
+ * struct ethtool_rxfh_context - a custom RSS context configuration
+ * @indir_size: Number of u32 entries in indirection table
+ * @key_size: Size of hash key, in bytes
+ * @hfunc: RSS hash function identifier.  One of the %ETH_RSS_HASH_*
+ * @priv_size: Size of driver private data, in bytes
+ * @indir_no_change: indir was not specified at create time
+ * @key_no_change: hkey was not specified at create time
+ */
+struct ethtool_rxfh_context {
+	u32 indir_size;
+	u32 key_size;
+	u8 hfunc;
+	u16 priv_size;
+	u8 indir_no_change:1;
+	u8 key_no_change:1;
+	/* private: driver private data, indirection table, and hash key are
+	 * stored sequentially in @data area.  Use below helpers to access.
+	 */
+	u8 data[] __aligned(sizeof(void *));
+};
+
+static inline void *ethtool_rxfh_context_priv(struct ethtool_rxfh_context *ctx)
+{
+	return ctx->data;
+}
+
+static inline u32 *ethtool_rxfh_context_indir(struct ethtool_rxfh_context *ctx)
+{
+	return (u32 *)(ctx->data + ALIGN(ctx->priv_size, sizeof(u32)));
+}
+
+static inline u8 *ethtool_rxfh_context_key(struct ethtool_rxfh_context *ctx)
+{
+	return (u8 *)(ethtool_rxfh_context_indir(ctx) + ctx->indir_size);
+}
+
 /* declare a link mode bitmap */
 #define __ETHTOOL_DECLARE_LINK_MODE_MASK(name)		\
 	DECLARE_BITMAP(name, __ETHTOOL_LINK_MODE_MASK_NBITS)
@@ -936,9 +973,13 @@ int ethtool_virtdev_set_link_ksettings(struct net_device *dev,
 
 /**
  * struct ethtool_netdev_state - per-netdevice state for ethtool features
+ * @rss_ctx_max_id:	maximum (exclusive) supported RSS context ID
+ * @rss_ctx:		IDR storing custom RSS context state
  * @wol_enabled:	Wake-on-LAN is enabled
  */
 struct ethtool_netdev_state {
+	u32			rss_ctx_max_id;
+	struct idr		rss_ctx;
 	unsigned		wol_enabled:1;
 };
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 93960861a11f..c9ed9f6ea695 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9983,6 +9983,9 @@ int register_netdevice(struct net_device *dev)
 	if (ret)
 		return ret;
 
+	/* rss ctx ID 0 is reserved for the default context, start from 1 */
+	idr_init_base(&dev->ethtool->rss_ctx, 1);
+
 	spin_lock_init(&dev->addr_list_lock);
 	netdev_set_addr_lockdep_class(dev);
 
@@ -10781,6 +10784,24 @@ void synchronize_net(void)
 }
 EXPORT_SYMBOL(synchronize_net);
 
+static void netdev_rss_contexts_free(struct net_device *dev)
+{
+	struct ethtool_rxfh_context *ctx;
+	u32 context;
+
+	if (!dev->ethtool_ops->set_rxfh_context)
+		return;
+	idr_for_each_entry(&dev->ethtool->rss_ctx, ctx, context) {
+		u32 *indir = ethtool_rxfh_context_indir(ctx);
+		u8 *key = ethtool_rxfh_context_key(ctx);
+
+		idr_remove(&dev->ethtool->rss_ctx, context);
+		dev->ethtool_ops->set_rxfh_context(dev, indir, key, ctx->hfunc,
+						   &context, true);
+		kfree(ctx);
+	}
+}
+
 /**
  *	unregister_netdevice_queue - remove device from the kernel
  *	@dev: device
@@ -10885,6 +10906,8 @@ void unregister_netdevice_many_notify(struct list_head *head,
 		netdev_name_node_alt_flush(dev);
 		netdev_name_node_free(dev->name_node);
 
+		netdev_rss_contexts_free(dev);
+
 		call_netdevice_notifiers(NETDEV_PRE_UNINIT, dev);
 
 		if (dev->netdev_ops->ndo_uninit)
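
To illustrate the @data layout this patch describes (driver private
data first, then the indirection table at the next u32 boundary, then
the hash key), here is a minimal userspace sketch.  It is not kernel
code: struct rxfh_ctx_model, ALIGN_UP() and ctx_alloc() are hypothetical
stand-ins for struct ethtool_rxfh_context, ALIGN() and the core's
allocation, and ctx_alloc() deliberately omits overflow checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Simplified, hypothetical model of struct ethtool_rxfh_context. */
struct rxfh_ctx_model {
	uint32_t indir_size;	/* number of u32 indirection table entries */
	uint32_t key_size;	/* hash key length, in bytes */
	uint8_t hfunc;
	uint16_t priv_size;	/* driver private area length, in bytes */
	/* private data, then indirection table, then hash key */
	_Alignas(void *) uint8_t data[];
};

/* Driver private data sits at the very start of the data area. */
static void *ctx_priv(struct rxfh_ctx_model *ctx)
{
	return ctx->data;
}

/* The indirection table starts at the first u32 boundary after priv. */
static uint32_t *ctx_indir(struct rxfh_ctx_model *ctx)
{
	assert(ctx != NULL);
	return (uint32_t *)(ctx->data +
			    ALIGN_UP(ctx->priv_size, sizeof(uint32_t)));
}

/* The hash key follows the last indirection table entry. */
static uint8_t *ctx_key(struct rxfh_ctx_model *ctx)
{
	return (uint8_t *)(ctx_indir(ctx) + ctx->indir_size);
}

/* Allocate a zeroed context; no overflow checks, illustration only. */
static struct rxfh_ctx_model *ctx_alloc(uint32_t indir_size, uint32_t key_size,
					uint16_t priv_size)
{
	size_t len = sizeof(struct rxfh_ctx_model) +
		     ALIGN_UP(priv_size, sizeof(uint32_t)) +
		     (size_t)indir_size * sizeof(uint32_t) + key_size;
	struct rxfh_ctx_model *ctx = calloc(1, len);

	if (!ctx)
		return NULL;
	ctx->indir_size = indir_size;
	ctx->key_size = key_size;
	ctx->priv_size = priv_size;
	return ctx;
}
```

With priv_size = 6 the private area is padded to 8 bytes, so in this
model ctx_indir() points at data + 8 and ctx_key() at
data + 8 + 4 * indir_size.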


* [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct edward.cree
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice edward.cree
@ 2023-04-11 18:26 ` edward.cree
  2023-04-13  1:49   ` Jakub Kicinski
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 4/7] net: ethtool: let the core choose RSS context IDs edward.cree
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

Since drivers are still choosing the context IDs, we have to force the
 IDR to use the ID they've chosen rather than picking one ourselves.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
Changes in v2:
* change .get_rxfh_priv_size op into rxfh_priv_size value member (kuba)
* use GFP_KERNEL_ACCOUNT rather than GFP_USER (kuba)
* adjust size calculation to allow for alignment padding from patch #2
---
 include/linux/ethtool.h | 14 +++++++++
 net/ethtool/ioctl.c     | 63 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 7963b06da484..710d6a985347 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -194,6 +194,17 @@ static inline u8 *ethtool_rxfh_context_key(struct ethtool_rxfh_context *ctx)
 	return (u8 *)(ethtool_rxfh_context_indir(ctx) + ctx->indir_size);
 }
 
+static inline size_t ethtool_rxfh_context_size(u32 indir_size, u32 key_size,
+					       u16 priv_size)
+{
+	size_t indir_bytes = array_size(indir_size, sizeof(u32));
+	size_t flex_len;
+
+	flex_len = size_add(size_add(indir_bytes, key_size),
+			    ALIGN(priv_size, sizeof(u32)));
+	return struct_size((struct ethtool_rxfh_context *)0, data, flex_len);
+}
+
 /* declare a link mode bitmap */
 #define __ETHTOOL_DECLARE_LINK_MODE_MASK(name)		\
 	DECLARE_BITMAP(name, __ETHTOOL_LINK_MODE_MASK_NBITS)
@@ -731,6 +742,8 @@ struct ethtool_mm_stats {
  *	will remain unchanged.
  *	Returns a negative error code or zero. An error code must be returned
  *	if at least one unsupported change was requested.
+ * @rxfh_priv_size: size of the driver private data area the core should
+ *	allocate for an RSS context.
  * @get_rxfh_context: Get the contents of the RX flow hash indirection table,
  *	hash key, and/or hash function assiciated to the given rss context.
  *	Returns a negative error code or zero.
@@ -823,6 +836,7 @@ struct ethtool_ops {
 	u32     cap_link_lanes_supported:1;
 	u32	supported_coalesce_params;
 	u32	supported_ring_params;
+	u16	rxfh_priv_size;
 	void	(*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *);
 	int	(*get_regs_len)(struct net_device *);
 	void	(*get_regs)(struct net_device *, struct ethtool_regs *, void *);
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 0effaca4ff9e..9f9f8ba9c0f6 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1248,6 +1248,7 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 {
 	int ret;
 	const struct ethtool_ops *ops = dev->ethtool_ops;
+	struct ethtool_rxfh_context *ctx = NULL;
 	struct ethtool_rxnfc rx_rings;
 	struct ethtool_rxfh rxfh;
 	u32 dev_indir_size = 0, dev_key_size = 0, i;
@@ -1255,7 +1256,7 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 	u8 *hkey = NULL;
 	u8 *rss_config;
 	u32 rss_cfg_offset = offsetof(struct ethtool_rxfh, rss_config[0]);
-	bool delete = false;
+	bool create = false, delete = false;
 
 	if (!ops->get_rxnfc || !ops->set_rxfh)
 		return -EOPNOTSUPP;
@@ -1274,6 +1275,7 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 	/* Most drivers don't handle rss_context, check it's 0 as well */
 	if (rxfh.rss_context && !ops->set_rxfh_context)
 		return -EOPNOTSUPP;
+	create = rxfh.rss_context == ETH_RXFH_CONTEXT_ALLOC;
 
 	/* If either indir, hash key or function is valid, proceed further.
 	 * Must request at least one change: indir size, hash key or function.
@@ -1331,6 +1333,31 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 		}
 	}
 
+	if (create) {
+		if (delete) {
+			ret = -EINVAL;
+			goto out;
+		}
+		ctx = kzalloc(ethtool_rxfh_context_size(dev_indir_size,
+							dev_key_size,
+							ops->rxfh_priv_size),
+			      GFP_KERNEL_ACCOUNT);
+		if (!ctx) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		ctx->indir_size = dev_indir_size;
+		ctx->key_size = dev_key_size;
+		ctx->hfunc = rxfh.hfunc;
+		ctx->priv_size = ops->rxfh_priv_size;
+	} else if (rxfh.rss_context) {
+		ctx = idr_find(&dev->ethtool->rss_ctx, rxfh.rss_context);
+		if (!ctx) {
+			ret = -ENOENT;
+			goto out;
+		}
+	}
+
 	if (rxfh.rss_context)
 		ret = ops->set_rxfh_context(dev, indir, hkey, rxfh.hfunc,
 					    &rxfh.rss_context, delete);
@@ -1350,6 +1377,40 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 		else if (rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE)
 			dev->priv_flags |= IFF_RXFH_CONFIGURED;
 	}
+	/* Update rss_ctx tracking */
+	if (create) {
+		/* Ideally this should happen before calling the driver,
+		 * so that we can fail more cleanly; but we don't have the
+		 * context ID until the driver picks it, so we have to
+		 * wait until after.
+		 */
+		if (WARN_ON(idr_find(&dev->ethtool->rss_ctx, rxfh.rss_context)))
+			/* context ID reused, our tracking is screwed */
+			goto out;
+		/* Allocate the exact ID the driver gave us */
+		WARN_ON(idr_alloc(&dev->ethtool->rss_ctx, ctx, rxfh.rss_context,
+				  rxfh.rss_context + 1, GFP_KERNEL) !=
+			rxfh.rss_context);
+		ctx->indir_no_change = rxfh.indir_size == ETH_RXFH_INDIR_NO_CHANGE;
+		ctx->key_no_change = !rxfh.key_size;
+	}
+	if (delete) {
+		WARN_ON(idr_remove(&dev->ethtool->rss_ctx, rxfh.rss_context) != ctx);
+		kfree(ctx);
+	} else if (ctx) {
+		if (indir) {
+			for (i = 0; i < dev_indir_size; i++)
+				ethtool_rxfh_context_indir(ctx)[i] = indir[i];
+			ctx->indir_no_change = 0;
+		}
+		if (hkey) {
+			memcpy(ethtool_rxfh_context_key(ctx), hkey,
+			       dev_key_size);
+			ctx->key_no_change = 0;
+		}
+		if (rxfh.hfunc != ETH_RSS_HASH_NO_CHANGE)
+			ctx->hfunc = rxfh.hfunc;
+	}
 
 out:
 	kfree(rss_config);
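
The size calculation in ethtool_rxfh_context_size() depends on
array_size(), size_add() and struct_size() saturating at SIZE_MAX, so
that an overflowing request makes the allocation fail rather than
under-allocate.  The userspace sketch below models that arithmetic,
including the u32 padding of the private area.  The names sat_add(),
sat_mul(), ctx_size() and struct ctx_hdr_model are hypothetical
stand-ins, not the kernel helpers themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical header mirroring the fixed part of the context struct. */
struct ctx_hdr_model {
	uint32_t indir_size;
	uint32_t key_size;
	uint8_t hfunc;
	uint16_t priv_size;
	_Alignas(void *) uint8_t data[];
};

/* Addition that saturates at SIZE_MAX instead of wrapping. */
static size_t sat_add(size_t a, size_t b)
{
	return (a > SIZE_MAX - b) ? SIZE_MAX : a + b;
}

/* Multiplication that saturates at SIZE_MAX instead of wrapping. */
static size_t sat_mul(size_t a, size_t b)
{
	return (b != 0 && a > SIZE_MAX / b) ? SIZE_MAX : a * b;
}

/* Total bytes needed: header + padded priv + indir table + key. */
static size_t ctx_size(uint32_t indir_size, uint32_t key_size,
		       uint16_t priv_size)
{
	size_t indir_bytes = sat_mul(indir_size, sizeof(uint32_t));
	size_t flex_len;

	flex_len = sat_add(sat_add(indir_bytes, key_size),
			   ALIGN_UP(priv_size, sizeof(uint32_t)));
	return sat_add(sizeof(struct ctx_hdr_model), flex_len);
}
```

For example, 128 indirection entries, a 40-byte key and 6 bytes of
private data need the header plus 512 + 40 + 8 bytes in this model.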


* [RFC PATCH v2 net-next 4/7] net: ethtool: let the core choose RSS context IDs
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
                   ` (2 preceding siblings ...)
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR edward.cree
@ 2023-04-11 18:26 ` edward.cree
  2023-04-13  1:53   ` Jakub Kicinski
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 5/7] net: ethtool: add an extack parameter to new rxfh_context APIs edward.cree
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

Add a new API to create/modify/remove RSS contexts, that passes in the
 newly-chosen context ID (not as a pointer) rather than leaving the
 driver to choose it on create.  Also pass in the ctx, allowing drivers
 to easily use its private data area to store their hardware-specific
 state.
Keep the existing .set_rxfh_context API for now as a fallback, but
 deprecate it.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
Changes in v2:
* split the new API into create/modify/remove ops (kuba).  Also means we
  don't need to rename the old API and touch legacy drivers
* squash patch "net: ethtool: pass ctx_priv and create into .set_rxfh_context"
---
 include/linux/ethtool.h | 40 ++++++++++++++++++++++--
 net/core/dev.c          | 11 +++++--
 net/ethtool/ioctl.c     | 67 ++++++++++++++++++++++++++++++++---------
 3 files changed, 97 insertions(+), 21 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 710d6a985347..12ed3b79be68 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -747,10 +747,33 @@ struct ethtool_mm_stats {
  * @get_rxfh_context: Get the contents of the RX flow hash indirection table,
  *	hash key, and/or hash function assiciated to the given rss context.
  *	Returns a negative error code or zero.
- * @set_rxfh_context: Create, remove and configure RSS contexts. Allows setting
+ * @create_rxfh_context: Create a new RSS context with the specified RX flow
+ *	hash indirection table, hash key, and hash function.
+ *	Arguments which are set to %NULL or zero will be populated to
+ *	appropriate defaults by the driver.
+ *	The &struct ethtool_rxfh_context for this context is passed in @ctx;
+ *	note that the indir table, hkey and hfunc are not yet populated as
+ *	of this call.  The driver does not need to update these; the core
+ *	will do so if this op succeeds.
+ *	If the driver provides this method, it must also provide
+ *	@modify_rxfh_context and @remove_rxfh_context.
+ *	Returns a negative error code or zero.
+ * @modify_rxfh_context: Reconfigure the specified RSS context.  Allows setting
  *	the contents of the RX flow hash indirection table, hash key, and/or
- *	hash function associated to the given context. Arguments which are set
- *	to %NULL or zero will remain unchanged.
+ *	hash function associated with the given context.
+ *	Arguments which are set to %NULL or zero will remain unchanged.
+ *	The &struct ethtool_rxfh_context for this context is passed in @ctx;
+ *	note that it will still contain the *old* settings.  The driver does
+ *	not need to update these; the core will do so if this op succeeds.
+ *	Returns a negative error code or zero. An error code must be returned
+ *	if at least one unsupported change was requested.
+ * @remove_rxfh_context: Remove the specified RSS context.
+ *	The &struct ethtool_rxfh_context for this context is passed in @ctx.
+ *	Returns a negative error code or zero.
+ * @set_rxfh_context: Deprecated API to create, remove and configure RSS
+ *	contexts. Allows setting the contents of the RX flow hash indirection
+ *	table, hash key, and/or hash function associated to the given context.
+ *	Arguments which are set to %NULL or zero will remain unchanged.
  *	Returns a negative error code or zero. An error code must be returned
  *	if at least one unsupported change was requested.
  * @get_channels: Get number of channels.
@@ -900,6 +923,17 @@ struct ethtool_ops {
 			    const u8 *key, const u8 hfunc);
 	int	(*get_rxfh_context)(struct net_device *, u32 *indir, u8 *key,
 				    u8 *hfunc, u32 rss_context);
+	int	(*create_rxfh_context)(struct net_device *,
+				       struct ethtool_rxfh_context *ctx,
+				       const u32 *indir, const u8 *key,
+				       const u8 hfunc, u32 rss_context);
+	int	(*modify_rxfh_context)(struct net_device *,
+				       struct ethtool_rxfh_context *ctx,
+				       const u32 *indir, const u8 *key,
+				       const u8 hfunc, u32 rss_context);
+	int	(*remove_rxfh_context)(struct net_device *,
+				       struct ethtool_rxfh_context *ctx,
+				       u32 rss_context);
 	int	(*set_rxfh_context)(struct net_device *, const u32 *indir,
 				    const u8 *key, const u8 hfunc,
 				    u32 *rss_context, bool delete);
diff --git a/net/core/dev.c b/net/core/dev.c
index c9ed9f6ea695..4feb58b0beb3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10789,15 +10789,20 @@ static void netdev_rss_contexts_free(struct net_device *dev)
 	struct ethtool_rxfh_context *ctx;
 	u32 context;
 
-	if (!dev->ethtool_ops->set_rxfh_context)
+	if (!dev->ethtool_ops->create_rxfh_context &&
+	    !dev->ethtool_ops->set_rxfh_context)
 		return;
 	idr_for_each_entry(&dev->ethtool->rss_ctx, ctx, context) {
 		u32 *indir = ethtool_rxfh_context_indir(ctx);
 		u8 *key = ethtool_rxfh_context_key(ctx);
 
 		idr_remove(&dev->ethtool->rss_ctx, context);
-		dev->ethtool_ops->set_rxfh_context(dev, indir, key, ctx->hfunc,
-						   &context, true);
+		if (dev->ethtool_ops->create_rxfh_context)
+			dev->ethtool_ops->remove_rxfh_context(dev, ctx, context);
+		else
+			dev->ethtool_ops->set_rxfh_context(dev, indir, key,
+							   ctx->hfunc,
+							   &context, true);
 		kfree(ctx);
 	}
 }
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 9f9f8ba9c0f6..20154d6159a1 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1273,7 +1273,8 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 	if (rxfh.rsvd8[0] || rxfh.rsvd8[1] || rxfh.rsvd8[2] || rxfh.rsvd32)
 		return -EINVAL;
 	/* Most drivers don't handle rss_context, check it's 0 as well */
-	if (rxfh.rss_context && !ops->set_rxfh_context)
+	if (rxfh.rss_context && !(ops->create_rxfh_context ||
+				  ops->set_rxfh_context))
 		return -EOPNOTSUPP;
 	create = rxfh.rss_context == ETH_RXFH_CONTEXT_ALLOC;
 
@@ -1348,8 +1349,28 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 		}
 		ctx->indir_size = dev_indir_size;
 		ctx->key_size = dev_key_size;
-		ctx->hfunc = rxfh.hfunc;
 		ctx->priv_size = ops->rxfh_priv_size;
+		/* Initialise to an empty context */
+		ctx->indir_no_change = ctx->key_no_change = 1;
+		ctx->hfunc = ETH_RSS_HASH_NO_CHANGE;
+		if (ops->create_rxfh_context) {
+			int ctx_id;
+
+			/* driver uses new API, core allocates ID */
+			/* if rss_ctx_max_id is not specified (left as 0), it is
+			 * treated as INT_MAX + 1 by idr_alloc
+			 */
+			ctx_id = idr_alloc(&dev->ethtool->rss_ctx, ctx, 1,
+					   dev->ethtool->rss_ctx_max_id,
+					   GFP_KERNEL_ACCOUNT);
+			/* 0 is not allowed, so treat it like an error here */
+			if (ctx_id <= 0) {
+				kfree(ctx);
+				ret = -ENOMEM;
+				goto out;
+			}
+			rxfh.rss_context = ctx_id;
+		}
 	} else if (rxfh.rss_context) {
 		ctx = idr_find(&dev->ethtool->rss_ctx, rxfh.rss_context);
 		if (!ctx) {
@@ -1358,13 +1379,35 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 		}
 	}
 
-	if (rxfh.rss_context)
-		ret = ops->set_rxfh_context(dev, indir, hkey, rxfh.hfunc,
-					    &rxfh.rss_context, delete);
-	else
+	if (rxfh.rss_context) {
+		if (ops->create_rxfh_context) {
+			if (create)
+				ret = ops->create_rxfh_context(dev, ctx, indir,
+							       hkey, rxfh.hfunc,
+							       rxfh.rss_context);
+			else if (delete)
+				ret = ops->remove_rxfh_context(dev, ctx,
+							       rxfh.rss_context);
+			else
+				ret = ops->modify_rxfh_context(dev, ctx, indir,
+							       hkey, rxfh.hfunc,
+							       rxfh.rss_context);
+		} else {
+			ret = ops->set_rxfh_context(dev, indir, hkey,
+						    rxfh.hfunc,
+						    &rxfh.rss_context, delete);
+		}
+	} else {
 		ret = ops->set_rxfh(dev, indir, hkey, rxfh.hfunc);
-	if (ret)
+	}
+	if (ret) {
+		if (create && ops->create_rxfh_context) {
+			/* failed to create, discard our new tracking entry */
+			idr_remove(&dev->ethtool->rss_ctx, rxfh.rss_context);
+			kfree(ctx);
+		}
 		goto out;
+	}
 
 	if (copy_to_user(useraddr + offsetof(struct ethtool_rxfh, rss_context),
 			 &rxfh.rss_context, sizeof(rxfh.rss_context)))
@@ -1378,12 +1421,8 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 			dev->priv_flags |= IFF_RXFH_CONFIGURED;
 	}
 	/* Update rss_ctx tracking */
-	if (create) {
-		/* Ideally this should happen before calling the driver,
-		 * so that we can fail more cleanly; but we don't have the
-		 * context ID until the driver picks it, so we have to
-		 * wait until after.
-		 */
+	if (create && !ops->create_rxfh_context) {
+		/* driver uses old API, it chose context ID */
 		if (WARN_ON(idr_find(&dev->ethtool->rss_ctx, rxfh.rss_context)))
 			/* context ID reused, our tracking is screwed */
 			goto out;
@@ -1391,8 +1430,6 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 		WARN_ON(idr_alloc(&dev->ethtool->rss_ctx, ctx, rxfh.rss_context,
 				  rxfh.rss_context + 1, GFP_KERNEL) !=
 			rxfh.rss_context);
-		ctx->indir_no_change = rxfh.indir_size == ETH_RXFH_INDIR_NO_CHANGE;
-		ctx->key_no_change = !rxfh.key_size;
 	}
 	if (delete) {
 		WARN_ON(idr_remove(&dev->ethtool->rss_ctx, rxfh.rss_context) != ctx);
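
The create path in this patch reserves an ID in the core before calling
the driver, and drops the reservation again if the driver fails.  A
minimal userspace sketch of that ordering follows.  Everything in it is
hypothetical: a fixed table stands in for the IDR, MAX_ID for
rss_ctx_max_id, and struct dev_model / core_create() for the netdevice
and the ioctl path.  Unlike the patch, which maps any ID allocation
failure to -ENOMEM, the sketch returns the allocator's own error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ID 8	/* hypothetical exclusive upper bound on context IDs */

struct ctx_model {
	int hw_handle;	/* stands in for driver private state */
};

struct dev_model {
	struct ctx_model *table[MAX_ID];	/* slot 0: default context */
	int (*create)(struct dev_model *dev, struct ctx_model *ctx,
		      unsigned int id);
};

/* Reserve the lowest free ID >= 1, or return -ENOSPC if none is left. */
static int id_alloc(struct dev_model *dev, struct ctx_model *ctx)
{
	for (unsigned int id = 1; id < MAX_ID; id++) {
		if (!dev->table[id]) {
			dev->table[id] = ctx;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/* Core side: allocate, reserve an ID, call the driver, undo on failure. */
static int core_create(struct dev_model *dev, unsigned int *id_out)
{
	struct ctx_model *ctx = calloc(1, sizeof(*ctx));
	int id, ret;

	if (!ctx)
		return -ENOMEM;
	id = id_alloc(dev, ctx);
	if (id < 0) {
		free(ctx);
		return id;
	}
	ret = dev->create(dev, ctx, (unsigned int)id);
	if (ret) {
		/* driver refused: discard the tracking entry */
		dev->table[id] = NULL;
		free(ctx);
		return ret;
	}
	*id_out = (unsigned int)id;
	return 0;
}

/* Example driver callbacks: one that succeeds, one that always fails. */
static int drv_create_ok(struct dev_model *dev, struct ctx_model *ctx,
			 unsigned int id)
{
	(void)dev;
	ctx->hw_handle = 100 + (int)id;
	return 0;
}

static int drv_create_fail(struct dev_model *dev, struct ctx_model *ctx,
			   unsigned int id)
{
	(void)dev;
	(void)ctx;
	(void)id;
	return -EIO;
}
```

The point of the ordering is that the driver never has to pick an ID,
and a failed create leaves no entry behind in the table.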


* [RFC PATCH v2 net-next 5/7] net: ethtool: add an extack parameter to new rxfh_context APIs
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
                   ` (3 preceding siblings ...)
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 4/7] net: ethtool: let the core choose RSS context IDs edward.cree
@ 2023-04-11 18:26 ` edward.cree
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts edward.cree
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 7/7] sfc: use new rxfh_context API edward.cree
  6 siblings, 0 replies; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

The new extack parameter is currently always passed as NULL, but it will
 allow drivers to report errors back once ethnl support for these ops is
 added.
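
As a rough userspace illustration of why passing NULL is safe today, the
 sketch below models an extack-style error channel.  struct demo_ext_ack,
 DEMO_SET_ERR_MSG() and demo_create_ctx() are hypothetical stand-ins, not
 kernel APIs; the only assumption carried over from the kernel is that the
 message-setting macro tolerates a NULL extack, as NL_SET_ERR_MSG() does.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct netlink_ext_ack. */
struct demo_ext_ack {
	const char *msg;
};

/* Record an error string, but only if the caller supplied an extack. */
#define DEMO_SET_ERR_MSG(extack, text)			\
	do {						\
		if (extack)				\
			(extack)->msg = (text);		\
	} while (0)

/* Arbitrary value standing in for the Toeplitz hash function ID. */
#define DEMO_HASH_TOP 1

/* Hypothetical driver callback: rejects anything except Toeplitz. */
static int demo_create_ctx(unsigned char hfunc, struct demo_ext_ack *extack)
{
	if (hfunc != DEMO_HASH_TOP) {
		DEMO_SET_ERR_MSG(extack, "Only Toeplitz hash is supported");
		return -EOPNOTSUPP;
	}
	return 0;
}
```

A caller on the ioctl path would pass NULL and see only the errno; a
 future netlink caller could pass a real object and also get the string.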

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
Changes in v2: New patch.
---
 include/linux/ethtool.h | 9 ++++++---
 net/core/dev.c          | 3 ++-
 net/ethtool/ioctl.c     | 9 ++++++---
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 12ed3b79be68..724da9234cf1 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -926,14 +926,17 @@ struct ethtool_ops {
 	int	(*create_rxfh_context)(struct net_device *,
 				       struct ethtool_rxfh_context *ctx,
 				       const u32 *indir, const u8 *key,
-				       const u8 hfunc, u32 rss_context);
+				       const u8 hfunc, u32 rss_context,
+				       struct netlink_ext_ack *extack);
 	int	(*modify_rxfh_context)(struct net_device *,
 				       struct ethtool_rxfh_context *ctx,
 				       const u32 *indir, const u8 *key,
-				       const u8 hfunc, u32 rss_context);
+				       const u8 hfunc, u32 rss_context,
+				       struct netlink_ext_ack *extack);
 	int	(*remove_rxfh_context)(struct net_device *,
 				       struct ethtool_rxfh_context *ctx,
-				       u32 rss_context);
+				       u32 rss_context,
+				       struct netlink_ext_ack *extack);
 	int	(*set_rxfh_context)(struct net_device *, const u32 *indir,
 				    const u8 *key, const u8 hfunc,
 				    u32 *rss_context, bool delete);
diff --git a/net/core/dev.c b/net/core/dev.c
index 4feb58b0beb3..44668386f376 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10798,7 +10798,8 @@ static void netdev_rss_contexts_free(struct net_device *dev)
 
 		idr_remove(&dev->ethtool->rss_ctx, context);
 		if (dev->ethtool_ops->create_rxfh_context)
-			dev->ethtool_ops->remove_rxfh_context(dev, ctx, context);
+			dev->ethtool_ops->remove_rxfh_context(dev, ctx, context,
+							      NULL);
 		else
 			dev->ethtool_ops->set_rxfh_context(dev, indir, key,
 							   ctx->hfunc,
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 20154d6159a1..abd1cf50e681 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1384,14 +1384,17 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 			if (create)
 				ret = ops->create_rxfh_context(dev, ctx, indir,
 							       hkey, rxfh.hfunc,
-							       rxfh.rss_context);
+							       rxfh.rss_context,
+							       NULL);
 			else if (delete)
 				ret = ops->remove_rxfh_context(dev, ctx,
-							       rxfh.rss_context);
+							       rxfh.rss_context,
+							       NULL);
 			else
 				ret = ops->modify_rxfh_context(dev, ctx, indir,
 							       hkey, rxfh.hfunc,
-							       rxfh.rss_context);
+							       rxfh.rss_context,
+							       NULL);
 		} else {
 			ret = ops->set_rxfh_context(dev, indir, hkey,
 						    rxfh.hfunc,


* [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
                   ` (4 preceding siblings ...)
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 5/7] net: ethtool: add an extack parameter to new rxfh_context APIs edward.cree
@ 2023-04-11 18:26 ` edward.cree
  2023-04-11 20:40   ` Andrew Lunn
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 7/7] sfc: use new rxfh_context API edward.cree
  6 siblings, 1 reply; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

While this is not needed to serialise the ethtool entry points (which
 are all under RTNL), drivers may have cause to asynchronously access
 dev->ethtool->rss_ctx; taking dev->ethtool->rss_lock allows them to
 do this safely without needing to take the RTNL.
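
The locking shape this patch gives ethtool_set_rxfh() (take the lock only
 for a non-zero context, remember that in a flag, release on the single
 exit path) can be sketched in userspace as below.  This is a toy model:
 struct demo_lock merely counts acquisitions and stands in for a real
 mutex, and demo_set_rxfh() is a hypothetical reduction of the real
 function, not a copy of it.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy lock that only counts acquisitions; stands in for a real mutex. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	l->held++;
}

static void demo_lock_release(struct demo_lock *l)
{
	l->held--;
}

/* Hypothetical model of the conditional lock/unlock in ethtool_set_rxfh(). */
static int demo_set_rxfh(struct demo_lock *rss_lock, unsigned int rss_context,
			 bool create, bool del)
{
	bool locked = false; /* rss_lock taken */
	int ret = 0;

	/* Context 0 is the default context and needs no rss_lock. */
	if (rss_context) {
		demo_lock_acquire(rss_lock);
		locked = true;
	}
	if (create && del) {
		/* Creating and deleting in one request makes no sense. */
		ret = -EINVAL;
		goto out;
	}
	/* ... driver calls and tracking updates would happen here ... */
out:
	if (locked)
		demo_lock_release(rss_lock);
	return ret;
}
```

Funnelling every return through the out label is what keeps the lock
 balanced on both the error and success paths.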

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
 include/linux/ethtool.h | 3 +++
 net/core/dev.c          | 5 +++++
 net/ethtool/ioctl.c     | 7 +++++++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 724da9234cf1..e8e88d5900d3 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -1026,11 +1026,14 @@ int ethtool_virtdev_set_link_ksettings(struct net_device *dev,
  * struct ethtool_netdev_state - per-netdevice state for ethtool features
  * @rss_ctx_max_id:	maximum (exclusive) supported RSS context ID
  * @rss_ctx:		IDR storing custom RSS context state
+ * @rss_lock:		Protects entries in @rss_ctx.  May be taken from
+ *			within RTNL.
  * @wol_enabled:	Wake-on-LAN is enabled
  */
 struct ethtool_netdev_state {
 	u32			rss_ctx_max_id;
 	struct idr		rss_ctx;
+	struct mutex		rss_lock;
 	unsigned		wol_enabled:1;
 };
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 44668386f376..60c844b372e3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9987,6 +9987,7 @@ int register_netdevice(struct net_device *dev)
 	idr_init_base(&dev->ethtool->rss_ctx, 1);
 
 	spin_lock_init(&dev->addr_list_lock);
+	mutex_init(&dev->ethtool->rss_lock);
 	netdev_set_addr_lockdep_class(dev);
 
 	ret = dev_get_valid_name(net, dev, dev->name);
@@ -10792,6 +10793,7 @@ static void netdev_rss_contexts_free(struct net_device *dev)
 	if (!dev->ethtool_ops->create_rxfh_context &&
 	    !dev->ethtool_ops->set_rxfh_context)
 		return;
+	mutex_lock(&dev->ethtool->rss_lock);
 	idr_for_each_entry(&dev->ethtool->rss_ctx, ctx, context) {
 		u32 *indir = ethtool_rxfh_context_indir(ctx);
 		u8 *key = ethtool_rxfh_context_key(ctx);
@@ -10806,6 +10808,7 @@ static void netdev_rss_contexts_free(struct net_device *dev)
 							   &context, true);
 		kfree(ctx);
 	}
+	mutex_unlock(&dev->ethtool->rss_lock);
 }
 
 /**
@@ -10919,6 +10922,8 @@ void unregister_netdevice_many_notify(struct list_head *head,
 		if (dev->netdev_ops->ndo_uninit)
 			dev->netdev_ops->ndo_uninit(dev);
 
+		mutex_destroy(&dev->ethtool->rss_lock);
+
 		if (skb)
 			rtmsg_ifinfo_send(skb, dev, GFP_KERNEL, portid, nlh);
 
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index abd1cf50e681..8b2e90ba03a1 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1257,6 +1257,7 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 	u8 *rss_config;
 	u32 rss_cfg_offset = offsetof(struct ethtool_rxfh, rss_config[0]);
 	bool create = false, delete = false;
+	bool locked = false; /* dev->ethtool->rss_lock taken */
 
 	if (!ops->get_rxnfc || !ops->set_rxfh)
 		return -EOPNOTSUPP;
@@ -1334,6 +1335,10 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 		}
 	}
 
+	if (rxfh.rss_context) {
+		mutex_lock(&dev->ethtool->rss_lock);
+		locked = true;
+	}
 	if (create) {
 		if (delete) {
 			ret = -EINVAL;
@@ -1453,6 +1458,8 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
 	}
 
 out:
+	if (locked)
+		mutex_unlock(&dev->ethtool->rss_lock);
 	kfree(rss_config);
 	return ret;
 }


* [RFC PATCH v2 net-next 7/7] sfc: use new rxfh_context API
  2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
                   ` (5 preceding siblings ...)
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts edward.cree
@ 2023-04-11 18:26 ` edward.cree
  6 siblings, 0 replies; 24+ messages in thread
From: edward.cree @ 2023-04-11 18:26 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx, sudheer.mogilappagari

From: Edward Cree <ecree.xilinx@gmail.com>

The core is now responsible for allocating IDs and a memory region for
 us to store our state (struct efx_rss_context_priv), so we no longer
 need efx_alloc_rss_context_entry() and friends.
Since the contexts are now maintained by the core, use the core's lock
 (net_dev->ethtool->rss_lock), rather than our own mutex (efx->rss_lock),
 to serialise access against changes; and remove the now-unused
 efx->rss_lock from struct efx_nic.
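
To illustrate what "a memory region for us to store our state" could look
 like, here is a minimal userspace sketch of a context object with
 driver-private state, an indirection table and a hash key packed into one
 allocation.  Every name in it (demo_rxfh_context, demo_priv, the demo_*
 helpers) is hypothetical, and the layout (priv, then indir, then key) is
 an assumption made for the example rather than the layout defined earlier
 in this series.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: header, then priv, then indir table, then hash key. */
struct demo_rxfh_context {
	uint32_t indir_size;	/* entries in the indirection table */
	uint32_t key_size;	/* bytes in the hash key */
	uint32_t priv_size;	/* bytes of driver-private state */
	uint32_t data[];
};

/* Hypothetical driver-private state, loosely modelled on the sfc one. */
struct demo_priv {
	uint32_t context_id;
	uint32_t rx_hash_udp_4tuple;
};

#define DEMO_CONTEXT_INVALID 0xffffffffu

/* Round a byte count up to a multiple of 4 so the table stays aligned. */
static size_t demo_round_up4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

static void *demo_context_priv(struct demo_rxfh_context *ctx)
{
	return ctx->data;
}

static uint32_t *demo_context_indir(struct demo_rxfh_context *ctx)
{
	return ctx->data + demo_round_up4(ctx->priv_size) / 4;
}

static uint8_t *demo_context_key(struct demo_rxfh_context *ctx)
{
	return (uint8_t *)(demo_context_indir(ctx) + ctx->indir_size);
}

/* "Core" side: one zeroed allocation sized for all three regions. */
static struct demo_rxfh_context *demo_context_alloc(uint32_t indir_size,
						    uint32_t key_size,
						    uint32_t priv_size)
{
	size_t bytes = sizeof(struct demo_rxfh_context) +
		       demo_round_up4(priv_size) +
		       (size_t)indir_size * sizeof(uint32_t) + key_size;
	struct demo_rxfh_context *ctx = calloc(1, bytes);

	if (!ctx)
		return NULL;
	ctx->indir_size = indir_size;
	ctx->key_size = key_size;
	ctx->priv_size = priv_size;
	return ctx;
}

/* "Driver" side: initialise only the private region it was handed. */
static void demo_driver_create(struct demo_rxfh_context *ctx)
{
	struct demo_priv *priv = demo_context_priv(ctx);

	priv->context_id = DEMO_CONTEXT_INVALID;
	priv->rx_hash_udp_4tuple = 0;
}
```

With this split the driver never allocates or frees the context itself;
 it only fills in the region the accessor returns.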

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
Changes in v2:
* actually hook up rxfh_priv_size in ethtool ops structs
* port to the updated API
---
 drivers/net/ethernet/sfc/ef10.c           |   2 +-
 drivers/net/ethernet/sfc/ef100_ethtool.c  |   5 +-
 drivers/net/ethernet/sfc/efx.c            |   2 +-
 drivers/net/ethernet/sfc/efx.h            |   2 +-
 drivers/net/ethernet/sfc/efx_common.c     |  10 +-
 drivers/net/ethernet/sfc/ethtool.c        |   5 +-
 drivers/net/ethernet/sfc/ethtool_common.c | 147 +++++++++++++---------
 drivers/net/ethernet/sfc/ethtool_common.h |  18 ++-
 drivers/net/ethernet/sfc/mcdi_filters.c   | 133 ++++++++++----------
 drivers/net/ethernet/sfc/mcdi_filters.h   |   8 +-
 drivers/net/ethernet/sfc/net_driver.h     |  28 ++---
 drivers/net/ethernet/sfc/rx_common.c      |  64 ++--------
 drivers/net/ethernet/sfc/rx_common.h      |   8 +-
 13 files changed, 213 insertions(+), 219 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index d30459dbfe8f..6f12fcee8247 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1394,7 +1394,7 @@ static void efx_ef10_table_reset_mc_allocations(struct efx_nic *efx)
 	efx_mcdi_filter_table_reset_mc_allocations(efx);
 	nic_data->must_restore_piobufs = true;
 	efx_ef10_forget_old_piobufs(efx);
-	efx->rss_context.context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
+	efx->rss_context.priv.context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
 
 	/* Driver-created vswitches and vports must be re-created */
 	nic_data->must_probe_vswitching = true;
diff --git a/drivers/net/ethernet/sfc/ef100_ethtool.c b/drivers/net/ethernet/sfc/ef100_ethtool.c
index 702abbe59b76..c5f82eb0e5b4 100644
--- a/drivers/net/ethernet/sfc/ef100_ethtool.c
+++ b/drivers/net/ethernet/sfc/ef100_ethtool.c
@@ -58,10 +58,13 @@ const struct ethtool_ops ef100_ethtool_ops = {
 
 	.get_rxfh_indir_size	= efx_ethtool_get_rxfh_indir_size,
 	.get_rxfh_key_size	= efx_ethtool_get_rxfh_key_size,
+	.rxfh_priv_size		= sizeof(struct efx_rss_context_priv),
 	.get_rxfh		= efx_ethtool_get_rxfh,
 	.set_rxfh		= efx_ethtool_set_rxfh,
 	.get_rxfh_context	= efx_ethtool_get_rxfh_context,
-	.set_rxfh_context	= efx_ethtool_set_rxfh_context,
+	.create_rxfh_context	= efx_ethtool_create_rxfh_context,
+	.modify_rxfh_context	= efx_ethtool_modify_rxfh_context,
+	.remove_rxfh_context	= efx_ethtool_remove_rxfh_context,
 
 	.get_module_info	= efx_ethtool_get_module_info,
 	.get_module_eeprom	= efx_ethtool_get_module_eeprom,
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 746fd9164e30..1b2c281c1cc1 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -298,7 +298,7 @@ static int efx_probe_nic(struct efx_nic *efx)
 	if (efx->n_channels > 1)
 		netdev_rss_key_fill(efx->rss_context.rx_hash_key,
 				    sizeof(efx->rss_context.rx_hash_key));
-	efx_set_default_rx_indir_table(efx, &efx->rss_context);
+	efx_set_default_rx_indir_table(efx, efx->rss_context.rx_indir_table);
 
 	/* Initialise the interrupt moderation settings */
 	efx->irq_mod_step_us = DIV_ROUND_UP(efx->timer_quantum_ns, 1000);
diff --git a/drivers/net/ethernet/sfc/efx.h b/drivers/net/ethernet/sfc/efx.h
index 4239c7ece123..a077f648bbde 100644
--- a/drivers/net/ethernet/sfc/efx.h
+++ b/drivers/net/ethernet/sfc/efx.h
@@ -160,7 +160,7 @@ static inline s32 efx_filter_get_rx_ids(struct efx_nic *efx,
 }
 
 /* RSS contexts */
-static inline bool efx_rss_active(struct efx_rss_context *ctx)
+static inline bool efx_rss_active(struct efx_rss_context_priv *ctx)
 {
 	return ctx->context_id != EFX_MCDI_RSS_CONTEXT_INVALID;
 }
diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c
index cc30524c2fe4..23e3778716b2 100644
--- a/drivers/net/ethernet/sfc/efx_common.c
+++ b/drivers/net/ethernet/sfc/efx_common.c
@@ -717,7 +717,7 @@ void efx_reset_down(struct efx_nic *efx, enum reset_type method)
 
 	mutex_lock(&efx->mac_lock);
 	down_write(&efx->filter_sem);
-	mutex_lock(&efx->rss_lock);
+	mutex_lock(&efx->net_dev->ethtool->rss_lock);
 	efx->type->fini(efx);
 }
 
@@ -780,7 +780,7 @@ int efx_reset_up(struct efx_nic *efx, enum reset_type method, bool ok)
 
 	if (efx->type->rx_restore_rss_contexts)
 		efx->type->rx_restore_rss_contexts(efx);
-	mutex_unlock(&efx->rss_lock);
+	mutex_unlock(&efx->net_dev->ethtool->rss_lock);
 	efx->type->filter_table_restore(efx);
 	up_write(&efx->filter_sem);
 	if (efx->type->sriov_reset)
@@ -798,7 +798,7 @@ int efx_reset_up(struct efx_nic *efx, enum reset_type method, bool ok)
 fail:
 	efx->port_initialized = false;
 
-	mutex_unlock(&efx->rss_lock);
+	mutex_unlock(&efx->net_dev->ethtool->rss_lock);
 	up_write(&efx->filter_sem);
 	mutex_unlock(&efx->mac_lock);
 
@@ -1005,9 +1005,7 @@ int efx_init_struct(struct efx_nic *efx, struct pci_dev *pci_dev)
 		efx->type->rx_hash_offset - efx->type->rx_prefix_size;
 	efx->rx_packet_ts_offset =
 		efx->type->rx_ts_offset - efx->type->rx_prefix_size;
-	INIT_LIST_HEAD(&efx->rss_context.list);
-	efx->rss_context.context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
-	mutex_init(&efx->rss_lock);
+	efx->rss_context.priv.context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
 	efx->vport_id = EVB_PORT_ID_ASSIGNED;
 	spin_lock_init(&efx->stats_lock);
 	efx->vi_stride = EFX_DEFAULT_VI_STRIDE;
diff --git a/drivers/net/ethernet/sfc/ethtool.c b/drivers/net/ethernet/sfc/ethtool.c
index 364323599f7b..f5fb7464e025 100644
--- a/drivers/net/ethernet/sfc/ethtool.c
+++ b/drivers/net/ethernet/sfc/ethtool.c
@@ -267,10 +267,13 @@ const struct ethtool_ops efx_ethtool_ops = {
 	.set_rxnfc		= efx_ethtool_set_rxnfc,
 	.get_rxfh_indir_size	= efx_ethtool_get_rxfh_indir_size,
 	.get_rxfh_key_size	= efx_ethtool_get_rxfh_key_size,
+	.rxfh_priv_size		= sizeof(struct efx_rss_context_priv),
 	.get_rxfh		= efx_ethtool_get_rxfh,
 	.set_rxfh		= efx_ethtool_set_rxfh,
 	.get_rxfh_context	= efx_ethtool_get_rxfh_context,
-	.set_rxfh_context	= efx_ethtool_set_rxfh_context,
+	.create_rxfh_context	= efx_ethtool_create_rxfh_context,
+	.modify_rxfh_context	= efx_ethtool_modify_rxfh_context,
+	.remove_rxfh_context	= efx_ethtool_remove_rxfh_context,
 	.get_ts_info		= efx_ethtool_get_ts_info,
 	.get_module_info	= efx_ethtool_get_module_info,
 	.get_module_eeprom	= efx_ethtool_get_module_eeprom,
diff --git a/drivers/net/ethernet/sfc/ethtool_common.c b/drivers/net/ethernet/sfc/ethtool_common.c
index a8cbceeb301b..7cd01012152e 100644
--- a/drivers/net/ethernet/sfc/ethtool_common.c
+++ b/drivers/net/ethernet/sfc/ethtool_common.c
@@ -820,10 +820,10 @@ int efx_ethtool_get_rxnfc(struct net_device *net_dev,
 		return 0;
 
 	case ETHTOOL_GRXFH: {
-		struct efx_rss_context *ctx = &efx->rss_context;
+		struct efx_rss_context_priv *ctx = &efx->rss_context.priv;
 		__u64 data;
 
-		mutex_lock(&efx->rss_lock);
+		mutex_lock(&net_dev->ethtool->rss_lock);
 		if (info->flow_type & FLOW_RSS && info->rss_context) {
 			ctx = efx_find_rss_context_entry(efx, info->rss_context);
 			if (!ctx) {
@@ -864,7 +864,7 @@ int efx_ethtool_get_rxnfc(struct net_device *net_dev,
 out_setdata_unlock:
 		info->data = data;
 out_unlock:
-		mutex_unlock(&efx->rss_lock);
+		mutex_unlock(&net_dev->ethtool->rss_lock);
 		return rc;
 	}
 
@@ -1207,96 +1207,121 @@ int efx_ethtool_get_rxfh_context(struct net_device *net_dev, u32 *indir,
 				 u8 *key, u8 *hfunc, u32 rss_context)
 {
 	struct efx_nic *efx = efx_netdev_priv(net_dev);
-	struct efx_rss_context *ctx;
+	struct efx_rss_context_priv *ctx_priv;
+	struct efx_rss_context ctx;
 	int rc = 0;
 
 	if (!efx->type->rx_pull_rss_context_config)
 		return -EOPNOTSUPP;
 
-	mutex_lock(&efx->rss_lock);
-	ctx = efx_find_rss_context_entry(efx, rss_context);
-	if (!ctx) {
+	mutex_lock(&net_dev->ethtool->rss_lock);
+	ctx_priv = efx_find_rss_context_entry(efx, rss_context);
+	if (!ctx_priv) {
 		rc = -ENOENT;
 		goto out_unlock;
 	}
-	rc = efx->type->rx_pull_rss_context_config(efx, ctx);
+	ctx.priv = *ctx_priv;
+	rc = efx->type->rx_pull_rss_context_config(efx, &ctx);
 	if (rc)
 		goto out_unlock;
 
 	if (hfunc)
 		*hfunc = ETH_RSS_HASH_TOP;
 	if (indir)
-		memcpy(indir, ctx->rx_indir_table, sizeof(ctx->rx_indir_table));
+		memcpy(indir, ctx.rx_indir_table, sizeof(ctx.rx_indir_table));
 	if (key)
-		memcpy(key, ctx->rx_hash_key, efx->type->rx_hash_key_size);
+		memcpy(key, ctx.rx_hash_key, efx->type->rx_hash_key_size);
 out_unlock:
-	mutex_unlock(&efx->rss_lock);
+	mutex_unlock(&net_dev->ethtool->rss_lock);
 	return rc;
 }
 
-int efx_ethtool_set_rxfh_context(struct net_device *net_dev,
-				 const u32 *indir, const u8 *key,
-				 const u8 hfunc, u32 *rss_context,
-				 bool delete)
+int efx_ethtool_modify_rxfh_context(struct net_device *net_dev,
+				    struct ethtool_rxfh_context *ctx,
+				    const u32 *indir, const u8 *key,
+				    const u8 hfunc, u32 rss_context,
+				    struct netlink_ext_ack *extack)
 {
 	struct efx_nic *efx = efx_netdev_priv(net_dev);
-	struct efx_rss_context *ctx;
-	bool allocated = false;
-	int rc;
+	struct efx_rss_context_priv *priv;
 
-	if (!efx->type->rx_push_rss_context_config)
+	if (!efx->type->rx_push_rss_context_config) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "NIC type does not support custom contexts");
 		return -EOPNOTSUPP;
+	}
 	/* Hash function is Toeplitz, cannot be changed */
-	if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP)
+	if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP) {
+		NL_SET_ERR_MSG_MOD(extack, "Only Toeplitz hash is supported");
 		return -EOPNOTSUPP;
+	}
 
-	mutex_lock(&efx->rss_lock);
+	priv = ethtool_rxfh_context_priv(ctx);
 
-	if (*rss_context == ETH_RXFH_CONTEXT_ALLOC) {
-		if (delete) {
-			/* alloc + delete == Nothing to do */
-			rc = -EINVAL;
-			goto out_unlock;
-		}
-		ctx = efx_alloc_rss_context_entry(efx);
-		if (!ctx) {
-			rc = -ENOMEM;
-			goto out_unlock;
-		}
-		ctx->context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
-		/* Initialise indir table and key to defaults */
-		efx_set_default_rx_indir_table(efx, ctx);
-		netdev_rss_key_fill(ctx->rx_hash_key, sizeof(ctx->rx_hash_key));
-		allocated = true;
-	} else {
-		ctx = efx_find_rss_context_entry(efx, *rss_context);
-		if (!ctx) {
-			rc = -ENOENT;
-			goto out_unlock;
-		}
-	}
+	if (!key)
+		key = ethtool_rxfh_context_key(ctx);
+	if (!indir)
+		indir = ethtool_rxfh_context_indir(ctx);
 
-	if (delete) {
-		/* delete this context */
-		rc = efx->type->rx_push_rss_context_config(efx, ctx, NULL, NULL);
-		if (!rc)
-			efx_free_rss_context_entry(ctx);
-		goto out_unlock;
+	return efx->type->rx_push_rss_context_config(efx, priv, indir, key,
+						     false);
+}
+
+int efx_ethtool_create_rxfh_context(struct net_device *net_dev,
+				    struct ethtool_rxfh_context *ctx,
+				    const u32 *indir, const u8 *key,
+				    const u8 hfunc, u32 rss_context,
+				    struct netlink_ext_ack *extack)
+{
+	struct efx_nic *efx = efx_netdev_priv(net_dev);
+	struct efx_rss_context_priv *priv;
+
+	if (!efx->type->rx_push_rss_context_config) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "NIC type does not support custom contexts");
+		return -EOPNOTSUPP;
+	}
+	/* Hash function is Toeplitz, cannot be changed */
+	if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP) {
+		NL_SET_ERR_MSG_MOD(extack, "Only Toeplitz hash is supported");
+		return -EOPNOTSUPP;
 	}
 
-	if (!key)
-		key = ctx->rx_hash_key;
+	priv = ethtool_rxfh_context_priv(ctx);
+
+	priv->context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
+	priv->rx_hash_udp_4tuple = false;
+	/* Generate default indir table and/or key if not specified.
+	 * We use ctx as a place to store these; this is fine because
+	 * we're doing a create, so if we fail then the ctx will just
+	 * be deleted.
+	 */
 	if (!indir)
-		indir = ctx->rx_indir_table;
+		efx_set_default_rx_indir_table(efx, ethtool_rxfh_context_indir(ctx));
+	if (!key)
+		netdev_rss_key_fill(ethtool_rxfh_context_key(ctx),
+				    ctx->key_size);
+	return efx_ethtool_modify_rxfh_context(net_dev, ctx, indir, key, hfunc,
+					       rss_context, extack);
+}
 
-	rc = efx->type->rx_push_rss_context_config(efx, ctx, indir, key);
-	if (rc && allocated)
-		efx_free_rss_context_entry(ctx);
-	else
-		*rss_context = ctx->user_id;
-out_unlock:
-	mutex_unlock(&efx->rss_lock);
-	return rc;
+int efx_ethtool_remove_rxfh_context(struct net_device *net_dev,
+				    struct ethtool_rxfh_context *ctx,
+				    u32 rss_context,
+				    struct netlink_ext_ack *extack)
+{
+	struct efx_nic *efx = efx_netdev_priv(net_dev);
+	struct efx_rss_context_priv *priv;
+
+	if (!efx->type->rx_push_rss_context_config) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "NIC type does not support custom contexts");
+		return -EOPNOTSUPP;
+	}
+
+	priv = ethtool_rxfh_context_priv(ctx);
+	return efx->type->rx_push_rss_context_config(efx, priv, NULL, NULL,
+						     true);
 }
 
 int efx_ethtool_reset(struct net_device *net_dev, u32 *flags)
diff --git a/drivers/net/ethernet/sfc/ethtool_common.h b/drivers/net/ethernet/sfc/ethtool_common.h
index 659491932101..3df852eaab20 100644
--- a/drivers/net/ethernet/sfc/ethtool_common.h
+++ b/drivers/net/ethernet/sfc/ethtool_common.h
@@ -50,10 +50,20 @@ int efx_ethtool_set_rxfh(struct net_device *net_dev,
 			 const u32 *indir, const u8 *key, const u8 hfunc);
 int efx_ethtool_get_rxfh_context(struct net_device *net_dev, u32 *indir,
 				 u8 *key, u8 *hfunc, u32 rss_context);
-int efx_ethtool_set_rxfh_context(struct net_device *net_dev,
-				 const u32 *indir, const u8 *key,
-				 const u8 hfunc, u32 *rss_context,
-				 bool delete);
+int efx_ethtool_create_rxfh_context(struct net_device *net_dev,
+				    struct ethtool_rxfh_context *ctx,
+				    const u32 *indir, const u8 *key,
+				    const u8 hfunc, u32 rss_context,
+				    struct netlink_ext_ack *extack);
+int efx_ethtool_modify_rxfh_context(struct net_device *net_dev,
+				    struct ethtool_rxfh_context *ctx,
+				    const u32 *indir, const u8 *key,
+				    const u8 hfunc, u32 rss_context,
+				    struct netlink_ext_ack *extack);
+int efx_ethtool_remove_rxfh_context(struct net_device *net_dev,
+				    struct ethtool_rxfh_context *ctx,
+				    u32 rss_context,
+				    struct netlink_ext_ack *extack);
 int efx_ethtool_reset(struct net_device *net_dev, u32 *flags);
 int efx_ethtool_get_module_eeprom(struct net_device *net_dev,
 				  struct ethtool_eeprom *ee,
diff --git a/drivers/net/ethernet/sfc/mcdi_filters.c b/drivers/net/ethernet/sfc/mcdi_filters.c
index 4ff6586116ee..fa6eb4ec170a 100644
--- a/drivers/net/ethernet/sfc/mcdi_filters.c
+++ b/drivers/net/ethernet/sfc/mcdi_filters.c
@@ -194,7 +194,7 @@ efx_mcdi_filter_push_prep_set_match_fields(struct efx_nic *efx,
 static void efx_mcdi_filter_push_prep(struct efx_nic *efx,
 				      const struct efx_filter_spec *spec,
 				      efx_dword_t *inbuf, u64 handle,
-				      struct efx_rss_context *ctx,
+				      struct efx_rss_context_priv *ctx,
 				      bool replacing)
 {
 	u32 flags = spec->flags;
@@ -245,7 +245,7 @@ static void efx_mcdi_filter_push_prep(struct efx_nic *efx,
 
 static int efx_mcdi_filter_push(struct efx_nic *efx,
 				const struct efx_filter_spec *spec, u64 *handle,
-				struct efx_rss_context *ctx, bool replacing)
+				struct efx_rss_context_priv *ctx, bool replacing)
 {
 	MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN);
 	MCDI_DECLARE_BUF(outbuf, MC_CMD_FILTER_OP_EXT_OUT_LEN);
@@ -345,9 +345,9 @@ static s32 efx_mcdi_filter_insert_locked(struct efx_nic *efx,
 					 bool replace_equal)
 {
 	DECLARE_BITMAP(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT);
+	struct efx_rss_context_priv *ctx = NULL;
 	struct efx_mcdi_filter_table *table;
 	struct efx_filter_spec *saved_spec;
-	struct efx_rss_context *ctx = NULL;
 	unsigned int match_pri, hash;
 	unsigned int priv_flags;
 	bool rss_locked = false;
@@ -380,12 +380,12 @@ static s32 efx_mcdi_filter_insert_locked(struct efx_nic *efx,
 		bitmap_zero(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT);
 
 	if (spec->flags & EFX_FILTER_FLAG_RX_RSS) {
-		mutex_lock(&efx->rss_lock);
+		mutex_lock(&efx->net_dev->ethtool->rss_lock);
 		rss_locked = true;
 		if (spec->rss_context)
 			ctx = efx_find_rss_context_entry(efx, spec->rss_context);
 		else
-			ctx = &efx->rss_context;
+			ctx = &efx->rss_context.priv;
 		if (!ctx) {
 			rc = -ENOENT;
 			goto out_unlock;
@@ -548,7 +548,7 @@ static s32 efx_mcdi_filter_insert_locked(struct efx_nic *efx,
 
 out_unlock:
 	if (rss_locked)
-		mutex_unlock(&efx->rss_lock);
+		mutex_unlock(&efx->net_dev->ethtool->rss_lock);
 	up_write(&table->lock);
 	return rc;
 }
@@ -611,13 +611,13 @@ static int efx_mcdi_filter_remove_internal(struct efx_nic *efx,
 
 		new_spec.priority = EFX_FILTER_PRI_AUTO;
 		new_spec.flags = (EFX_FILTER_FLAG_RX |
-				  (efx_rss_active(&efx->rss_context) ?
+				  (efx_rss_active(&efx->rss_context.priv) ?
 				   EFX_FILTER_FLAG_RX_RSS : 0));
 		new_spec.dmaq_id = 0;
 		new_spec.rss_context = 0;
 		rc = efx_mcdi_filter_push(efx, &new_spec,
 					  &table->entry[filter_idx].handle,
-					  &efx->rss_context,
+					  &efx->rss_context.priv,
 					  true);
 
 		if (rc == 0)
@@ -764,7 +764,7 @@ static int efx_mcdi_filter_insert_addr_list(struct efx_nic *efx,
 		ids = vlan->uc;
 	}
 
-	filter_flags = efx_rss_active(&efx->rss_context) ? EFX_FILTER_FLAG_RX_RSS : 0;
+	filter_flags = efx_rss_active(&efx->rss_context.priv) ? EFX_FILTER_FLAG_RX_RSS : 0;
 
 	/* Insert/renew filters */
 	for (i = 0; i < addr_count; i++) {
@@ -833,7 +833,7 @@ static int efx_mcdi_filter_insert_def(struct efx_nic *efx,
 	int rc;
 	u16 *id;
 
-	filter_flags = efx_rss_active(&efx->rss_context) ? EFX_FILTER_FLAG_RX_RSS : 0;
+	filter_flags = efx_rss_active(&efx->rss_context.priv) ? EFX_FILTER_FLAG_RX_RSS : 0;
 
 	efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0);
 
@@ -1375,8 +1375,8 @@ void efx_mcdi_filter_table_restore(struct efx_nic *efx)
 	struct efx_mcdi_filter_table *table = efx->filter_state;
 	unsigned int invalid_filters = 0, failed = 0;
 	struct efx_mcdi_filter_vlan *vlan;
+	struct efx_rss_context_priv *ctx;
 	struct efx_filter_spec *spec;
-	struct efx_rss_context *ctx;
 	unsigned int filter_idx;
 	u32 mcdi_flags;
 	int match_pri;
@@ -1388,7 +1388,7 @@ void efx_mcdi_filter_table_restore(struct efx_nic *efx)
 		return;
 
 	down_write(&table->lock);
-	mutex_lock(&efx->rss_lock);
+	mutex_lock(&efx->net_dev->ethtool->rss_lock);
 
 	for (filter_idx = 0; filter_idx < EFX_MCDI_FILTER_TBL_ROWS; filter_idx++) {
 		spec = efx_mcdi_filter_entry_spec(table, filter_idx);
@@ -1407,7 +1407,7 @@ void efx_mcdi_filter_table_restore(struct efx_nic *efx)
 		if (spec->rss_context)
 			ctx = efx_find_rss_context_entry(efx, spec->rss_context);
 		else
-			ctx = &efx->rss_context;
+			ctx = &efx->rss_context.priv;
 		if (spec->flags & EFX_FILTER_FLAG_RX_RSS) {
 			if (!ctx) {
 				netif_warn(efx, drv, efx->net_dev,
@@ -1444,7 +1444,7 @@ void efx_mcdi_filter_table_restore(struct efx_nic *efx)
 		}
 	}
 
-	mutex_unlock(&efx->rss_lock);
+	mutex_unlock(&efx->net_dev->ethtool->rss_lock);
 	up_write(&table->lock);
 
 	/*
@@ -1861,7 +1861,8 @@ bool efx_mcdi_filter_rfs_expire_one(struct efx_nic *efx, u32 flow_id,
 					 RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV6_RSS_MODE_LBN |\
 					 RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_OTHER_IPV6_RSS_MODE_LBN)
 
-int efx_mcdi_get_rss_context_flags(struct efx_nic *efx, u32 context, u32 *flags)
+static int efx_mcdi_get_rss_context_flags(struct efx_nic *efx, u32 context,
+					  u32 *flags)
 {
 	/*
 	 * Firmware had a bug (sfc bug 61952) where it would not actually
@@ -1909,8 +1910,8 @@ int efx_mcdi_get_rss_context_flags(struct efx_nic *efx, u32 context, u32 *flags)
  * Defaults are 4-tuple for TCP and 2-tuple for UDP and other-IP, so we
  * just need to set the UDP ports flags (for both IP versions).
  */
-void efx_mcdi_set_rss_context_flags(struct efx_nic *efx,
-				    struct efx_rss_context *ctx)
+static void efx_mcdi_set_rss_context_flags(struct efx_nic *efx,
+					   struct efx_rss_context_priv *ctx)
 {
 	MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_SET_FLAGS_IN_LEN);
 	u32 flags;
@@ -1931,7 +1932,7 @@ void efx_mcdi_set_rss_context_flags(struct efx_nic *efx,
 }
 
 static int efx_mcdi_filter_alloc_rss_context(struct efx_nic *efx, bool exclusive,
-					     struct efx_rss_context *ctx,
+					     struct efx_rss_context_priv *ctx,
 					     unsigned *context_size)
 {
 	MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_ALLOC_IN_LEN);
@@ -2032,25 +2033,26 @@ void efx_mcdi_rx_free_indir_table(struct efx_nic *efx)
 {
 	int rc;
 
-	if (efx->rss_context.context_id != EFX_MCDI_RSS_CONTEXT_INVALID) {
-		rc = efx_mcdi_filter_free_rss_context(efx, efx->rss_context.context_id);
+	if (efx->rss_context.priv.context_id != EFX_MCDI_RSS_CONTEXT_INVALID) {
+		rc = efx_mcdi_filter_free_rss_context(efx, efx->rss_context.priv.context_id);
 		WARN_ON(rc != 0);
 	}
-	efx->rss_context.context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
+	efx->rss_context.priv.context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
 }
 
 static int efx_mcdi_filter_rx_push_shared_rss_config(struct efx_nic *efx,
 					      unsigned *context_size)
 {
 	struct efx_mcdi_filter_table *table = efx->filter_state;
-	int rc = efx_mcdi_filter_alloc_rss_context(efx, false, &efx->rss_context,
-					    context_size);
+	int rc = efx_mcdi_filter_alloc_rss_context(efx, false,
+						   &efx->rss_context.priv,
+						   context_size);
 
 	if (rc != 0)
 		return rc;
 
 	table->rx_rss_context_exclusive = false;
-	efx_set_default_rx_indir_table(efx, &efx->rss_context);
+	efx_set_default_rx_indir_table(efx, efx->rss_context.rx_indir_table);
 	return 0;
 }
 
@@ -2058,26 +2060,27 @@ static int efx_mcdi_filter_rx_push_exclusive_rss_config(struct efx_nic *efx,
 						 const u32 *rx_indir_table,
 						 const u8 *key)
 {
+	u32 old_rx_rss_context = efx->rss_context.priv.context_id;
 	struct efx_mcdi_filter_table *table = efx->filter_state;
-	u32 old_rx_rss_context = efx->rss_context.context_id;
 	int rc;
 
-	if (efx->rss_context.context_id == EFX_MCDI_RSS_CONTEXT_INVALID ||
+	if (efx->rss_context.priv.context_id == EFX_MCDI_RSS_CONTEXT_INVALID ||
 	    !table->rx_rss_context_exclusive) {
-		rc = efx_mcdi_filter_alloc_rss_context(efx, true, &efx->rss_context,
-						NULL);
+		rc = efx_mcdi_filter_alloc_rss_context(efx, true,
+						       &efx->rss_context.priv,
+						       NULL);
 		if (rc == -EOPNOTSUPP)
 			return rc;
 		else if (rc != 0)
 			goto fail1;
 	}
 
-	rc = efx_mcdi_filter_populate_rss_table(efx, efx->rss_context.context_id,
-					 rx_indir_table, key);
+	rc = efx_mcdi_filter_populate_rss_table(efx, efx->rss_context.priv.context_id,
+						rx_indir_table, key);
 	if (rc != 0)
 		goto fail2;
 
-	if (efx->rss_context.context_id != old_rx_rss_context &&
+	if (efx->rss_context.priv.context_id != old_rx_rss_context &&
 	    old_rx_rss_context != EFX_MCDI_RSS_CONTEXT_INVALID)
 		WARN_ON(efx_mcdi_filter_free_rss_context(efx, old_rx_rss_context) != 0);
 	table->rx_rss_context_exclusive = true;
@@ -2091,9 +2094,9 @@ static int efx_mcdi_filter_rx_push_exclusive_rss_config(struct efx_nic *efx,
 	return 0;
 
 fail2:
-	if (old_rx_rss_context != efx->rss_context.context_id) {
-		WARN_ON(efx_mcdi_filter_free_rss_context(efx, efx->rss_context.context_id) != 0);
-		efx->rss_context.context_id = old_rx_rss_context;
+	if (old_rx_rss_context != efx->rss_context.priv.context_id) {
+		WARN_ON(efx_mcdi_filter_free_rss_context(efx, efx->rss_context.priv.context_id) != 0);
+		efx->rss_context.priv.context_id = old_rx_rss_context;
 	}
 fail1:
 	netif_err(efx, hw, efx->net_dev, "%s: failed rc=%d\n", __func__, rc);
@@ -2101,33 +2104,28 @@ static int efx_mcdi_filter_rx_push_exclusive_rss_config(struct efx_nic *efx,
 }
 
 int efx_mcdi_rx_push_rss_context_config(struct efx_nic *efx,
-					struct efx_rss_context *ctx,
+					struct efx_rss_context_priv *ctx,
 					const u32 *rx_indir_table,
-					const u8 *key)
+					const u8 *key, bool delete)
 {
 	int rc;
 
-	WARN_ON(!mutex_is_locked(&efx->rss_lock));
+	WARN_ON(!mutex_is_locked(&efx->net_dev->ethtool->rss_lock));
 
 	if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) {
+		if (delete)
+			/* already wasn't in HW, nothing to do */
+			return 0;
 		rc = efx_mcdi_filter_alloc_rss_context(efx, true, ctx, NULL);
 		if (rc)
 			return rc;
 	}
 
-	if (!rx_indir_table) /* Delete this context */
+	if (delete) /* Delete this context */
 		return efx_mcdi_filter_free_rss_context(efx, ctx->context_id);
 
-	rc = efx_mcdi_filter_populate_rss_table(efx, ctx->context_id,
-					 rx_indir_table, key);
-	if (rc)
-		return rc;
-
-	memcpy(ctx->rx_indir_table, rx_indir_table,
-	       sizeof(efx->rss_context.rx_indir_table));
-	memcpy(ctx->rx_hash_key, key, efx->type->rx_hash_key_size);
-
-	return 0;
+	return efx_mcdi_filter_populate_rss_table(efx, ctx->context_id,
+						  rx_indir_table, key);
 }
 
 int efx_mcdi_rx_pull_rss_context_config(struct efx_nic *efx,
@@ -2139,16 +2137,16 @@ int efx_mcdi_rx_pull_rss_context_config(struct efx_nic *efx,
 	size_t outlen;
 	int rc, i;
 
-	WARN_ON(!mutex_is_locked(&efx->rss_lock));
+	WARN_ON(!mutex_is_locked(&efx->net_dev->ethtool->rss_lock));
 
 	BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN !=
 		     MC_CMD_RSS_CONTEXT_GET_KEY_IN_LEN);
 
-	if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID)
+	if (ctx->priv.context_id == EFX_MCDI_RSS_CONTEXT_INVALID)
 		return -ENOENT;
 
 	MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_TABLE_IN_RSS_CONTEXT_ID,
-		       ctx->context_id);
+		       ctx->priv.context_id);
 	BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_indir_table) !=
 		     MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE_LEN);
 	rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_TABLE, inbuf, sizeof(inbuf),
@@ -2164,7 +2162,7 @@ int efx_mcdi_rx_pull_rss_context_config(struct efx_nic *efx,
 				RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE)[i];
 
 	MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_KEY_IN_RSS_CONTEXT_ID,
-		       ctx->context_id);
+		       ctx->priv.context_id);
 	BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_hash_key) !=
 		     MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN);
 	rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_KEY, inbuf, sizeof(inbuf),
@@ -2186,35 +2184,42 @@ int efx_mcdi_rx_pull_rss_config(struct efx_nic *efx)
 {
 	int rc;
 
-	mutex_lock(&efx->rss_lock);
+	mutex_lock(&efx->net_dev->ethtool->rss_lock);
 	rc = efx_mcdi_rx_pull_rss_context_config(efx, &efx->rss_context);
-	mutex_unlock(&efx->rss_lock);
+	mutex_unlock(&efx->net_dev->ethtool->rss_lock);
 	return rc;
 }
 
 void efx_mcdi_rx_restore_rss_contexts(struct efx_nic *efx)
 {
 	struct efx_mcdi_filter_table *table = efx->filter_state;
-	struct efx_rss_context *ctx;
+	struct ethtool_rxfh_context *ctx;
+	u32 context;
 	int rc;
 
-	WARN_ON(!mutex_is_locked(&efx->rss_lock));
+	WARN_ON(!mutex_is_locked(&efx->net_dev->ethtool->rss_lock));
 
 	if (!table->must_restore_rss_contexts)
 		return;
 
-	list_for_each_entry(ctx, &efx->rss_context.list, list) {
+	idr_for_each_entry(&efx->net_dev->ethtool->rss_ctx, ctx, context) {
+		struct efx_rss_context_priv *priv;
+		u32 *indir;
+		u8 *key;
+
+		priv = ethtool_rxfh_context_priv(ctx);
 		/* previous NIC RSS context is gone */
-		ctx->context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
+		priv->context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
 		/* so try to allocate a new one */
-		rc = efx_mcdi_rx_push_rss_context_config(efx, ctx,
-							 ctx->rx_indir_table,
-							 ctx->rx_hash_key);
+		indir = ethtool_rxfh_context_indir(ctx);
+		key = ethtool_rxfh_context_key(ctx);
+		rc = efx_mcdi_rx_push_rss_context_config(efx, priv, indir, key,
+							 false);
 		if (rc)
 			netif_warn(efx, probe, efx->net_dev,
 				   "failed to restore RSS context %u, rc=%d"
 				   "; RSS filters may fail to be applied\n",
-				   ctx->user_id, rc);
+				   context, rc);
 	}
 	table->must_restore_rss_contexts = false;
 }
@@ -2276,7 +2281,7 @@ int efx_mcdi_vf_rx_push_rss_config(struct efx_nic *efx, bool user,
 {
 	if (user)
 		return -EOPNOTSUPP;
-	if (efx->rss_context.context_id != EFX_MCDI_RSS_CONTEXT_INVALID)
+	if (efx->rss_context.priv.context_id != EFX_MCDI_RSS_CONTEXT_INVALID)
 		return 0;
 	return efx_mcdi_filter_rx_push_shared_rss_config(efx, NULL);
 }
@@ -2295,7 +2300,7 @@ int efx_mcdi_push_default_indir_table(struct efx_nic *efx,
 
 	efx_mcdi_rx_free_indir_table(efx);
 	if (rss_spread > 1) {
-		efx_set_default_rx_indir_table(efx, &efx->rss_context);
+		efx_set_default_rx_indir_table(efx, efx->rss_context.rx_indir_table);
 		rc = efx->type->rx_push_rss_config(efx, false,
 				   efx->rss_context.rx_indir_table, NULL);
 	}
diff --git a/drivers/net/ethernet/sfc/mcdi_filters.h b/drivers/net/ethernet/sfc/mcdi_filters.h
index c0d6558b9fd2..11b9f87ed9e1 100644
--- a/drivers/net/ethernet/sfc/mcdi_filters.h
+++ b/drivers/net/ethernet/sfc/mcdi_filters.h
@@ -145,9 +145,9 @@ void efx_mcdi_filter_del_vlan(struct efx_nic *efx, u16 vid);
 
 void efx_mcdi_rx_free_indir_table(struct efx_nic *efx);
 int efx_mcdi_rx_push_rss_context_config(struct efx_nic *efx,
-					struct efx_rss_context *ctx,
+					struct efx_rss_context_priv *ctx,
 					const u32 *rx_indir_table,
-					const u8 *key);
+					const u8 *key, bool delete);
 int efx_mcdi_pf_rx_push_rss_config(struct efx_nic *efx, bool user,
 				   const u32 *rx_indir_table,
 				   const u8 *key);
@@ -161,10 +161,6 @@ int efx_mcdi_push_default_indir_table(struct efx_nic *efx,
 int efx_mcdi_rx_pull_rss_config(struct efx_nic *efx);
 int efx_mcdi_rx_pull_rss_context_config(struct efx_nic *efx,
 					struct efx_rss_context *ctx);
-int efx_mcdi_get_rss_context_flags(struct efx_nic *efx, u32 context,
-				   u32 *flags);
-void efx_mcdi_set_rss_context_flags(struct efx_nic *efx,
-				    struct efx_rss_context *ctx);
 void efx_mcdi_rx_restore_rss_contexts(struct efx_nic *efx);
 
 static inline void efx_mcdi_update_rx_scatter(struct efx_nic *efx)
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index fcd51d3992fa..bceeada24d6c 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -770,21 +770,24 @@ struct vfdi_status;
 /* The reserved RSS context value */
 #define EFX_MCDI_RSS_CONTEXT_INVALID	0xffffffff
 /**
- * struct efx_rss_context - A user-defined RSS context for filtering
- * @list: node of linked list on which this struct is stored
+ * struct efx_rss_context_priv - driver private data for an RSS context
  * @context_id: the RSS_CONTEXT_ID returned by MC firmware, or
  *	%EFX_MCDI_RSS_CONTEXT_INVALID if this context is not present on the NIC.
- *	For Siena, 0 if RSS is active, else %EFX_MCDI_RSS_CONTEXT_INVALID.
- * @user_id: the rss_context ID exposed to userspace over ethtool.
  * @rx_hash_udp_4tuple: UDP 4-tuple hashing enabled
+ */
+struct efx_rss_context_priv {
+	u32 context_id;
+	bool rx_hash_udp_4tuple;
+};
+
+/**
+ * struct efx_rss_context - an RSS context
+ * @priv: hardware-specific state
  * @rx_hash_key: Toeplitz hash key for this RSS context
  * @indir_table: Indirection table for this RSS context
  */
 struct efx_rss_context {
-	struct list_head list;
-	u32 context_id;
-	u32 user_id;
-	bool rx_hash_udp_4tuple;
+	struct efx_rss_context_priv priv;
 	u8 rx_hash_key[40];
 	u32 rx_indir_table[128];
 };
@@ -917,9 +920,7 @@ struct efx_mae;
  * @rx_packet_ts_offset: Offset of timestamp from start of packet data
  *	(valid only if channel->sync_timestamps_enabled; always negative)
  * @rx_scatter: Scatter mode enabled for receives
- * @rss_context: Main RSS context.  Its @list member is the head of the list of
- *	RSS contexts created by user requests
- * @rss_lock: Protects custom RSS context software state in @rss_context.list
+ * @rss_context: Main RSS context.
  * @vport_id: The function's vport ID, only relevant for PFs
  * @int_error_count: Number of internal errors seen recently
  * @int_error_expire: Time at which error count will be expired
@@ -1090,7 +1091,6 @@ struct efx_nic {
 	int rx_packet_ts_offset;
 	bool rx_scatter;
 	struct efx_rss_context rss_context;
-	struct mutex rss_lock;
 	u32 vport_id;
 
 	unsigned int_error_count;
@@ -1462,9 +1462,9 @@ struct efx_nic_type {
 				  const u32 *rx_indir_table, const u8 *key);
 	int (*rx_pull_rss_config)(struct efx_nic *efx);
 	int (*rx_push_rss_context_config)(struct efx_nic *efx,
-					  struct efx_rss_context *ctx,
+					  struct efx_rss_context_priv *ctx,
 					  const u32 *rx_indir_table,
-					  const u8 *key);
+					  const u8 *key, bool delete);
 	int (*rx_pull_rss_context_config)(struct efx_nic *efx,
 					  struct efx_rss_context *ctx);
 	void (*rx_restore_rss_contexts)(struct efx_nic *efx);
diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
index d2f35ee15eff..f5632c210ab2 100644
--- a/drivers/net/ethernet/sfc/rx_common.c
+++ b/drivers/net/ethernet/sfc/rx_common.c
@@ -556,69 +556,25 @@ efx_rx_packet_gro(struct efx_channel *channel, struct efx_rx_buffer *rx_buf,
 	napi_gro_frags(napi);
 }
 
-/* RSS contexts.  We're using linked lists and crappy O(n) algorithms, because
- * (a) this is an infrequent control-plane operation and (b) n is small (max 64)
- */
-struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx)
+struct efx_rss_context_priv *efx_find_rss_context_entry(struct efx_nic *efx,
+							u32 id)
 {
-	struct list_head *head = &efx->rss_context.list;
-	struct efx_rss_context *ctx, *new;
-	u32 id = 1; /* Don't use zero, that refers to the master RSS context */
-
-	WARN_ON(!mutex_is_locked(&efx->rss_lock));
+	struct ethtool_rxfh_context *ctx;
 
-	/* Search for first gap in the numbering */
-	list_for_each_entry(ctx, head, list) {
-		if (ctx->user_id != id)
-			break;
-		id++;
-		/* Check for wrap.  If this happens, we have nearly 2^32
-		 * allocated RSS contexts, which seems unlikely.
-		 */
-		if (WARN_ON_ONCE(!id))
-			return NULL;
-	}
+	WARN_ON(!mutex_is_locked(&efx->net_dev->ethtool->rss_lock));
 
-	/* Create the new entry */
-	new = kmalloc(sizeof(*new), GFP_KERNEL);
-	if (!new)
+	ctx = idr_find(&efx->net_dev->ethtool->rss_ctx, id);
+	if (!ctx)
 		return NULL;
-	new->context_id = EFX_MCDI_RSS_CONTEXT_INVALID;
-	new->rx_hash_udp_4tuple = false;
-
-	/* Insert the new entry into the gap */
-	new->user_id = id;
-	list_add_tail(&new->list, &ctx->list);
-	return new;
-}
-
-struct efx_rss_context *efx_find_rss_context_entry(struct efx_nic *efx, u32 id)
-{
-	struct list_head *head = &efx->rss_context.list;
-	struct efx_rss_context *ctx;
-
-	WARN_ON(!mutex_is_locked(&efx->rss_lock));
-
-	list_for_each_entry(ctx, head, list)
-		if (ctx->user_id == id)
-			return ctx;
-	return NULL;
-}
-
-void efx_free_rss_context_entry(struct efx_rss_context *ctx)
-{
-	list_del(&ctx->list);
-	kfree(ctx);
+	return ethtool_rxfh_context_priv(ctx);
 }
 
-void efx_set_default_rx_indir_table(struct efx_nic *efx,
-				    struct efx_rss_context *ctx)
+void efx_set_default_rx_indir_table(struct efx_nic *efx, u32 *indir)
 {
 	size_t i;
 
-	for (i = 0; i < ARRAY_SIZE(ctx->rx_indir_table); i++)
-		ctx->rx_indir_table[i] =
-			ethtool_rxfh_indir_default(i, efx->rss_spread);
+	for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_indir_table); i++)
+		indir[i] = ethtool_rxfh_indir_default(i, efx->rss_spread);
 }
 
 /**
diff --git a/drivers/net/ethernet/sfc/rx_common.h b/drivers/net/ethernet/sfc/rx_common.h
index fbd2769307f9..75fa84192362 100644
--- a/drivers/net/ethernet/sfc/rx_common.h
+++ b/drivers/net/ethernet/sfc/rx_common.h
@@ -84,11 +84,9 @@ void
 efx_rx_packet_gro(struct efx_channel *channel, struct efx_rx_buffer *rx_buf,
 		  unsigned int n_frags, u8 *eh, __wsum csum);
 
-struct efx_rss_context *efx_alloc_rss_context_entry(struct efx_nic *efx);
-struct efx_rss_context *efx_find_rss_context_entry(struct efx_nic *efx, u32 id);
-void efx_free_rss_context_entry(struct efx_rss_context *ctx);
-void efx_set_default_rx_indir_table(struct efx_nic *efx,
-				    struct efx_rss_context *ctx);
+struct efx_rss_context_priv *efx_find_rss_context_entry(struct efx_nic *efx,
+							u32 id);
+void efx_set_default_rx_indir_table(struct efx_nic *efx, u32 *indir);
 
 bool efx_filter_is_mc_recipient(const struct efx_filter_spec *spec);
 bool efx_filter_spec_equal(const struct efx_filter_spec *left,

* Re: [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice edward.cree
@ 2023-04-11 20:36   ` Andrew Lunn
  2023-04-12 15:52     ` Edward Cree
  2023-04-13  1:39   ` Jakub Kicinski
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Lunn @ 2023-04-11 20:36 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, Edward Cree,
	netdev, habetsm.xilinx, sudheer.mogilappagari

>  /**
>   * struct ethtool_netdev_state - per-netdevice state for ethtool features
> + * @rss_ctx_max_id:	maximum (exclusive) supported RSS context ID
> + * @rss_ctx:		IDR storing custom RSS context state
>   * @wol_enabled:	Wake-on-LAN is enabled
>   */
>  struct ethtool_netdev_state {
> +	u32			rss_ctx_max_id;
> +	struct idr		rss_ctx;
>  	unsigned		wol_enabled:1;
>  };

A nitpick. On 64 bit systems, you have a hole between rss_ctx_max_id
and rss_ctx. If you swap those around, and change wol_enabled to also
be a u32 bitfield, the compiler can probably do without the hole.

   Andrew
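
The hole described above can be reproduced in a small userspace sketch. This is an illustration only: `struct fake_idr` is a hypothetical stand-in for the kernel's `struct idr` (all that matters is that it contains a pointer and is therefore 8-byte aligned on common 64-bit ABIs), and the two struct names are made up for the comparison.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof, size_t */
#include <stdint.h>

/* Hypothetical stand-in for struct idr: pointer-aligned, 16 bytes on LP64. */
struct fake_idr {
	void *root;
	unsigned int base;
	unsigned int next;
};

/* Field order as posted in the patch. */
struct state_as_posted {
	uint32_t	rss_ctx_max_id;
	struct fake_idr	rss_ctx;
	unsigned	wol_enabled:1;
};

/* Field order after the suggested swap, with a u32 bitfield. */
struct state_reordered {
	struct fake_idr	rss_ctx;
	uint32_t	rss_ctx_max_id;
	uint32_t	wol_enabled:1;
};

/* Bytes of padding between rss_ctx_max_id and rss_ctx in the posted layout. */
size_t hole_as_posted(void)
{
	return offsetof(struct state_as_posted, rss_ctx) - sizeof(uint32_t);
}

static_assert(sizeof(struct state_reordered) <= sizeof(struct state_as_posted),
	      "reordering should never make the struct larger");
```

On a typical 64-bit build the posted order leaves a 4-byte hole after `rss_ctx_max_id` plus tail padding after the bitfield, while the reordered layout has neither. On the real structure, `pahole` would give the authoritative answer.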

* Re: [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct edward.cree
@ 2023-04-11 20:37   ` Andrew Lunn
  2023-04-13  1:36   ` Jakub Kicinski
  1 sibling, 0 replies; 24+ messages in thread
From: Andrew Lunn @ 2023-04-11 20:37 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, Edward Cree,
	netdev, habetsm.xilinx, sudheer.mogilappagari

On Tue, Apr 11, 2023 at 07:26:09PM +0100, edward.cree@amd.com wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> net_dev->ethtool is a pointer to new struct ethtool_netdev_state, which
>  currently contains only the wol_enabled field.
> 
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts edward.cree
@ 2023-04-11 20:40   ` Andrew Lunn
  2023-04-12 16:16     ` Edward Cree
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Lunn @ 2023-04-11 20:40 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, Edward Cree,
	netdev, habetsm.xilinx, sudheer.mogilappagari

On Tue, Apr 11, 2023 at 07:26:14PM +0100, edward.cree@amd.com wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> While this is not needed to serialise the ethtool entry points (which
>  are all under RTNL), drivers may have cause to asynchronously access
>  dev->ethtool->rss_ctx; taking dev->ethtool->rss_lock allows them to
>  do this safely without needing to take the RTNL.

What is actually wrong with taking RTNL? KISS is often best,
especially for locks.

     Andrew

* Re: [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice
  2023-04-11 20:36   ` Andrew Lunn
@ 2023-04-12 15:52     ` Edward Cree
  0 siblings, 0 replies; 24+ messages in thread
From: Edward Cree @ 2023-04-12 15:52 UTC (permalink / raw)
  To: Andrew Lunn, edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On 11/04/2023 21:36, Andrew Lunn wrote:
>>  /**
>>   * struct ethtool_netdev_state - per-netdevice state for ethtool features
>> + * @rss_ctx_max_id:	maximum (exclusive) supported RSS context ID
>> + * @rss_ctx:		IDR storing custom RSS context state
>>   * @wol_enabled:	Wake-on-LAN is enabled
>>   */
>>  struct ethtool_netdev_state {
>> +	u32			rss_ctx_max_id;
>> +	struct idr		rss_ctx;
>>  	unsigned		wol_enabled:1;
>>  };
> 
> A nitpick. On 64 bit systems, you have a hole between rss_ctx_max_id
> and rss_ctx. If you swap those around, and change wol_enabled to also
> be a u32 bitfield, the compiler can probably do without the hole.

Sure, makes sense.

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-11 20:40   ` Andrew Lunn
@ 2023-04-12 16:16     ` Edward Cree
  2023-04-12 17:15       ` Andrew Lunn
  0 siblings, 1 reply; 24+ messages in thread
From: Edward Cree @ 2023-04-12 16:16 UTC (permalink / raw)
  To: Andrew Lunn, edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On 11/04/2023 21:40, Andrew Lunn wrote:
> On Tue, Apr 11, 2023 at 07:26:14PM +0100, edward.cree@amd.com wrote:
>> While this is not needed to serialise the ethtool entry points (which
>>  are all under RTNL), drivers may have cause to asynchronously access
>>  dev->ethtool->rss_ctx; taking dev->ethtool->rss_lock allows them to
>>  do this safely without needing to take the RTNL.
> 
> What is actually wrong with taking RTNL? KISS is often best,
> especially for locks.

The examples I have of driver code that needs to access rss_ctx (in the
 sfc driver) are deep inside call chains where RTNL may or may not
 already be held.
1) filter insertion.  E.g. ethtool -U is already holding RTNL, but other
 sources of filters (e.g. aRFS) aren't, and thus taking it if necessary
 might mean passing a 'bool rtnl_locked' all the way down the chain.
2) device reset handling (we have to restore the RSS contexts in the
 hardware after a reset).  Again resets don't always happen under RTNL,
 and I don't fully understand the details (EFX_ASSERT_RESET_SERIALISED()).
So it makes life much simpler if we just have a finer-grained lock we can
 just take when we need to.
Also, RTNL is a very big hammer that serialises all kinds of operations
 across all the netdevs in the system; holding it for any length of time
 can cause annoying user-visible latency (e.g. iirc sshd accepting a new
 connection has to wait for it), so I prefer to avoid it if possible.  If
 anything we want to be breaking up this BKL[1], not making it bigger.

-ed

[1]: https://legacy.netdevconf.info/2.2/slides/westphal-rtnlmutex-talk.pdf
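
The argument for a finer-grained lock can be sketched in userspace as follows. This is a toy model, not sfc code: `struct toy_mutex` is a single-threaded stand-in for a kernel mutex, `big_lock` plays the role of RTNL, and all function names are hypothetical. The point it shows is that the deep helper takes only the per-device lock, so it never needs a `bool rtnl_locked` argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded stand-in for a kernel mutex. */
struct toy_mutex { bool held; };

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);	/* would deadlock in real life */
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

struct toy_dev {
	struct toy_mutex rss_lock;	/* protects rss_ctx_count only */
	unsigned int rss_ctx_count;
};

struct toy_mutex big_lock;		/* stand-in for RTNL */

/* Deep in the call chain: takes only the fine-grained lock, so it does not
 * need to know whether its caller already holds big_lock.
 */
unsigned int insert_filter(struct toy_dev *dev)
{
	unsigned int n;

	toy_lock(&dev->rss_lock);
	n = dev->rss_ctx_count;
	toy_unlock(&dev->rss_lock);
	return n;
}

/* Caller that already holds the big lock (think "ethtool -U"). */
unsigned int config_path(struct toy_dev *dev)
{
	unsigned int n;

	toy_lock(&big_lock);
	n = insert_filter(dev);
	toy_unlock(&big_lock);
	return n;
}

/* Caller that does not hold the big lock (think a deferred work item). */
unsigned int async_path(struct toy_dev *dev)
{
	return insert_filter(dev);
}
```

Both callers reach `insert_filter()` through the same signature. The cost, as raised later in the thread, is that the driver now participates in the locking.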

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-12 16:16     ` Edward Cree
@ 2023-04-12 17:15       ` Andrew Lunn
  2023-04-12 17:42         ` Edward Cree
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Lunn @ 2023-04-12 17:15 UTC (permalink / raw)
  To: Edward Cree
  Cc: edward.cree, linux-net-drivers, davem, kuba, pabeni, edumazet,
	netdev, habetsm.xilinx, sudheer.mogilappagari

On Wed, Apr 12, 2023 at 05:16:11PM +0100, Edward Cree wrote:
> On 11/04/2023 21:40, Andrew Lunn wrote:
> > On Tue, Apr 11, 2023 at 07:26:14PM +0100, edward.cree@amd.com wrote:
> >> While this is not needed to serialise the ethtool entry points (which
> >>  are all under RTNL), drivers may have cause to asynchronously access
> >>  dev->ethtool->rss_ctx; taking dev->ethtool->rss_lock allows them to
> >>  do this safely without needing to take the RTNL.
> > 
> > What is actually wrong with taking RTNL? KISS is often best,
> > especially for locks.
> 
> The examples I have of driver code that needs to access rss_ctx (in the
>  sfc driver) are deep inside call chains where RTNL may or may not
>  already be held.
> 1) filter insertion.  E.g. ethtool -U is already holding RTNL, but other
>  sources of filters (e.g. aRFS) aren't, and thus taking it if necessary
>  might mean passing a 'bool rtnl_locked' all the way down the chain.
> 2) device reset handling (we have to restore the RSS contexts in the
>  hardware after a reset).  Again resets don't always happen under RTNL,
>  and I don't fully understand the details (EFX_ASSERT_RESET_SERIALISED()).
> So it makes life much simpler if we just have a finer-grained lock we can
>  just take when we need to.
> Also, RTNL is a very big hammer that serialises all kinds of operations
>  across all the netdevs in the system; holding it for any length of time
>  can cause annoying user-visible latency (e.g. iirc sshd accepting a new
>  connection has to wait for it), so I prefer to avoid it if possible.  If
>  anything we want to be breaking up this BKL[1], not making it bigger.

Hi Ed

I have to wonder if your locking model is wrong. When i look at the
next patch, i see the driver is also using this lock. And i generally
find that is wrong.

As a rule of thumb, driver writers don't understand locking. Yes, there
are some that do, but most don't. As core code developers, i find it
good practice to have the locks in the core, and only in the
core. Driver writers should not need to worry about locking. The API
into the driver will take the locks needed before entering the driver,
and release them on exit.

So i don't really agree with 'it makes life much simpler if we just
have a finer-grained lock'. It makes life more complex having to help
driver writers find the corruption and deadlock bugs in their code
because they got the locking wrong.

Please try to work on the abstraction so that drivers don't need this
lock, just the core.

	Andrew
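
The "locks only in the core" rule of thumb can be sketched like this. It is a toy userspace model under stated assumptions: `struct toy_mutex` is a single-threaded stand-in for a kernel mutex, and `core_dev`, `core_ops`, `core_set_ctx()` and `demo_driver_set_ctx()` are hypothetical names, not existing ethtool API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded stand-in for a kernel mutex. */
struct toy_mutex { bool held; };

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

struct core_dev;

/* Driver callbacks: invoked by the core with dev->lock already held. */
struct core_ops {
	int (*set_ctx)(struct core_dev *dev, unsigned int id);
};

struct core_dev {
	struct toy_mutex lock;		/* owned and taken by the core only */
	const struct core_ops *ops;
	unsigned int last_id;		/* toy driver state */
};

/* Core entry point: lock, call into the driver, unlock. */
int core_set_ctx(struct core_dev *dev, unsigned int id)
{
	int rc;

	toy_lock(&dev->lock);
	rc = dev->ops->set_ctx(dev, id);
	toy_unlock(&dev->lock);
	return rc;
}

/* Example driver callback: relies on the core's guarantee, takes no lock. */
int demo_driver_set_ctx(struct core_dev *dev, unsigned int id)
{
	assert(dev->lock.held);
	dev->last_id = id;
	return 0;
}

const struct core_ops demo_ops = { .set_ctx = demo_driver_set_ctx };
```

This covers core-initiated calls only. Events that start inside the driver, such as restoring state after a device reset, have no core entry point to take the lock for them in this model.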

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-12 17:15       ` Andrew Lunn
@ 2023-04-12 17:42         ` Edward Cree
  2023-04-13  2:06           ` Jakub Kicinski
  0 siblings, 1 reply; 24+ messages in thread
From: Edward Cree @ 2023-04-12 17:42 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: edward.cree, linux-net-drivers, davem, kuba, pabeni, edumazet,
	netdev, habetsm.xilinx, sudheer.mogilappagari

On 12/04/2023 18:15, Andrew Lunn wrote:
> I have to wonder if your locking model is wrong. When i look at the
> next patch, i see the driver is also using this lock. And i generally
> find that is wrong.
...
> Driver writers should not need to worry about locking. The API
> into the driver will take the locks needed before entering the driver,
> and release them on exit.
I don't think that's possible without increasing driver complexity in
 other ways.  Essentially, for the driver to take advantage of the core
 tracking these contexts, and thus not need its own data structures to
 track them as well (like the efx->rss_context.list we remove in patch
 #7), it has to be able to access them on driver-initiated (not just
 core-initiated) events.  (The central example of this being "oh, the
 NIC MCPU just rebooted, we have to reinstall all our filters".)  And
 it needs to be able to exclude writes while it does that, not only for
 consistency but also because e.g. context deletion will free that
 memory (though I guess we could finesse that part with RCU?).
What I *could* do is add suitable wrapper functions for access to
 dev->ethtool->rss_ctx (e.g. a core equivalent of
 efx_find_rss_context_entry() that wraps the idr_find()) and have them
 assert that the lock is held (like efx_find_rss_context_entry() does);
 that would at least validate the driver locking somewhat.
But having those helper functions perform the locking themselves would
 mean going to a get/put model for managing the lifetime of the
 driver's reference (with a separate get_write for exclusive access),
 at which point it's probably harder for driver writers to understand
 than "any time you're touching rss_ctx you need to hold the rss_lock".

-ed
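
The wrapper idea mentioned above (a core lookup helper that checks the lock is held rather than taking it) might look roughly like the sketch below. Everything here is hypothetical: a fixed array stands in for the IDR, a `bool` stands in for `mutex_is_locked()`, and a counter stands in for the `WARN_ON()` splat.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CTX 8

struct toy_ctx {
	bool in_use;
};

struct toy_dev {
	bool rss_lock_held;		/* stand-in for mutex_is_locked() */
	unsigned int lock_violations;	/* counts would-be WARN_ON() hits */
	struct toy_ctx ctx[TOY_MAX_CTX];	/* stand-in for the IDR */
};

/* Look up a context by ID.  Does not take the lock itself; it only records
 * a violation if the caller forgot to, mirroring a WARN_ON() check.
 */
struct toy_ctx *toy_find_ctx(struct toy_dev *dev, unsigned int id)
{
	assert(dev);

	if (!dev->rss_lock_held)
		dev->lock_violations++;

	if (id >= TOY_MAX_CTX || !dev->ctx[id].in_use)
		return NULL;
	return &dev->ctx[id];
}
```

A helper like this validates driver locking at run time but does not remove the rule that callers must hold the lock, which is the trade-off being weighed against a get/put model.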

* Re: [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct edward.cree
  2023-04-11 20:37   ` Andrew Lunn
@ 2023-04-13  1:36   ` Jakub Kicinski
  1 sibling, 0 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-04-13  1:36 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, pabeni, edumazet, Edward Cree, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On Tue, 11 Apr 2023 19:26:09 +0100 edward.cree@amd.com wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> net_dev->ethtool is a pointer to new struct ethtool_netdev_state, which
>  currently contains only the wol_enabled field.
> 
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>

Reviewed-by: Jakub Kicinski <kuba@kernel.org>

* Re: [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice edward.cree
  2023-04-11 20:36   ` Andrew Lunn
@ 2023-04-13  1:39   ` Jakub Kicinski
  1 sibling, 0 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-04-13  1:39 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, pabeni, edumazet, Edward Cree, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On Tue, 11 Apr 2023 19:26:10 +0100 edward.cree@amd.com wrote:
> +static void netdev_rss_contexts_free(struct net_device *dev)
> +{
> +	struct ethtool_rxfh_context *ctx;
> +	u32 context;
> +
> +	if (!dev->ethtool_ops->set_rxfh_context)
> +		return;
> +	idr_for_each_entry(&dev->ethtool->rss_ctx, ctx, context) {
> +		u32 *indir = ethtool_rxfh_context_indir(ctx);
> +		u8 *key = ethtool_rxfh_context_key(ctx);
> +
> +		idr_remove(&dev->ethtool->rss_ctx, context);
> +		dev->ethtool_ops->set_rxfh_context(dev, indir, key, ctx->hfunc,
> +						   &context, true);
> +		kfree(ctx);
> +	}
> +}

nit: maybe move the ethtool related code out to a new
net/ethtool/netdev.c ? We can probably put forward declarations 
in net/core/dev.h or a new header among sources?

* Re: [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR edward.cree
@ 2023-04-13  1:49   ` Jakub Kicinski
  2023-06-09 20:01     ` Edward Cree
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-04-13  1:49 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, pabeni, edumazet, Edward Cree, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On Tue, 11 Apr 2023 19:26:11 +0100 edward.cree@amd.com wrote:
>  	if (rxfh.rss_context)
>  		ret = ops->set_rxfh_context(dev, indir, hkey, rxfh.hfunc,
>  					    &rxfh.rss_context, delete);
> @@ -1350,6 +1377,40 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
>  		else if (rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE)
>  			dev->priv_flags |= IFF_RXFH_CONFIGURED;
>  	}

This is probably transient but I think we're potentially leaking @ctx
in a goto out hiding inside the context here, and...

> +	/* Update rss_ctx tracking */
> +	if (create) {
> +		/* Ideally this should happen before calling the driver,
> +		 * so that we can fail more cleanly; but we don't have the
> +		 * context ID until the driver picks it, so we have to
> +		 * wait until after.
> +		 */
> +		if (WARN_ON(idr_find(&dev->ethtool->rss_ctx, rxfh.rss_context)))
> +			/* context ID reused, our tracking is screwed */
> +			goto out;

here.
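
The kind of leak being pointed at, and the usual way to avoid it, can be sketched in userspace. This is not the ethtool code: `toy_registry` is a hypothetical one-slot stand-in for the IDR, `driver_fails` simulates the driver callback returning an error, and the error codes are chosen only for illustration. Every exit taken before ownership moves to the registry goes through a label that frees the allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_ctx {
	unsigned int id;
};

/* Hypothetical one-slot registry standing in for the IDR. */
struct toy_registry {
	struct toy_ctx *slot;
};

int toy_create_ctx(struct toy_registry *reg, unsigned int id, bool driver_fails)
{
	struct toy_ctx *ctx;
	int ret;

	assert(reg);

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;
	ctx->id = id;

	if (driver_fails) {	/* stand-in for the driver call failing */
		ret = -EIO;
		goto out_free;
	}

	if (reg->slot) {	/* ID already tracked: bail out, but don't leak */
		ret = -EEXIST;
		goto out_free;
	}

	reg->slot = ctx;	/* ownership moves to the registry */
	return 0;

out_free:
	free(ctx);
	return ret;
}
```

The same shape applies whether the early exit is a driver error or a tracking inconsistency: once the context has been allocated, no path may return without either storing or freeing it.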

* Re: [RFC PATCH v2 net-next 4/7] net: ethtool: let the core choose RSS context IDs
  2023-04-11 18:26 ` [RFC PATCH v2 net-next 4/7] net: ethtool: let the core choose RSS context IDs edward.cree
@ 2023-04-13  1:53   ` Jakub Kicinski
  0 siblings, 0 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-04-13  1:53 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, pabeni, edumazet, Edward Cree, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On Tue, 11 Apr 2023 19:26:12 +0100 edward.cree@amd.com wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> Add a new API to create/modify/remove RSS contexts, that passes in the
>  newly-chosen context ID (not as a pointer) rather than leaving the
>  driver to choose it on create.  Also pass in the ctx, allowing drivers
>  to easily use its private data area to store their hardware-specific
>  state.
> Keep the existing .set_rxfh_context API for now as a fallback, but
>  deprecate it.
> 
> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>

Reviewed-by: Jakub Kicinski <kuba@kernel.org>

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-12 17:42         ` Edward Cree
@ 2023-04-13  2:06           ` Jakub Kicinski
  2023-04-13 21:52             ` Edward Cree
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-04-13  2:06 UTC (permalink / raw)
  To: Edward Cree
  Cc: Andrew Lunn, edward.cree, linux-net-drivers, davem, pabeni,
	edumazet, netdev, habetsm.xilinx, sudheer.mogilappagari

On Wed, 12 Apr 2023 18:42:16 +0100 Edward Cree wrote:
> On 12/04/2023 18:15, Andrew Lunn wrote:
> > I have to wonder if your locking model is wrong. When i look at the
> > next patch, i see the driver is also using this lock. And i generally
> > find that is wrong.  
> ...
> > Driver writers should not need to worry about locking. The API
> > into the driver will take the locks needed before entering the driver,
> > and release them on exit.  
> I don't think that's possible without increasing driver complexity in
>  other ways.  Essentially, for the driver to take advantage of the core
>  tracking these contexts, and thus not need its own data structures to
>  track them as well (like the efx->rss_context.list we remove in patch
>  #7), it has to be able to access them on driver-initiated (not just
>  core-initiated) events.  (The central example of this being "oh, the
>  NIC MCPU just rebooted, we have to reinstall all our filters".)  And
>  it needs to be able to exclude writes while it does that, not only for
>  consistency but also because e.g. context deletion will free that
>  memory (though I guess we could finesse that part with RCU?).
> What I *could* do is add suitable wrapper functions for access to
>  dev->ethtool->rss_ctx (e.g. a core equivalent of
>  efx_find_rss_context_entry() that wraps the idr_find()) and have them
>  assert that the lock is held (like efx_find_rss_context_entry() does);
>  that would at least validate the driver locking somewhat.
> But having those helper functions perform the locking themselves would
>  mean going to a get/put model for managing the lifetime of the
>  driver's reference (with a separate get_write for exclusive access),
>  at which point it's probably harder for driver writers to understand
>  than "any time you're touching rss_ctx you need to hold the rss_lock".
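
The lock-asserting lookup wrapper described in the quoted text could look roughly like the userspace model below. This is only a sketch under stated assumptions: a pthread mutex stands in for the proposed rss_lock, a small array stands in for the IDR, a counter stands in for a WARN_ON(), and every name (rss_lock_acquire(), rss_ctx_find(), rss_violation_count() and so on) is hypothetical rather than taken from the series.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RSS_CTX_MAX 8	/* hypothetical table size; the series uses an IDR */

struct rss_ctx {
	bool in_use;
	unsigned int hfunc;
};

static pthread_mutex_t rss_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread record of whether *this* thread holds rss_lock. */
static _Thread_local bool rss_lock_held;
static struct rss_ctx rss_table[RSS_CTX_MAX];
/* Counts lookups made without the lock; stands in for a WARN_ON(). */
static atomic_uint rss_lock_violations;

void rss_lock_acquire(void)
{
	pthread_mutex_lock(&rss_lock);
	rss_lock_held = true;
}

void rss_lock_release(void)
{
	rss_lock_held = false;
	pthread_mutex_unlock(&rss_lock);
}

/* Mark a slot as a live context.  Caller must hold rss_lock. */
bool rss_ctx_add(unsigned int id, unsigned int hfunc)
{
	if (!rss_lock_held || id >= RSS_CTX_MAX || rss_table[id].in_use)
		return false;
	rss_table[id].in_use = true;
	rss_table[id].hfunc = hfunc;
	return true;
}

/* Lookup wrapper: checks the caller's locking before touching the table. */
struct rss_ctx *rss_ctx_find(unsigned int id)
{
	if (!rss_lock_held) {
		atomic_fetch_add(&rss_lock_violations, 1);
		return NULL;
	}
	if (id >= RSS_CTX_MAX || !rss_table[id].in_use)
		return NULL;
	return &rss_table[id];
}

unsigned int rss_violation_count(void)
{
	return atomic_load(&rss_lock_violations);
}
```

A caller would bracket every access with rss_lock_acquire() and rss_lock_release(); a lookup made outside that bracket returns NULL and bumps the violation counter instead of handing out a pointer that a concurrent delete could free. The wrapper only validates the locking, it does not take the lock itself, which is the trade-off discussed in the quoted text.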

IMO the "MCPU has rebooted" case is a control path, taking rtnl seems
like the right thing to do. Does that happen often enough, or take so
long to recover, that we need to be concerned about locking down rtnl?
aRFS can't sleep IIUC so the mutex does not seem like a great match.

IOW I had the same reaction as Andrew first time I saw this mutex :(

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-13  2:06           ` Jakub Kicinski
@ 2023-04-13 21:52             ` Edward Cree
  2023-04-13 21:58               ` Andrew Lunn
  0 siblings, 1 reply; 24+ messages in thread
From: Edward Cree @ 2023-04-13 21:52 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Andrew Lunn, edward.cree, linux-net-drivers, davem, pabeni,
	edumazet, netdev, habetsm.xilinx, sudheer.mogilappagari

On 13/04/2023 03:06, Jakub Kicinski wrote:
> IMO the "MCPU has rebooted" case is a control path, taking rtnl seems
> like the right thing to do. Does that happen often enough, or take so
> long to recover, that we need to be concerned about locking down rtnl?

Normally we *do* hold RTNL across the reset handling path, and all the
 callers I can find take it.  But the existence of a more complicated
 condition guarding the ASSERT_RTNL() in EFX_ASSERT_RESET_SERIALISED()
 implies that there is, or at least was, a call site that doesn't;
 that makes me nervous about assuming it.

> aRFS can't sleep IIUC so the mutex does not seem like a great match.

sfc punts aRFS filter insertion into a workitem, because you can't do
 MCDI without sleeping.  And aRFS was just one example; there's lots
 of other sources of filters in the driver (PTP, sync_rx_mode (device
 UC/MC addresses), VLAN filtering (NETIF_F_HW_VLAN_CTAG_FILTER)...).
 Some of those filters can also have EFX_FILTER_FLAG_RX_RSS (though
 not custom contexts).

So while I *think* the filter insert code could carefully go
   if (spec->flags & EFX_FILTER_FLAG_RX_RSS && spec->rss_context)
       ASSERT_RTNL();
 it's kinda hairy.  And if this is a *normal* level of hair for
 drivers to have in this area, then the "but driver writers don't
 understand locking!" argument doesn't really favour one solution over
 the other.  After all, the driver will still have to hold *some* lock
 to access this stuff, whether it's rss_lock or RTNL.
(Idk, maybe sfc is just uniquely complex and messy.  It wouldn't be
 the first time.)
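
To make the conditional check above concrete, here is a minimal userspace sketch of the rule "only filters that steer to a custom RSS context need the lock". The flag value, the struct layout and the function names are placeholders invented for illustration, not the sfc definitions, and a boolean parameter stands in for "the caller holds RTNL (or rss_lock)".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag value for this sketch, not copied from the sfc headers. */
#define FILTER_FLAG_RX_RSS 0x1u

struct filter_spec {
	uint32_t flags;
	uint32_t rss_context;	/* 0 means the default context */
};

/* True if inserting this filter dereferences a custom RSS context, and
 * so would need whichever lock protects the context table.
 */
bool filter_needs_ctx_lock(const struct filter_spec *spec)
{
	return (spec->flags & FILTER_FLAG_RX_RSS) && spec->rss_context != 0;
}

/* Returns 0 on success, -1 if the locking rule was broken. */
int filter_insert(const struct filter_spec *spec, bool ctx_lock_held)
{
	if (filter_needs_ctx_lock(spec) && !ctx_lock_held)
		return -1;
	/* ... program the filter into hardware here ... */
	return 0;
}
```

With this shape, filters from sources such as PTP or sync_rx_mode (no custom context) pass through without the lock, and only the custom-context case is checked.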

> IOW I had the same reaction as Andrew first time I saw this mutex :(

Well I seem to be outvoted, so I'll have another crack at getting it
 to work with just RTNL (that's what I went for originally, actually,
 but one of the ASSERT_RTNL()s failed in test and I couldn't figure
 out why at the time.  Possibly trying to argue the case has helped me
 to understand more of the details!).

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-13 21:52             ` Edward Cree
@ 2023-04-13 21:58               ` Andrew Lunn
  2023-04-14 20:20                 ` Edward Cree
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Lunn @ 2023-04-13 21:58 UTC (permalink / raw)
  To: Edward Cree
  Cc: Jakub Kicinski, edward.cree, linux-net-drivers, davem, pabeni,
	edumazet, netdev, habetsm.xilinx, sudheer.mogilappagari

> (Idk, maybe sfc is just uniquely complex and messy.  It wouldn't be
>  the first time.)

Hi Ed

Have you looked at other drivers? It would be bad to design an API
around a messy driver. Maybe this is an opportunity to learn from
other drivers and come up with something cleaner for SFC?

      Andrew

* Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts
  2023-04-13 21:58               ` Andrew Lunn
@ 2023-04-14 20:20                 ` Edward Cree
  0 siblings, 0 replies; 24+ messages in thread
From: Edward Cree @ 2023-04-14 20:20 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jakub Kicinski, edward.cree, linux-net-drivers, davem, pabeni,
	edumazet, netdev, habetsm.xilinx, sudheer.mogilappagari

On 13/04/2023 22:58, Andrew Lunn wrote:
>> (Idk, maybe sfc is just uniquely complex and messy.  It wouldn't be
>>  the first time.)
> 
> Hi Ed
> 
> Have you looked at other drivers? It would be bad to design an API
> around a messy driver.

I have; there really aren't many that implement custom RSS contexts
 (just Marvell's mvpp2 and octeontx2, and Mellanox's mlx5).  The
 `rss_ctx_max_id` field is designed for those as they all have fixed-
 size arrays currently and idk whether that's a purely software limit
 or whether it reflects the hardware.
I couldn't find anything in any of them that looked like "restore
 stuff after a device reboot"; maybe it's just not something those
 devices expect to experience normally.

I don't know enough about mlx5 hw to really understand their filter
 code, but the rough equivalent of our efx_mcdi_filter_insert_locked()
 in that driver appears to be _mlx5_add_flow_rules(), which seems to
 be doing some kind of hand-over-hand locking.  And no sign (whether
 in comments or in asserts) of whether the function expects callers to
 hold RTNL.  Same goes for their functions operating on TIRs (whatever
 those are) which are called from all over (aRFS, tc, even kTLS!) in
 addition to the ethtool RSS/ntuple paths.

Anyway I'll cc maintainers of those drivers on v3 so they can chime
 in on the API design.  (Should've done that on v1 really, but I
 forgot.  Mea culpa.)

-ed

* Re: [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR
  2023-04-13  1:49   ` Jakub Kicinski
@ 2023-06-09 20:01     ` Edward Cree
  0 siblings, 0 replies; 24+ messages in thread
From: Edward Cree @ 2023-06-09 20:01 UTC (permalink / raw)
  To: Jakub Kicinski, edward.cree
  Cc: linux-net-drivers, davem, pabeni, edumazet, netdev,
	habetsm.xilinx, sudheer.mogilappagari

On 13/04/2023 02:49, Jakub Kicinski wrote:
> On Tue, 11 Apr 2023 19:26:11 +0100 edward.cree@amd.com wrote:
>>  	if (rxfh.rss_context)
>>  		ret = ops->set_rxfh_context(dev, indir, hkey, rxfh.hfunc,
>>  					    &rxfh.rss_context, delete);
>> @@ -1350,6 +1377,40 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
>>  		else if (rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE)
>>  			dev->priv_flags |= IFF_RXFH_CONFIGURED;
>>  	}
> 
> This is probably transient but I think we're potentially leaking @ctx
> in a goto out hiding inside the context here, and...
> 
>> +	/* Update rss_ctx tracking */
>> +	if (create) {
>> +		/* Ideally this should happen before calling the driver,
>> +		 * so that we can fail more cleanly; but we don't have the
>> +		 * context ID until the driver picks it, so we have to
>> +		 * wait until after.
>> +		 */
>> +		if (WARN_ON(idr_find(&dev->ethtool->rss_ctx, rxfh.rss_context)))
>> +			/* context ID reused, our tracking is screwed */
>> +			goto out;
> 
> here.

Wasn't entirely transient.  Fixed for v3.
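
For readers following along, the leak pattern under discussion (a context allocated before the driver call, then an error path that jumps out without freeing it) can be modelled in userspace as below. This is not the v3 fix, just a minimal sketch of one common way to structure such cleanup; the names rss_ctx_create(), tracked[] and live_allocs are hypothetical, and an array stands in for the IDR.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define CTX_ID_MAX 4	/* hypothetical table size */

struct rss_ctx {
	unsigned int id;
};

static struct rss_ctx *tracked[CTX_ID_MAX];	/* stand-in for the IDR */
static int live_allocs;				/* leak detector for the demo */

/* 'driver_id' models the context ID the driver callback hands back. */
int rss_ctx_create(unsigned int driver_id)
{
	struct rss_ctx *ctx;
	int ret;

	ctx = malloc(sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;
	live_allocs++;

	if (driver_id >= CTX_ID_MAX) {		/* models a driver failure */
		ret = -EINVAL;
		goto out_free;
	}
	if (tracked[driver_id]) {		/* ID reused: tracking is inconsistent */
		ret = -EBUSY;
		goto out_free;
	}

	ctx->id = driver_id;
	tracked[driver_id] = ctx;		/* ownership passes to the table */
	return 0;

out_free:
	/* Every failure after the allocation funnels through here. */
	free(ctx);
	live_allocs--;
	return ret;
}

int rss_live_allocs(void)
{
	return live_allocs;
}
```

The point of the single out_free label is that any error path added later between the allocation and the hand-over to the table has an obvious place to jump to, so the allocation count stays balanced.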

end of thread, other threads:[~2023-06-09 20:01 UTC | newest]

Thread overview: 24+ messages
2023-04-11 18:26 [RFC PATCH v2 net-next 0/7] ethtool: track custom RSS contexts in the core edward.cree
2023-04-11 18:26 ` [RFC PATCH v2 net-next 1/7] net: move ethtool-related netdev state into its own struct edward.cree
2023-04-11 20:37   ` Andrew Lunn
2023-04-13  1:36   ` Jakub Kicinski
2023-04-11 18:26 ` [RFC PATCH v2 net-next 2/7] net: ethtool: attach an IDR of custom RSS contexts to a netdevice edward.cree
2023-04-11 20:36   ` Andrew Lunn
2023-04-12 15:52     ` Edward Cree
2023-04-13  1:39   ` Jakub Kicinski
2023-04-11 18:26 ` [RFC PATCH v2 net-next 3/7] net: ethtool: record custom RSS contexts in the IDR edward.cree
2023-04-13  1:49   ` Jakub Kicinski
2023-06-09 20:01     ` Edward Cree
2023-04-11 18:26 ` [RFC PATCH v2 net-next 4/7] net: ethtool: let the core choose RSS context IDs edward.cree
2023-04-13  1:53   ` Jakub Kicinski
2023-04-11 18:26 ` [RFC PATCH v2 net-next 5/7] net: ethtool: add an extack parameter to new rxfh_context APIs edward.cree
2023-04-11 18:26 ` [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting RSS contexts edward.cree
2023-04-11 20:40   ` Andrew Lunn
2023-04-12 16:16     ` Edward Cree
2023-04-12 17:15       ` Andrew Lunn
2023-04-12 17:42         ` Edward Cree
2023-04-13  2:06           ` Jakub Kicinski
2023-04-13 21:52             ` Edward Cree
2023-04-13 21:58               ` Andrew Lunn
2023-04-14 20:20                 ` Edward Cree
2023-04-11 18:26 ` [RFC PATCH v2 net-next 7/7] sfc: use new rxfh_context API edward.cree
