* [PATCH net v2 0/3] net-sysfs: fix race conditions in the xps code
@ 2020-12-21 19:36 Antoine Tenart
  2020-12-21 19:36 ` [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num Antoine Tenart
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Antoine Tenart @ 2020-12-21 19:36 UTC (permalink / raw)
  To: davem, kuba, alexander.duyck; +Cc: Antoine Tenart, netdev, pabeni

Hello all,

This series fixes race conditions in the xps code, where out-of-bounds
accesses can occur when dev->num_tc is updated, triggering oopses. The
root cause is a locking issue. An explanation is given in each of the
commit logs.

Reviews of v1 suggested using xps_map_mutex to protect the maps and
their related parameters instead of the rtnl lock. We followed this
approach in v2, as it is a better compromise than taking the rtnl lock.

As a result, patch 1 turned out to be less straightforward, as some of
the locking logic in net/core/dev.c related to xps_map_mutex had to be
changed. Patches 2 and 3 are also larger in v2, as code had to be moved
from net/core/net-sysfs.c to net/core/dev.c to take xps_map_mutex
(though maintainability is improved).

Also, while working on v2 I stumbled upon another race condition. I
debugged it, and the fix turned out to be the same as for the race in
patch 1, so I updated that patch's commit log to describe both races.

Thanks!
Antoine

Antoine Tenart (3):
  net: fix race conditions in xps by locking the maps and dev->tc_num
  net: move the xps cpus retrieval out of net-sysfs
  net: move the xps rxqs retrieval out of net-sysfs

 include/linux/netdevice.h |   9 ++
 net/core/dev.c            | 186 +++++++++++++++++++++++++++++---------
 net/core/net-sysfs.c      |  89 ++++--------------
 3 files changed, 171 insertions(+), 113 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-21 19:36 [PATCH net v2 0/3] net-sysfs: fix race conditions in the xps code Antoine Tenart
@ 2020-12-21 19:36 ` Antoine Tenart
  2020-12-21 23:21   ` Alexander Duyck
  2020-12-21 19:36 ` [PATCH net v2 2/3] net: move the xps cpus retrieval out of net-sysfs Antoine Tenart
  2020-12-21 19:36 ` [PATCH net v2 3/3] net: move the xps rxqs " Antoine Tenart
  2 siblings, 1 reply; 16+ messages in thread
From: Antoine Tenart @ 2020-12-21 19:36 UTC (permalink / raw)
  To: davem, kuba, alexander.duyck; +Cc: Antoine Tenart, netdev, pabeni

Two race conditions can be triggered in xps, resulting in various oopses
and invalid memory accesses:

1. Calling netdev_set_num_tc while netif_set_xps_queue is running:

   - netdev_set_num_tc sets dev->num_tc.

   - netif_set_xps_queue uses dev->num_tc as one of the parameters to
     compute the size of new_dev_maps when allocating it. dev->num_tc is
     also used to access the map, and the compiler may generate code
     that reads this field multiple times in the function.

   If new_dev_maps is allocated using dev->num_tc and dev->num_tc is
   then set to a higher value through netdev_set_num_tc, later accesses
   to new_dev_maps in netif_set_xps_queue can touch memory outside of
   new_dev_maps, triggering an oops.

   One way of triggering this is to bring an iface up (with a driver
   that uses netdev_set_num_tc in its open path, such as bnx2x) while
   writing to xps_cpus or xps_rxqs from a concurrent thread. With the
   right timing, an oops is triggered.

2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

   2.1. netdev_set_num_tc starts by resetting the xps queues;
        dev->num_tc isn't updated yet.

   2.2. netif_set_xps_queue is called, setting up the maps with the
        *old* dev->num_tc.

   2.3. dev->num_tc is updated.

   2.4. Later accesses to the map lead to out-of-bounds accesses and
        oops.

   A similar issue exists with netdev_reset_tc.

   The fix can't simply be to tie the size of a map to the map itself,
   as invalid configurations could still occur. The reset-then-set
   logic in both netdev_set_num_tc and netdev_reset_tc must be
   protected by a lock.

Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive.
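For illustration, race 1 and its fix can be sketched in user space (all
names here are hypothetical stand-ins: a pthread mutex plays the role of
xps_map_mutex, plain ints stand in for dev->num_tc and the map):

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for dev->num_tc and an xps map;
 * map_mutex plays the role of the kernel's xps_map_mutex. */
static pthread_mutex_t map_mutex = PTHREAD_MUTEX_INITIALIZER;
static int num_tc = 1;
static int map_len;
static int *map;

/* Analogous to netif_set_xps_queue(): take ONE snapshot of num_tc and
 * do both the sizing and the filling under the lock, so a concurrent
 * set_num_tc() cannot change the value between the two uses. */
static void set_queue_map(int nr_ids)
{
    pthread_mutex_lock(&map_mutex);
    int tc = num_tc;            /* stable for the whole critical section */
    map_len = nr_ids * tc;
    map = realloc(map, (size_t)map_len * sizeof(*map));
    for (int i = 0; i < map_len; i++)
        map[i] = i;             /* every access stays inside map_len */
    pthread_mutex_unlock(&map_mutex);
}

/* Analogous to netdev_set_num_tc(): reset the map and update num_tc
 * as one atomic step with respect to set_queue_map(). */
static void set_num_tc(int tc)
{
    pthread_mutex_lock(&map_mutex);
    free(map);
    map = NULL;
    map_len = 0;
    num_tc = tc;
    pthread_mutex_unlock(&map_mutex);
}
```

The unfixed pattern would read num_tc once for the allocation and again,
unlocked, for the fill loop; a concurrent set_num_tc() landing between
those two reads is exactly race 1.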

This patch fixes those races by:

- Reworking netif_set_xps_queue by moving the xps_map_mutex lock up, so
  that dev->num_tc is accessed under the lock.

- Using xps_map_mutex in both netdev_set_num_tc and netdev_reset_tc for
  the reset and set logic:

  + As xps_map_mutex is now taken in the reset path, netif_reset_xps_queues
    had to be reworked to offer an unlocked version (as did
    netdev_unbind_all_sb_channels, which calls it).

  + cpus_read_lock is taken in the reset path as well, and must always
    be taken before xps_map_mutex. It too had to be moved out of the
    unlocked version.

  This is why the patch is a little bit longer, and moves
  netdev_unbind_sb_channel up in the file.
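The locked/unlocked split applied above follows a common kernel pattern;
a minimal user-space sketch of it (hypothetical names, a pthread mutex
instead of xps_map_mutex):

```c
#include <pthread.h>

static pthread_mutex_t xps_mutex = PTHREAD_MUTEX_INITIALIZER;
static int queues_configured;

/* Unlocked worker, double-underscore by kernel convention:
 * callers must already hold xps_mutex. */
static void __reset_queues(void)
{
    queues_configured = 0;
}

/* Locked wrapper keeping the original entry point's behaviour. */
static void reset_queues(void)
{
    pthread_mutex_lock(&xps_mutex);
    __reset_queues();
    pthread_mutex_unlock(&xps_mutex);
}

/* A compound operation can now hold the lock across the whole
 * reset-then-set sequence, instead of deadlocking by re-acquiring
 * the mutex inside reset_queues(). */
static void set_tc(int tc)
{
    pthread_mutex_lock(&xps_mutex);
    __reset_queues();
    queues_configured = tc;
    pthread_mutex_unlock(&xps_mutex);
}
```

This mirrors why __netif_reset_xps_queues() exists: netdev_set_num_tc
wants the reset and the update inside one critical section.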

Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 net/core/dev.c | 122 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 81 insertions(+), 41 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8fa739259041..effdb7fee9df 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2527,8 +2527,8 @@ static void clean_xps_maps(struct net_device *dev, const unsigned long *mask,
 	}
 }
 
-static void netif_reset_xps_queues(struct net_device *dev, u16 offset,
-				   u16 count)
+static void __netif_reset_xps_queues(struct net_device *dev, u16 offset,
+				     u16 count)
 {
 	const unsigned long *possible_mask = NULL;
 	struct xps_dev_maps *dev_maps;
@@ -2537,9 +2537,6 @@ static void netif_reset_xps_queues(struct net_device *dev, u16 offset,
 	if (!static_key_false(&xps_needed))
 		return;
 
-	cpus_read_lock();
-	mutex_lock(&xps_map_mutex);
-
 	if (static_key_false(&xps_rxqs_needed)) {
 		dev_maps = xmap_dereference(dev->xps_rxqs_map);
 		if (dev_maps) {
@@ -2551,15 +2548,23 @@ static void netif_reset_xps_queues(struct net_device *dev, u16 offset,
 
 	dev_maps = xmap_dereference(dev->xps_cpus_map);
 	if (!dev_maps)
-		goto out_no_maps;
+		return;
 
 	if (num_possible_cpus() > 1)
 		possible_mask = cpumask_bits(cpu_possible_mask);
 	nr_ids = nr_cpu_ids;
 	clean_xps_maps(dev, possible_mask, dev_maps, nr_ids, offset, count,
 		       false);
+}
+
+static void netif_reset_xps_queues(struct net_device *dev, u16 offset,
+				   u16 count)
+{
+	cpus_read_lock();
+	mutex_lock(&xps_map_mutex);
+
+	__netif_reset_xps_queues(dev, offset, count);
 
-out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 	cpus_read_unlock();
 }
@@ -2615,27 +2620,32 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 {
 	const unsigned long *online_mask = NULL, *possible_mask = NULL;
 	struct xps_dev_maps *dev_maps, *new_dev_maps = NULL;
-	int i, j, tci, numa_node_id = -2;
+	int i, j, tci, numa_node_id = -2, ret = 0;
 	int maps_sz, num_tc = 1, tc = 0;
 	struct xps_map *map, *new_map;
 	bool active = false;
 	unsigned int nr_ids;
 
+	mutex_lock(&xps_map_mutex);
+
 	if (dev->num_tc) {
 		/* Do not allow XPS on subordinate device directly */
 		num_tc = dev->num_tc;
-		if (num_tc < 0)
-			return -EINVAL;
+		if (num_tc < 0) {
+			ret = -EINVAL;
+			goto unlock;
+		}
 
 		/* If queue belongs to subordinate dev use its map */
 		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
 
 		tc = netdev_txq_to_tc(dev, index);
-		if (tc < 0)
-			return -EINVAL;
+		if (tc < 0) {
+			ret = -EINVAL;
+			goto unlock;
+		}
 	}
 
-	mutex_lock(&xps_map_mutex);
 	if (is_rxqs_map) {
 		maps_sz = XPS_RXQ_DEV_MAPS_SIZE(num_tc, dev->num_rx_queues);
 		dev_maps = xmap_dereference(dev->xps_rxqs_map);
@@ -2659,8 +2669,8 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 		if (!new_dev_maps)
 			new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
 		if (!new_dev_maps) {
-			mutex_unlock(&xps_map_mutex);
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto unlock;
 		}
 
 		tci = j * num_tc + tc;
@@ -2765,7 +2775,7 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 	}
 
 	if (!dev_maps)
-		goto out_no_maps;
+		goto unlock;
 
 	/* removes tx-queue from unused CPUs/rx-queues */
 	for (j = -1; j = netif_attrmask_next(j, possible_mask, nr_ids),
@@ -2783,10 +2793,10 @@ int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 	if (!active)
 		reset_xps_maps(dev, dev_maps, is_rxqs_map);
 
-out_no_maps:
+unlock:
 	mutex_unlock(&xps_map_mutex);
 
-	return 0;
+	return ret;
 error:
 	/* remove any maps that we added */
 	for (j = -1; j = netif_attrmask_next(j, possible_mask, nr_ids),
@@ -2822,28 +2832,68 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 EXPORT_SYMBOL(netif_set_xps_queue);
 
 #endif
-static void netdev_unbind_all_sb_channels(struct net_device *dev)
+
+static void __netdev_unbind_sb_channel(struct net_device *dev,
+				       struct net_device *sb_dev)
+{
+	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
+
+#ifdef CONFIG_XPS
+	__netif_reset_xps_queues(sb_dev, 0, dev->num_tx_queues);
+#endif
+
+	memset(sb_dev->tc_to_txq, 0, sizeof(sb_dev->tc_to_txq));
+	memset(sb_dev->prio_tc_map, 0, sizeof(sb_dev->prio_tc_map));
+
+	while (txq-- != &dev->_tx[0]) {
+		if (txq->sb_dev == sb_dev)
+			txq->sb_dev = NULL;
+	}
+}
+
+void netdev_unbind_sb_channel(struct net_device *dev,
+			      struct net_device *sb_dev)
+{
+	cpus_read_lock();
+	mutex_lock(&xps_map_mutex);
+
+	__netdev_unbind_sb_channel(dev, sb_dev);
+
+	mutex_unlock(&xps_map_mutex);
+	cpus_read_unlock();
+}
+EXPORT_SYMBOL(netdev_unbind_sb_channel);
+
+static void __netdev_unbind_all_sb_channels(struct net_device *dev)
 {
 	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
 
 	/* Unbind any subordinate channels */
 	while (txq-- != &dev->_tx[0]) {
 		if (txq->sb_dev)
-			netdev_unbind_sb_channel(dev, txq->sb_dev);
+			__netdev_unbind_sb_channel(dev, txq->sb_dev);
 	}
 }
 
 void netdev_reset_tc(struct net_device *dev)
 {
 #ifdef CONFIG_XPS
-	netif_reset_xps_queues_gt(dev, 0);
+	cpus_read_lock();
+	mutex_lock(&xps_map_mutex);
+
+	__netif_reset_xps_queues(dev, 0, dev->num_tx_queues);
 #endif
-	netdev_unbind_all_sb_channels(dev);
+	__netdev_unbind_all_sb_channels(dev);
 
 	/* Reset TC configuration of device */
 	dev->num_tc = 0;
 	memset(dev->tc_to_txq, 0, sizeof(dev->tc_to_txq));
 	memset(dev->prio_tc_map, 0, sizeof(dev->prio_tc_map));
+
+#ifdef CONFIG_XPS
+	mutex_unlock(&xps_map_mutex);
+	cpus_read_unlock();
+#endif
 }
 EXPORT_SYMBOL(netdev_reset_tc);
 
@@ -2867,32 +2917,22 @@ int netdev_set_num_tc(struct net_device *dev, u8 num_tc)
 		return -EINVAL;
 
 #ifdef CONFIG_XPS
-	netif_reset_xps_queues_gt(dev, 0);
+	cpus_read_lock();
+	mutex_lock(&xps_map_mutex);
+
+	__netif_reset_xps_queues(dev, 0, dev->num_tx_queues);
 #endif
-	netdev_unbind_all_sb_channels(dev);
+	__netdev_unbind_all_sb_channels(dev);
 
 	dev->num_tc = num_tc;
-	return 0;
-}
-EXPORT_SYMBOL(netdev_set_num_tc);
-
-void netdev_unbind_sb_channel(struct net_device *dev,
-			      struct net_device *sb_dev)
-{
-	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
 
 #ifdef CONFIG_XPS
-	netif_reset_xps_queues_gt(sb_dev, 0);
+	mutex_unlock(&xps_map_mutex);
+	cpus_read_unlock();
 #endif
-	memset(sb_dev->tc_to_txq, 0, sizeof(sb_dev->tc_to_txq));
-	memset(sb_dev->prio_tc_map, 0, sizeof(sb_dev->prio_tc_map));
-
-	while (txq-- != &dev->_tx[0]) {
-		if (txq->sb_dev == sb_dev)
-			txq->sb_dev = NULL;
-	}
+	return 0;
 }
-EXPORT_SYMBOL(netdev_unbind_sb_channel);
+EXPORT_SYMBOL(netdev_set_num_tc);
 
 int netdev_bind_sb_channel_queue(struct net_device *dev,
 				 struct net_device *sb_dev,
-- 
2.29.2



* [PATCH net v2 2/3] net: move the xps cpus retrieval out of net-sysfs
  2020-12-21 19:36 [PATCH net v2 0/3] net-sysfs: fix race conditions in the xps code Antoine Tenart
  2020-12-21 19:36 ` [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num Antoine Tenart
@ 2020-12-21 19:36 ` Antoine Tenart
  2020-12-21 22:33   ` Alexander Duyck
  2020-12-21 19:36 ` [PATCH net v2 3/3] net: move the xps rxqs " Antoine Tenart
  2 siblings, 1 reply; 16+ messages in thread
From: Antoine Tenart @ 2020-12-21 19:36 UTC (permalink / raw)
  To: davem, kuba, alexander.duyck; +Cc: Antoine Tenart, netdev, pabeni

Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
protected by xps_map_mutex, to avoid possible race conditions when
dev->num_tc is updated while the map is accessed. This patch moves the
logic accessing dev->xps_cpus_map and dev->num_tc to net/core/dev.c,
where xps_map_mutex is defined and used.
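As a sketch of the lookup being centralized, the per-queue mask
retrieval can be modeled in user space (hypothetical data layout, not
the kernel's xps structures):

```c
#define NR_IDS  4   /* stand-in for nr_cpu_ids */
#define NUM_TC  2
#define TC      0

/* Hypothetical flattened map mirroring dev_maps->attr_map[j * num_tc + tc]:
 * each entry lists the tx queues mapped to that (id, tc) pair; -1 ends it. */
static const int attr_map[NR_IDS * NUM_TC][3] = {
    [0 * NUM_TC + TC] = {  1, -1, -1 },  /* cpu 0 -> queue 1 */
    [1 * NUM_TC + TC] = {  0,  1, -1 },  /* cpu 1 -> queues 0 and 1 */
    [2 * NUM_TC + TC] = {  0, -1, -1 },  /* cpu 2 -> queue 0 */
    [3 * NUM_TC + TC] = { -1, -1, -1 },  /* cpu 3 -> none */
};

/* Return a bitmask with bit j set for every id whose map holds `index`,
 * mirroring the loop moved into netif_show_xps_queue(). */
static unsigned long show_queue(int index)
{
    unsigned long mask = 0;

    for (int j = 0; j < NR_IDS; j++) {
        const int *map = attr_map[j * NUM_TC + TC];

        for (int i = 0; i < 3 && map[i] >= 0; i++) {
            if (map[i] == index) {
                mask |= 1UL << j;
                break;
            }
        }
    }
    return mask;
}
```

In the kernel the same walk must hold xps_map_mutex so that num_tc and
the map it indexes cannot change mid-iteration.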

Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 include/linux/netdevice.h |  8 ++++++
 net/core/dev.c            | 59 +++++++++++++++++++++++++++++++++++++++
 net/core/net-sysfs.c      | 54 ++++++++---------------------------
 3 files changed, 79 insertions(+), 42 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 259be67644e3..bfd6cfa3ea90 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3671,6 +3671,8 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 			u16 index);
 int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 			  u16 index, bool is_rxqs_map);
+int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
+			 u16 index);
 
 /**
  *	netif_attr_test_mask - Test a CPU or Rx queue set in a mask
@@ -3769,6 +3771,12 @@ static inline int __netif_set_xps_queue(struct net_device *dev,
 {
 	return 0;
 }
+
+static inline int netif_show_xps_queue(struct net_device *dev,
+				       unsigned long **mask, u16 index)
+{
+	return 0;
+}
 #endif
 
 /**
diff --git a/net/core/dev.c b/net/core/dev.c
index effdb7fee9df..a0257da4160a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2831,6 +2831,65 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 }
 EXPORT_SYMBOL(netif_set_xps_queue);
 
+int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
+			 u16 index)
+{
+	const unsigned long *possible_mask = NULL;
+	int j, num_tc = 1, tc = 0, ret = 0;
+	struct xps_dev_maps *dev_maps;
+	unsigned int nr_ids;
+
+	rcu_read_lock();
+	mutex_lock(&xps_map_mutex);
+
+	if (dev->num_tc) {
+		num_tc = dev->num_tc;
+		if (num_tc < 0) {
+			ret = -EINVAL;
+			goto out_no_map;
+		}
+
+		/* If queue belongs to subordinate dev use its map */
+		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
+		tc = netdev_txq_to_tc(dev, index);
+		if (tc < 0) {
+			ret = -EINVAL;
+			goto out_no_map;
+		}
+	}
+
+	dev_maps = rcu_dereference(dev->xps_cpus_map);
+	if (!dev_maps)
+		goto out_no_map;
+	nr_ids = nr_cpu_ids;
+	if (num_possible_cpus() > 1)
+		possible_mask = cpumask_bits(cpu_possible_mask);
+
+	for (j = -1; j = netif_attrmask_next(j, possible_mask, nr_ids),
+	     j < nr_ids;) {
+		int i, tci = j * num_tc + tc;
+		struct xps_map *map;
+
+		map = rcu_dereference(dev_maps->attr_map[tci]);
+		if (!map)
+			continue;
+
+		for (i = map->len; i--;) {
+			if (map->queues[i] == index) {
+				set_bit(j, *mask);
+				break;
+			}
+		}
+	}
+
+out_no_map:
+	mutex_unlock(&xps_map_mutex);
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(netif_show_xps_queue);
 #endif
 
 static void __netdev_unbind_sb_channel(struct net_device *dev,
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 999b70c59761..29ee69b67972 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1314,60 +1314,30 @@ static const struct attribute_group dql_group = {
 #endif /* CONFIG_BQL */
 
 #ifdef CONFIG_XPS
-static ssize_t xps_cpus_show(struct netdev_queue *queue,
-			     char *buf)
+static ssize_t xps_cpus_show(struct netdev_queue *queue, char *buf)
 {
 	struct net_device *dev = queue->dev;
-	int cpu, len, num_tc = 1, tc = 0;
-	struct xps_dev_maps *dev_maps;
-	cpumask_var_t mask;
-	unsigned long index;
+	unsigned long *mask, index;
+	int len, ret;
 
 	if (!netif_is_multiqueue(dev))
 		return -ENOENT;
 
 	index = get_netdev_queue_index(queue);
 
-	if (dev->num_tc) {
-		/* Do not allow XPS on subordinate device directly */
-		num_tc = dev->num_tc;
-		if (num_tc < 0)
-			return -EINVAL;
-
-		/* If queue belongs to subordinate dev use its map */
-		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
-
-		tc = netdev_txq_to_tc(dev, index);
-		if (tc < 0)
-			return -EINVAL;
-	}
-
-	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
+	mask = bitmap_zalloc(nr_cpu_ids, GFP_KERNEL);
+	if (!mask)
 		return -ENOMEM;
 
-	rcu_read_lock();
-	dev_maps = rcu_dereference(dev->xps_cpus_map);
-	if (dev_maps) {
-		for_each_possible_cpu(cpu) {
-			int i, tci = cpu * num_tc + tc;
-			struct xps_map *map;
-
-			map = rcu_dereference(dev_maps->attr_map[tci]);
-			if (!map)
-				continue;
-
-			for (i = map->len; i--;) {
-				if (map->queues[i] == index) {
-					cpumask_set_cpu(cpu, mask);
-					break;
-				}
-			}
-		}
+	ret = netif_show_xps_queue(dev, &mask, index);
+	if (ret) {
+		bitmap_free(mask);
+		return ret;
 	}
-	rcu_read_unlock();
 
-	len = snprintf(buf, PAGE_SIZE, "%*pb\n", cpumask_pr_args(mask));
-	free_cpumask_var(mask);
+	len = bitmap_print_to_pagebuf(false, buf, mask, nr_cpu_ids);
+	bitmap_free(mask);
+
 	return len < PAGE_SIZE ? len : -EINVAL;
 }
 
-- 
2.29.2



* [PATCH net v2 3/3] net: move the xps rxqs retrieval out of net-sysfs
  2020-12-21 19:36 [PATCH net v2 0/3] net-sysfs: fix race conditions in the xps code Antoine Tenart
  2020-12-21 19:36 ` [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num Antoine Tenart
  2020-12-21 19:36 ` [PATCH net v2 2/3] net: move the xps cpus retrieval out of net-sysfs Antoine Tenart
@ 2020-12-21 19:36 ` Antoine Tenart
  2 siblings, 0 replies; 16+ messages in thread
From: Antoine Tenart @ 2020-12-21 19:36 UTC (permalink / raw)
  To: davem, kuba, alexander.duyck; +Cc: Antoine Tenart, netdev, pabeni

Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
protected by xps_map_mutex, to avoid possible race conditions when
dev->num_tc is updated while the map is accessed. Make use of the newly
available netif_show_xps_queue helper, which does just that.

This also helps keep xps_cpus_show and xps_rxqs_show in sync, as their
logic is the same (mirroring __netif_set_xps_queue, the function that
allocates and sets up the maps).

Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 include/linux/netdevice.h |  5 +++--
 net/core/dev.c            | 15 ++++++++++-----
 net/core/net-sysfs.c      | 37 ++++++-------------------------------
 3 files changed, 19 insertions(+), 38 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index bfd6cfa3ea90..5c3e16464c3f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3672,7 +3672,7 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
 			  u16 index, bool is_rxqs_map);
 int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
-			 u16 index);
+			 u16 index, bool is_rxqs_map);
 
 /**
  *	netif_attr_test_mask - Test a CPU or Rx queue set in a mask
@@ -3773,7 +3773,8 @@ static inline int __netif_set_xps_queue(struct net_device *dev,
 }
 
 static inline int netif_show_xps_queue(struct net_device *dev,
-				       unsigned long **mask, u16 index)
+				       unsigned long **mask, u16 index,
+				       bool is_rxqs_map)
 {
 	return 0;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index a0257da4160a..e5cc2939e4d9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2832,7 +2832,7 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 EXPORT_SYMBOL(netif_set_xps_queue);
 
 int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
-			 u16 index)
+			 u16 index, bool is_rxqs_map)
 {
 	const unsigned long *possible_mask = NULL;
 	int j, num_tc = 1, tc = 0, ret = 0;
@@ -2859,12 +2859,17 @@ int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
 		}
 	}
 
-	dev_maps = rcu_dereference(dev->xps_cpus_map);
+	if (is_rxqs_map) {
+		dev_maps = rcu_dereference(dev->xps_rxqs_map);
+		nr_ids = dev->num_rx_queues;
+	} else {
+		dev_maps = rcu_dereference(dev->xps_cpus_map);
+		nr_ids = nr_cpu_ids;
+		if (num_possible_cpus() > 1)
+			possible_mask = cpumask_bits(cpu_possible_mask);
+	}
 	if (!dev_maps)
 		goto out_no_map;
-	nr_ids = nr_cpu_ids;
-	if (num_possible_cpus() > 1)
-		possible_mask = cpumask_bits(cpu_possible_mask);
 
 	for (j = -1; j = netif_attrmask_next(j, possible_mask, nr_ids),
 	     j < nr_ids;) {
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 29ee69b67972..4f58b38dfc7d 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1329,7 +1329,7 @@ static ssize_t xps_cpus_show(struct netdev_queue *queue, char *buf)
 	if (!mask)
 		return -ENOMEM;
 
-	ret = netif_show_xps_queue(dev, &mask, index);
+	ret = netif_show_xps_queue(dev, &mask, index, false);
 	if (ret) {
 		bitmap_free(mask);
 		return ret;
@@ -1379,45 +1379,20 @@ static struct netdev_queue_attribute xps_cpus_attribute __ro_after_init
 static ssize_t xps_rxqs_show(struct netdev_queue *queue, char *buf)
 {
 	struct net_device *dev = queue->dev;
-	struct xps_dev_maps *dev_maps;
 	unsigned long *mask, index;
-	int j, len, num_tc = 1, tc = 0;
+	int len, ret;
 
 	index = get_netdev_queue_index(queue);
 
-	if (dev->num_tc) {
-		num_tc = dev->num_tc;
-		tc = netdev_txq_to_tc(dev, index);
-		if (tc < 0)
-			return -EINVAL;
-	}
 	mask = bitmap_zalloc(dev->num_rx_queues, GFP_KERNEL);
 	if (!mask)
 		return -ENOMEM;
 
-	rcu_read_lock();
-	dev_maps = rcu_dereference(dev->xps_rxqs_map);
-	if (!dev_maps)
-		goto out_no_maps;
-
-	for (j = -1; j = netif_attrmask_next(j, NULL, dev->num_rx_queues),
-	     j < dev->num_rx_queues;) {
-		int i, tci = j * num_tc + tc;
-		struct xps_map *map;
-
-		map = rcu_dereference(dev_maps->attr_map[tci]);
-		if (!map)
-			continue;
-
-		for (i = map->len; i--;) {
-			if (map->queues[i] == index) {
-				set_bit(j, mask);
-				break;
-			}
-		}
+	ret = netif_show_xps_queue(dev, &mask, index, true);
+	if (ret) {
+		bitmap_free(mask);
+		return ret;
 	}
-out_no_maps:
-	rcu_read_unlock();
 
 	len = bitmap_print_to_pagebuf(false, buf, mask, dev->num_rx_queues);
 	bitmap_free(mask);
-- 
2.29.2



* Re: [PATCH net v2 2/3] net: move the xps cpus retrieval out of net-sysfs
  2020-12-21 19:36 ` [PATCH net v2 2/3] net: move the xps cpus retrieval out of net-sysfs Antoine Tenart
@ 2020-12-21 22:33   ` Alexander Duyck
  2020-12-22  9:10     ` Antoine Tenart
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2020-12-21 22:33 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: David Miller, Jakub Kicinski, Netdev, Paolo Abeni

On Mon, Dec 21, 2020 at 11:36 AM Antoine Tenart <atenart@kernel.org> wrote:
>
> Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
> protected by xps_map_mutex, to avoid possible race conditions when
> dev->num_tc is updated while the map is accessed. This patch moves the
> logic accessing dev->xps_cpus_map and dev->num_tc to net/core/dev.c,
> where xps_map_mutex is defined and used.
>
> Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
> Signed-off-by: Antoine Tenart <atenart@kernel.org>

It is a bit of a shame that we have to take the mutex just to display
the table. But we end up needing it if we are going to use
xps_map_mutex to protect ourselves from num_tc being updated while we
are reading it.

One thing I might change is to move this patch earlier in the set; I
think it would make the series cleaner to read, since the other
functions are being moved to this pattern as well.

> ---
>  include/linux/netdevice.h |  8 ++++++
>  net/core/dev.c            | 59 +++++++++++++++++++++++++++++++++++++++
>  net/core/net-sysfs.c      | 54 ++++++++---------------------------
>  3 files changed, 79 insertions(+), 42 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 259be67644e3..bfd6cfa3ea90 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3671,6 +3671,8 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
>                         u16 index);
>  int __netif_set_xps_queue(struct net_device *dev, const unsigned long *mask,
>                           u16 index, bool is_rxqs_map);
> +int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
> +                        u16 index);
>
>  /**
>   *     netif_attr_test_mask - Test a CPU or Rx queue set in a mask
> @@ -3769,6 +3771,12 @@ static inline int __netif_set_xps_queue(struct net_device *dev,
>  {
>         return 0;
>  }
> +
> +static inline int netif_show_xps_queue(struct net_device *dev,
> +                                      unsigned long **mask, u16 index)
> +{
> +       return 0;
> +}
>  #endif
>
>  /**
> diff --git a/net/core/dev.c b/net/core/dev.c
> index effdb7fee9df..a0257da4160a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2831,6 +2831,65 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
>  }
>  EXPORT_SYMBOL(netif_set_xps_queue);
>
> +int netif_show_xps_queue(struct net_device *dev, unsigned long **mask,
> +                        u16 index)
> +{
> +       const unsigned long *possible_mask = NULL;
> +       int j, num_tc = 1, tc = 0, ret = 0;
> +       struct xps_dev_maps *dev_maps;
> +       unsigned int nr_ids;
> +
> +       rcu_read_lock();
> +       mutex_lock(&xps_map_mutex);
> +

So you only need to call mutex_lock here. The rcu_read_lock becomes redundant.

> +       if (dev->num_tc) {
> +               num_tc = dev->num_tc;
> +               if (num_tc < 0) {
> +                       ret = -EINVAL;
> +                       goto out_no_map;
> +               }
> +
> +               /* If queue belongs to subordinate dev use its map */
> +               dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
> +
> +               tc = netdev_txq_to_tc(dev, index);
> +               if (tc < 0) {
> +                       ret = -EINVAL;
> +                       goto out_no_map;
> +               }
> +       }
> +
> +       dev_maps = rcu_dereference(dev->xps_cpus_map);
> +       if (!dev_maps)
> +               goto out_no_map;
> +       nr_ids = nr_cpu_ids;
> +       if (num_possible_cpus() > 1)
> +               possible_mask = cpumask_bits(cpu_possible_mask);
> +
> +       for (j = -1; j = netif_attrmask_next(j, possible_mask, nr_ids),
> +            j < nr_ids;) {
> +               int i, tci = j * num_tc + tc;
> +               struct xps_map *map;
> +
> +               map = rcu_dereference(dev_maps->attr_map[tci]);

For this you can use either rcu_dereference_protected, or you can use
xmap_dereference.

> +               if (!map)
> +                       continue;
> +
> +               for (i = map->len; i--;) {
> +                       if (map->queues[i] == index) {
> +                               set_bit(j, *mask);
> +                               break;
> +                       }
> +               }
> +       }
> +
> +out_no_map:
> +       mutex_unlock(&xps_map_mutex);
> +       rcu_read_unlock();
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL(netif_show_xps_queue);
>  #endif
>
>  static void __netdev_unbind_sb_channel(struct net_device *dev,
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 999b70c59761..29ee69b67972 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -1314,60 +1314,30 @@ static const struct attribute_group dql_group = {
>  #endif /* CONFIG_BQL */
>
>  #ifdef CONFIG_XPS
> -static ssize_t xps_cpus_show(struct netdev_queue *queue,
> -                            char *buf)
> +static ssize_t xps_cpus_show(struct netdev_queue *queue, char *buf)
>  {
>         struct net_device *dev = queue->dev;
> -       int cpu, len, num_tc = 1, tc = 0;
> -       struct xps_dev_maps *dev_maps;
> -       cpumask_var_t mask;
> -       unsigned long index;
> +       unsigned long *mask, index;
> +       int len, ret;
>
>         if (!netif_is_multiqueue(dev))
>                 return -ENOENT;
>
>         index = get_netdev_queue_index(queue);
>
> -       if (dev->num_tc) {
> -               /* Do not allow XPS on subordinate device directly */
> -               num_tc = dev->num_tc;
> -               if (num_tc < 0)
> -                       return -EINVAL;
> -
> -               /* If queue belongs to subordinate dev use its map */
> -               dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
> -
> -               tc = netdev_txq_to_tc(dev, index);
> -               if (tc < 0)
> -                       return -EINVAL;
> -       }
> -
> -       if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
> +       mask = bitmap_zalloc(nr_cpu_ids, GFP_KERNEL);
> +       if (!mask)
>                 return -ENOMEM;
>
> -       rcu_read_lock();
> -       dev_maps = rcu_dereference(dev->xps_cpus_map);
> -       if (dev_maps) {
> -               for_each_possible_cpu(cpu) {
> -                       int i, tci = cpu * num_tc + tc;
> -                       struct xps_map *map;
> -
> -                       map = rcu_dereference(dev_maps->attr_map[tci]);
> -                       if (!map)
> -                               continue;
> -
> -                       for (i = map->len; i--;) {
> -                               if (map->queues[i] == index) {
> -                                       cpumask_set_cpu(cpu, mask);
> -                                       break;
> -                               }
> -                       }
> -               }
> +       ret = netif_show_xps_queue(dev, &mask, index);
> +       if (ret) {
> +               bitmap_free(mask);
> +               return ret;
>         }
> -       rcu_read_unlock();
>
> -       len = snprintf(buf, PAGE_SIZE, "%*pb\n", cpumask_pr_args(mask));
> -       free_cpumask_var(mask);
> +       len = bitmap_print_to_pagebuf(false, buf, mask, nr_cpu_ids);
> +       bitmap_free(mask);
> +
>         return len < PAGE_SIZE ? len : -EINVAL;
>  }
>
> --
> 2.29.2
>


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-21 19:36 ` [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num Antoine Tenart
@ 2020-12-21 23:21   ` Alexander Duyck
  2020-12-22  9:21     ` Antoine Tenart
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2020-12-21 23:21 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: David Miller, Jakub Kicinski, Netdev, Paolo Abeni

On Mon, Dec 21, 2020 at 11:36 AM Antoine Tenart <atenart@kernel.org> wrote:
>
> Two race conditions can be triggered in xps, resulting in various oopses
> and invalid memory accesses:
>
> 1. Calling netdev_set_num_tc while netif_set_xps_queue is running:
>
>    - netdev_set_num_tc sets dev->num_tc.
>
>    - netif_set_xps_queue uses dev->num_tc as one of the parameters to
>      compute the size of new_dev_maps when allocating it. dev->num_tc is
>      also used to access the map, and the compiler may generate code
>      that reads this field multiple times in the function.
>
>    If new_dev_maps is allocated using dev->num_tc and dev->num_tc is
>    then set to a higher value through netdev_set_num_tc, later accesses
>    to new_dev_maps in netif_set_xps_queue can touch memory outside of
>    new_dev_maps, triggering an oops.
>
>    One way of triggering this is to set an iface up (for which the
>    driver uses netdev_set_num_tc in the open path, such as bnx2x)
>    while writing to xps_cpus or xps_rxqs from a concurrent thread.
>    With the right timing an oops is triggered.
>
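[Editor's note: the double-read hazard in race 1 can be sketched in plain user-space C. This is an illustration with made-up names (fake_dev, walk_overshoot), not the kernel code: the map is sized from one read of num_tc and walked using a second read, so an update landing between the two reads pushes the walk past the allocation.]

```c
#include <stdlib.h>

/* Hypothetical stand-in for the netdev; only the racy field matters. */
struct fake_dev { int num_tc; };

/* Size a map from one read of num_tc, then walk it using a second
 * read. 'racer' simulates a concurrent netdev_set_num_tc() landing
 * between the two reads. Returns how many slots past the end of the
 * allocation the walk would touch (0 means no overflow). */
static int walk_overshoot(struct fake_dev *dev,
                          void (*racer)(struct fake_dev *))
{
    int sized_for = dev->num_tc;             /* first read: sizes the map */
    int *map = calloc(sized_for, sizeof(*map));

    racer(dev);                              /* concurrent num_tc update */

    int overshoot = 0;
    for (int tc = 0; tc < dev->num_tc; tc++) /* second read: bounds walk */
        if (tc >= sized_for)
            overshoot++;                     /* would be an OOB access */

    free(map);
    return overshoot;
}
```

Here the out-of-bounds accesses are only counted rather than performed; in netif_set_xps_queue the equivalent walk dereferences the map, hence the oops.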
> 2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
>
>    2.1. netdev_set_num_tc starts by resetting the xps queues;
>         dev->num_tc isn't updated yet.
>
>    2.2. netif_set_xps_queue is called, setting up the maps with the
>         *old* dev->num_tc.
>
>    2.3. dev->num_tc is updated.
>
>    2.4. Later accesses to the map lead to out-of-bounds accesses and
>         oops.
>
>    A similar issue can be found with netdev_reset_tc.
>
>    The fix can't simply be to tie the size of the maps to them, as an
>    invalid configuration could still occur. The reset-then-set logic
>    in both netdev_set_num_tc and netdev_reset_tc must be protected by
>    a lock.
>
> Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
> and netdev_reset_tc should be mutually exclusive.
>
> This patch fixes those races by:
>
> - Reworking netif_set_xps_queue by moving the xps_map_mutex up so the
>   access of dev->num_tc is done under the lock.
>
> - Using xps_map_mutex in both netdev_set_num_tc and netdev_reset_tc for
>   the reset and set logic:
>
>   + As xps_map_mutex was taken in the reset path, netif_reset_xps_queues
>     had to be reworked to offer an unlocked version (as well as
>     netdev_unbind_all_sb_channels which calls it).
>
>   + cpus_read_lock was taken in the reset path as well, and is always
>     taken before xps_map_mutex. It had to be moved out of the unlocked
>     version as well.
>
>   This is why the patch is a little bit longer, and moves
>   netdev_unbind_sb_channel up in the file.
>
> Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
> Signed-off-by: Antoine Tenart <atenart@kernel.org>
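[Editor's note: a minimal user-space model of the serialization the patch describes, assuming hypothetical names (fake_set_num_tc, fake_set_xps_queue) rather than the actual kernel functions: the reset-then-set sequence and the size-then-fill sequence each run entirely under one mutex, and num_tc is read only once inside the critical section.]

```c
#include <pthread.h>
#include <stdlib.h>

/* Illustrative stand-in for the netdev state guarded by xps_map_mutex. */
struct fake_dev {
    pthread_mutex_t xps_map_mutex;
    int num_tc;
    int *map;       /* one slot per traffic class */
    int map_len;
};

static void fake_set_num_tc(struct fake_dev *dev, int num_tc)
{
    pthread_mutex_lock(&dev->xps_map_mutex);
    free(dev->map);          /* reset path: drop the stale map... */
    dev->map = NULL;
    dev->map_len = 0;
    dev->num_tc = num_tc;    /* ...then publish the new value */
    pthread_mutex_unlock(&dev->xps_map_mutex);
}

static void fake_set_xps_queue(struct fake_dev *dev)
{
    pthread_mutex_lock(&dev->xps_map_mutex);
    int tcs = dev->num_tc;   /* single read, under the same lock */
    int *map = calloc(tcs, sizeof(*map));
    for (int tc = 0; tc < tcs; tc++)   /* bounded by the same snapshot */
        map[tc] = tc;
    free(dev->map);
    dev->map = map;
    dev->map_len = tcs;
    pthread_mutex_unlock(&dev->xps_map_mutex);
}
```

With both paths serialized on the one mutex, the value used to size the map is always the value that bounds the walk, which is exactly what both races above violate.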

Looking over this patch it seems kind of obvious that extending the
xps_map_mutex is making things far more complex than they need to be.

Applying the rtnl_mutex would probably be much simpler. Although as I
think you have already discovered we need to apply it to the store,
and show for this interface. In addition we probably need to perform
similar locking around traffic_class_show in order to prevent it from
generating a similar error.


* Re: [PATCH net v2 2/3] net: move the xps cpus retrieval out of net-sysfs
  2020-12-21 22:33   ` Alexander Duyck
@ 2020-12-22  9:10     ` Antoine Tenart
  0 siblings, 0 replies; 16+ messages in thread
From: Antoine Tenart @ 2020-12-22  9:10 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: David Miller, Jakub Kicinski, Netdev, Paolo Abeni

Hi Alexander,

Quoting Alexander Duyck (2020-12-21 23:33:15)
> 
> One thing I might change is to actually bump this patch up in the
> patch set as I think it would probably make things a bit cleaner to
> read as you are going to be moving the other functions to this pattern
> as well.

Right. If it were not for net (vs net-next), I would have split the
patches a bit differently to make things easier to review. But those
patches are fixes and can be backported to older kernel versions.
They're fixing 2 commits that were introduced in different versions, so
this patch has to be made before the next one, as it is fixing older
kernels.

(I also did not give a hint in the commit message about what is done in
patch 3, for the same reason. But I agree that's arguable.)

Thanks,
Antoine


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-21 23:21   ` Alexander Duyck
@ 2020-12-22  9:21     ` Antoine Tenart
  2020-12-22 16:12       ` Alexander Duyck
  0 siblings, 1 reply; 16+ messages in thread
From: Antoine Tenart @ 2020-12-22  9:21 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: David Miller, Jakub Kicinski, Netdev, Paolo Abeni

Hello Alexander, Jakub,

Quoting Alexander Duyck (2020-12-22 00:21:57)
> 
> Looking over this patch it seems kind of obvious that extending the
> xps_map_mutex is making things far more complex than they need to be.
> 
> Applying the rtnl_mutex would probably be much simpler. Although as I
> think you have already discovered we need to apply it to the store,
> and show for this interface. In addition we probably need to perform
> similar locking around traffic_class_show in order to prevent it from
> generating a similar error.

I don't think we have the same kind of issues with traffic_class_show:
dev->num_tc is used, but not for navigating through the map. Protecting
only a single read wouldn't change much. We can still think about what
could go wrong here without the lock, but that is not related to this
series of fixes.

If I understood correctly, as things are a bit too complex now, you
would prefer that we go for the solution proposed in v1?

I can still do the code factoring for the 2 sysfs show operations, but
that would then target net-next and would be in a different series. So I
believe we'll use the patches of v1, unmodified.

Jakub, should I send a v3 then?

Thanks!
Antoine


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-22  9:21     ` Antoine Tenart
@ 2020-12-22 16:12       ` Alexander Duyck
  2020-12-23 18:27         ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2020-12-22 16:12 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: David Miller, Jakub Kicinski, Netdev, Paolo Abeni

On Tue, Dec 22, 2020 at 1:21 AM Antoine Tenart <atenart@kernel.org> wrote:
>
> Hello Alexander, Jakub,
>
> Quoting Alexander Duyck (2020-12-22 00:21:57)
> >
> > Looking over this patch it seems kind of obvious that extending the
> > xps_map_mutex is making things far more complex than they need to be.
> >
> > Applying the rtnl_mutex would probably be much simpler. Although as I
> > think you have already discovered we need to apply it to the store,
> > and show for this interface. In addition we probably need to perform
> > similar locking around traffic_class_show in order to prevent it from
> > generating a similar error.
>
> I don't think we have the same kind of issues with traffic_class_show:
> dev->num_tc is used, but not for navigating through the map. Protecting
> only a single read wouldn't change much. We can still think about what
> could go wrong here without the lock, but that is not related to this
> series of fixes.

The problem is we are actually reading the netdev, tx queue, and
tc_to_txq mapping. Basically we have several different items that we
are accessing at the same time. If any one is updated while we are
doing it then it will throw things off.

> If I understood correctly, as things are a bit too complex now, you
> would prefer that we go for the solution proposed in v1?

Yeah, that is what I am thinking. Basically we just need to make sure
the num_tc cannot be updated while we are reading the other values.

> I can still do the code factoring for the 2 sysfs show operations, but
> that would then target net-next and would be in a different series. So I
> believe we'll use the patches of v1, unmodified.

I agree the code factoring would be better targeted to net-next.

The rtnl_lock approach from v1 would work for net and for backports.

> Jakub, should I send a v3 then?
>
> Thanks!
> Antoine


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-22 16:12       ` Alexander Duyck
@ 2020-12-23 18:27         ` Jakub Kicinski
  2020-12-23 19:36           ` Antoine Tenart
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2020-12-23 18:27 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Antoine Tenart, David Miller, Netdev, Paolo Abeni

On Tue, 22 Dec 2020 08:12:28 -0800 Alexander Duyck wrote:
> On Tue, Dec 22, 2020 at 1:21 AM Antoine Tenart <atenart@kernel.org> wrote:
> > Quoting Alexander Duyck (2020-12-22 00:21:57)  
> > >
> > > Looking over this patch it seems kind of obvious that extending the
> > > xps_map_mutex is making things far more complex than they need to be.
> > >
> > > Applying the rtnl_mutex would probably be much simpler. Although as I
> > > think you have already discovered we need to apply it to the store,
> > > and show for this interface. In addition we probably need to perform
> > > similar locking around traffic_class_show in order to prevent it from
> > > generating a similar error.  
> >
> > I don't think we have the same kind of issues with traffic_class_show:
> > dev->num_tc is used, but not for navigating through the map. Protecting
> > only a single read wouldn't change much. We can still think about what
> > could go wrong here without the lock, but that is not related to this
> > series of fixes.  
> 
> The problem is we are actually reading the netdev, tx queue, and
> tc_to_txq mapping. Basically we have several different items that we
> are accessing at the same time. If any one is updated while we are
> doing it then it will throw things off.
> 
> > If I understood correctly, as things are a bit too complex now, you
> > would prefer that we go for the solution proposed in v1?  
> 
> Yeah, that is what I am thinking. Basically we just need to make sure
> the num_tc cannot be updated while we are reading the other values.

Yeah, okay, as much as I dislike this approach 300 lines may be a little
too large for stable.

> > I can still do the code factoring for the 2 sysfs show operations, but
> > that would then target net-next and would be in a different series. So I
> > believe we'll use the patches of v1, unmodified.  

Are you saying just patch 3 for net-next? We need to do something about
the fact that with sysfs taking rtnl_lock xps_map_mutex is now entirely
pointless. I guess its value eroded over the years since Tom's initial
design so we can just get rid of it.


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-23 18:27         ` Jakub Kicinski
@ 2020-12-23 19:36           ` Antoine Tenart
  2020-12-23 20:11             ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Antoine Tenart @ 2020-12-23 19:36 UTC (permalink / raw)
  To: Alexander Duyck, Jakub Kicinski; +Cc: David Miller, Netdev, Paolo Abeni

Hi Jakub,

Quoting Jakub Kicinski (2020-12-23 19:27:29)
> On Tue, 22 Dec 2020 08:12:28 -0800 Alexander Duyck wrote:
> > On Tue, Dec 22, 2020 at 1:21 AM Antoine Tenart <atenart@kernel.org> wrote:
> > 
> > > If I understood correctly, as things are a bit too complex now, you
> > > would prefer that we go for the solution proposed in v1?  
> > 
> > Yeah, that is what I am thinking. Basically we just need to make sure
> > the num_tc cannot be updated while we are reading the other values.
> 
> Yeah, okay, as much as I dislike this approach 300 lines may be a little
> too large for stable.
> 
> > > I can still do the code factoring for the 2 sysfs show operations, but
> > > that would then target net-next and would be in a different series. So I
> > > believe we'll use the patches of v1, unmodified.  
> 
> Are you saying just patch 3 for net-next?

The idea would be to:

- For net, to take all 4 patches from v1. If so, do I need to resend
  them?

- For net-next, to resend patches 2 and 3 from v2 (they'll have to be
  slightly reworked, to take into account the review from Alexander and
  the rtnl lock). The patches can be sent once the ones for net land in
  net-next.

> We need to do something about the fact that with sysfs taking
> rtnl_lock xps_map_mutex is now entirely pointless. I guess its value
> eroded over the years since Tom's initial design so we can just get
> rid of it.

We should be able to remove the mutex (I'll double check as more
functions are involved). If so, I can send a patch to net-next.

Thanks!
Antoine


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-23 19:36           ` Antoine Tenart
@ 2020-12-23 20:11             ` Jakub Kicinski
  2020-12-23 20:35               ` Antoine Tenart
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2020-12-23 20:11 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: Alexander Duyck, David Miller, Netdev, Paolo Abeni

On Wed, 23 Dec 2020 20:36:33 +0100 Antoine Tenart wrote:
> Quoting Jakub Kicinski (2020-12-23 19:27:29)
> > On Tue, 22 Dec 2020 08:12:28 -0800 Alexander Duyck wrote:  
> > > On Tue, Dec 22, 2020 at 1:21 AM Antoine Tenart <atenart@kernel.org> wrote:
> > >   
> > > > If I understood correctly, as things are a bit too complex now, you
> > > > would prefer that we go for the solution proposed in v1?    
> > > 
> > > Yeah, that is what I am thinking. Basically we just need to make sure
> > > the num_tc cannot be updated while we are reading the other values.  
> > 
> > Yeah, okay, as much as I dislike this approach 300 lines may be a little
> > too large for stable.
> >   
> > > > I can still do the code factoring for the 2 sysfs show operations, but
> > > > that would then target net-next and would be in a different series. So I
> > > > believe we'll use the patches of v1, unmodified.    
> > 
> > Are you saying just patch 3 for net-next?  
> 
> The idea would be to:
> 
> - For net, to take all 4 patches from v1. If so, do I need to resend
>   them?

Yes, please.

> - For net-next, to resend patches 2 and 3 from v2 (they'll have to be
>   slightly reworked, to take into account the review from Alexander and
>   the rtnl lock). The patches can be sent once the ones for net land in
>   net-next.

If the direction is to remove xps_map_mutex, why would we need patch 2?
🤔

> > We need to do something about the fact that with sysfs taking
> > rtnl_lock xps_map_mutex is now entirely pointless. I guess its value
> > eroded over the years since Tom's initial design so we can just get
> > rid of it.  
> 
> We should be able to remove the mutex (I'll double check as more
> functions are involved). If so, I can send a patch to net-next.


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-23 20:11             ` Jakub Kicinski
@ 2020-12-23 20:35               ` Antoine Tenart
  2020-12-23 20:43                 ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Antoine Tenart @ 2020-12-23 20:35 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Alexander Duyck, David Miller, Netdev, Paolo Abeni

Quoting Jakub Kicinski (2020-12-23 21:11:10)
> On Wed, 23 Dec 2020 20:36:33 +0100 Antoine Tenart wrote:
> > Quoting Jakub Kicinski (2020-12-23 19:27:29)
> > > On Tue, 22 Dec 2020 08:12:28 -0800 Alexander Duyck wrote:  
> > > > On Tue, Dec 22, 2020 at 1:21 AM Antoine Tenart <atenart@kernel.org> wrote:
> > > >   
> > > > > If I understood correctly, as things are a bit too complex now, you
> > > > > would prefer that we go for the solution proposed in v1?    
> > > > 
> > > > Yeah, that is what I am thinking. Basically we just need to make sure
> > > > the num_tc cannot be updated while we are reading the other values.  
> > > 
> > > Yeah, okay, as much as I dislike this approach 300 lines may be a little
> > > too large for stable.
> > >   
> > > > > I can still do the code factoring for the 2 sysfs show operations, but
> > > > > that would then target net-next and would be in a different series. So I
> > > > > believe we'll use the patches of v1, unmodified.    
> > > 
> > > Are you saying just patch 3 for net-next?  
> > 
> > The idea would be to:
> > 
> > - For net, to take all 4 patches from v1. If so, do I need to resend
> >   them?
> 
> Yes, please.

Will do.

> > - For net-next, to resend patches 2 and 3 from v2 (they'll have to be
> >   slightly reworked, to take into account the review from Alexander and
> >   the rtnl lock). The patches can be sent once the ones for net land in
> >   net-next.
> 
> If the direction is to remove xps_map_mutex, why would we need patch 2?
> 🤔

Only the patches for net are needed to fix the race conditions.

In addition to using the xps_map_mutex, patches 2 and 3 from v2 factorize
the code into a single function, as xps_cpus_show and xps_rxqs_show
share the same logic. That would improve maintainability, but isn't
mandatory.

Sorry, it was not very clear...

Antoine


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-23 20:35               ` Antoine Tenart
@ 2020-12-23 20:43                 ` Jakub Kicinski
  2020-12-23 20:56                   ` Antoine Tenart
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2020-12-23 20:43 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: Alexander Duyck, David Miller, Netdev, Paolo Abeni

On Wed, 23 Dec 2020 21:35:15 +0100 Antoine Tenart wrote:
> > > - For net-next, to resend patches 2 and 3 from v2 (they'll have to be
> > >   slightly reworked, to take into account the review from Alexander and
> > >   the rtnl lock). The patches can be sent once the ones for net land in
> > >   net-next.  
> > 
> > If the direction is to remove xps_map_mutex, why would we need patch 2?
> > 🤔  
> 
> Only the patches for net are needed to fix the race conditions.
> 
> In addition to use the xps_map mutex, patches 2 and 3 from v2 factorize
> the code into a single function, as xps_cpus_show and xps_rxqs_show
> share the same logic. That would improve maintainability, but isn't
> mandatory.
> 
> Sorry, it was not very clear...

I like the cleanup, sorry I'm not very clear either.

My understanding was that patch 2 was only needed to have access to the
XPS lock, if we don't plan to use that lock netif_show_xps_queue() can
stay in the sysfs file, right? I'm all for the cleanup and code reuse
for rxqs, I'm just making sure I'm not missing anything. I wasn't
seeing a reason to move netif_show_xps_queue(), that's all.


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-23 20:43                 ` Jakub Kicinski
@ 2020-12-23 20:56                   ` Antoine Tenart
  2020-12-23 20:59                     ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Antoine Tenart @ 2020-12-23 20:56 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Alexander Duyck, David Miller, Netdev, Paolo Abeni

Quoting Jakub Kicinski (2020-12-23 21:43:15)
> On Wed, 23 Dec 2020 21:35:15 +0100 Antoine Tenart wrote:
> > > > - For net-next, to resend patches 2 and 3 from v2 (they'll have to be
> > > >   slightly reworked, to take into account the review from Alexander and
> > > >   the rtnl lock). The patches can be sent once the ones for net land in
> > > >   net-next.  
> > > 
> > > If the direction is to remove xps_map_mutex, why would we need patch 2?
> > > 🤔  
> > 
> > Only the patches for net are needed to fix the race conditions.
> > 
> > In addition to using the xps_map_mutex, patches 2 and 3 from v2 factorize
> > the code into a single function, as xps_cpus_show and xps_rxqs_show
> > share the same logic. That would improve maintainability, but isn't
> > mandatory.
> > 
> > Sorry, it was not very clear...
> 
> I like the cleanup, sorry I'm not very clear either.
> 
> My understanding was that patch 2 was only needed to have access to the
> XPS lock, if we don't plan to use that lock netif_show_xps_queue() can
> stay in the sysfs file, right? I'm all for the cleanup and code reuse
> for rxqs, I'm just making sure I'm not missing anything. I wasn't
> seeing a reason to move netif_show_xps_queue(), that's all.

You understood correctly: the only reason to move this code out of sysfs
was to access the xps_map_mutex. Without that need, the code can stay in
sysfs.

Patch 2 is not only moving the code out of sysfs, but also reworking
xps_cpus_show. I think I now see where the confusion comes from: patches
2 and 3 were kept separate because they target net and fix commits
introduced in different kernel versions. They could be squashed now.

Antoine


* Re: [PATCH net v2 1/3] net: fix race conditions in xps by locking the maps and dev->tc_num
  2020-12-23 20:56                   ` Antoine Tenart
@ 2020-12-23 20:59                     ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2020-12-23 20:59 UTC (permalink / raw)
  To: Antoine Tenart; +Cc: Alexander Duyck, David Miller, Netdev, Paolo Abeni

On Wed, 23 Dec 2020 21:56:56 +0100 Antoine Tenart wrote:
> You understood correctly, the only reason to move this code out of sysfs
> was to access the xps_map lock. Without the need, the code can stay in
> sysfs.
> 
> Patch 2 is not only moving the code out of sysfs, but also reworking
> xps_cpus_show. I think I now see where the confusion comes from: the
> reason patches 2 and 3 were in two different patches was because they
> were targeting net and different kernel versions. They could be squashed
> now.

👍
