* [PATCH v2 00/10] Make XPS usable within ixgbe
@ 2013-01-10 18:56 Alexander Duyck
  2013-01-10 18:56 ` [PATCH v2 01/10] net: Split core bits of netdev_pick_tx into __netdev_pick_tx Alexander Duyck
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:56 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

The following patch series makes it so that the ixgbe driver can support
ATR even when the number of queues is less than the number of CPUs.  To do
this I have updated the kernel to let drivers set their own XPS
configuration, which required moving the XPS map handling out of the
sysfs-specific code and into the core dev code.

It has been almost 6 months since the last time I released these patches.  I
have been meaning to get back to them but I just haven't had a chance to get
this work done until recently.

Only the core net patches are meant for acceptance at this time.  I will
submit the ixgbe patches through Jeff Kirsher's tree.  For this reason I have
tagged the ixgbe related patches as RFC.

---

v2:  Updated first patch to address changes from dev_pick_tx to netdev_pick_tx.
     Made it so that we only update XPS map on queue number change.
     Minor tweaks and cleanups to get caught up with recent changes in the tree.

Alexander Duyck (10):
      net: Split core bits of netdev_pick_tx into __netdev_pick_tx
      net: Add functions netif_reset_xps_queue and netif_set_xps_queue
      net: Rewrite netif_reset_xps_queue to allow for better code reuse
      net: Rewrite netif_set_xps_queues to address several issues
      net: Add support for XPS without sysfs being defined
      ixgbe: Define FCoE and Flow director limits much sooner to allow for changes
      ixgbe: Add function for setting XPS queue mapping
      ixgbe: Update ixgbe driver to use __netdev_pick_tx in ixgbe_select_queue
      ixgbe: Add support for displaying the number of Tx/Rx channels
      ixgbe: Add support for set_channels ethtool operation


 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |    3 
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  112 +++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c     |   21 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |   77 ++++--
 include/linux/netdevice.h                        |   13 +
 net/Kconfig                                      |    2 
 net/core/dev.c                                   |  289 ++++++++++++++++++++--
 net/core/net-sysfs.c                             |  160 ------------
 8 files changed, 454 insertions(+), 223 deletions(-)

-- 


* [PATCH v2 01/10] net: Split core bits of netdev_pick_tx into __netdev_pick_tx
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
@ 2013-01-10 18:56 ` Alexander Duyck
  2013-01-10 18:57 ` [PATCH v2 02/10] net: Add functions netif_reset_xps_queue and netif_set_xps_queue Alexander Duyck
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:56 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This change splits the core bits of netdev_pick_tx out into a separate
function, __netdev_pick_tx.  The main idea behind this is to make that code
accessible to drivers' select queue routines so that they can fall back to
the standard selection path instead of duplicating it in their own custom
logic.
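
As a rough illustration (not part of this patch, and the helper names below
are made up), a driver's select queue routine could then handle only its
special-case traffic and defer everything else to the core code:

	/* hypothetical driver code; only __netdev_pick_tx() is from this patch */
	static u16 my_select_queue(struct net_device *dev, struct sk_buff *skb)
	{
		/* steer only the traffic this driver handles specially */
		if (my_skb_is_special(skb))
			return my_special_queue(dev, skb);

		/* everything else falls back to the standard core selection */
		return __netdev_pick_tx(dev, skb);
	}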

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/netdevice.h |    1 +
 net/core/dev.c            |   57 +++++++++++++++++++++++++--------------------
 2 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0209ac3..608c3ac 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1403,6 +1403,7 @@ static inline void netdev_for_each_tx_queue(struct net_device *dev,
 
 extern struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 					   struct sk_buff *skb);
+extern u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb);
 
 /*
  * Net namespace inlines
diff --git a/net/core/dev.c b/net/core/dev.c
index 594830e..04f04e0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2495,37 +2495,44 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 #endif
 }
 
-struct netdev_queue *netdev_pick_tx(struct net_device *dev,
-				    struct sk_buff *skb)
+u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
 {
-	int queue_index;
-	const struct net_device_ops *ops = dev->netdev_ops;
-
-	if (dev->real_num_tx_queues == 1)
-		queue_index = 0;
-	else if (ops->ndo_select_queue) {
-		queue_index = ops->ndo_select_queue(dev, skb);
-		queue_index = dev_cap_txqueue(dev, queue_index);
-	} else {
-		struct sock *sk = skb->sk;
-		queue_index = sk_tx_queue_get(sk);
+	struct sock *sk = skb->sk;
+	int queue_index = sk_tx_queue_get(sk);
 
-		if (queue_index < 0 || skb->ooo_okay ||
-		    queue_index >= dev->real_num_tx_queues) {
-			int old_index = queue_index;
+	if (queue_index < 0 || skb->ooo_okay ||
+	    queue_index >= dev->real_num_tx_queues) {
+		int new_index = get_xps_queue(dev, skb);
+		if (new_index < 0)
+			new_index = skb_tx_hash(dev, skb);
 
-			queue_index = get_xps_queue(dev, skb);
-			if (queue_index < 0)
-				queue_index = skb_tx_hash(dev, skb);
-
-			if (queue_index != old_index && sk) {
-				struct dst_entry *dst =
+		if (queue_index != new_index && sk) {
+			struct dst_entry *dst =
 				    rcu_dereference_check(sk->sk_dst_cache, 1);
 
-				if (dst && skb_dst(skb) == dst)
-					sk_tx_queue_set(sk, queue_index);
-			}
+			if (dst && skb_dst(skb) == dst)
+				sk_tx_queue_set(sk, queue_index);
+
 		}
+
+		queue_index = new_index;
+	}
+
+	return queue_index;
+}
+
+struct netdev_queue *netdev_pick_tx(struct net_device *dev,
+				    struct sk_buff *skb)
+{
+	int queue_index = 0;
+
+	if (dev->real_num_tx_queues != 1) {
+		const struct net_device_ops *ops = dev->netdev_ops;
+		if (ops->ndo_select_queue)
+			queue_index = ops->ndo_select_queue(dev, skb);
+		else
+			queue_index = __netdev_pick_tx(dev, skb);
+		queue_index = dev_cap_txqueue(dev, queue_index);
 	}
 
 	skb_set_queue_mapping(skb, queue_index);


* [PATCH v2 02/10] net: Add functions netif_reset_xps_queue and netif_set_xps_queue
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
  2013-01-10 18:56 ` [PATCH v2 01/10] net: Split core bits of netdev_pick_tx into __netdev_pick_tx Alexander Duyck
@ 2013-01-10 18:57 ` Alexander Duyck
  2013-01-10 18:57 ` [PATCH v2 03/10] net: Rewrite netif_reset_xps_queue to allow for better code reuse Alexander Duyck
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:57 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This patch adds two functions, netif_reset_xps_queue and
netif_set_xps_queue.  The main idea behind these two functions is to
provide a mechanism through which drivers can update their default XPS
configuration.

Currently no such mechanism exists, and as a result we cannot use XPS for
features such as ATR, which requires a basic starting configuration in
which the Tx queues are mapped 1:1 to CPUs.  With this change drivers such
as ixgbe can make use of the XPS feature by controlling the default
configuration.
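
As a sketch only (a hypothetical driver, not code from this patch), such a
1:1 default could be established at init time roughly as follows:

	/* netif_set_xps_queue() is the interface added here; the rest is a
	 * made-up example, with an on-stack mask kept simple for the sketch
	 */
	static void my_driver_default_xps(struct net_device *dev)
	{
		cpumask_t mask;
		u16 i;

		for (i = 0; i < dev->real_num_tx_queues; i++) {
			cpumask_clear(&mask);
			cpumask_set_cpu(i % num_online_cpus(), &mask);
			/* returns 0 on success or -ENOMEM on allocation failure */
			netif_set_xps_queue(dev, &mask, i);
		}
	}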

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/netdevice.h |   13 ++++
 net/core/dev.c            |  155 +++++++++++++++++++++++++++++++++++++++++++++
 net/core/net-sysfs.c      |  148 +------------------------------------------
 3 files changed, 173 insertions(+), 143 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 608c3ac..59fe9da 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2103,6 +2103,19 @@ static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 		__netif_schedule(txq->qdisc);
 }
 
+#ifdef CONFIG_XPS
+extern void netif_reset_xps_queue(struct net_device *dev, u16 index);
+extern int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask,
+			       u16 index);
+#else
+static inline int netif_set_xps_queue(struct net_device *dev,
+				      struct cpumask *mask,
+				      u16 index)
+{
+	return 0;
+}
+#endif
+
 /*
  * Returns a Tx hash for the given packet when dev->real_num_tx_queues is used
  * as a distribution range limit for the returned value.
diff --git a/net/core/dev.c b/net/core/dev.c
index 04f04e0..76126fb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1857,6 +1857,161 @@ static void netif_setup_tc(struct net_device *dev, unsigned int txq)
 	}
 }
 
+#ifdef CONFIG_XPS
+static DEFINE_MUTEX(xps_map_mutex);
+#define xmap_dereference(P)		\
+	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
+
+void netif_reset_xps_queue(struct net_device *dev, u16 index)
+{
+	struct xps_dev_maps *dev_maps;
+	struct xps_map *map;
+	int i, pos, nonempty = 0;
+
+	mutex_lock(&xps_map_mutex);
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	if (!dev_maps)
+		goto out_no_maps;
+
+	for_each_possible_cpu(i) {
+		map = xmap_dereference(dev_maps->cpu_map[i]);
+		if (!map)
+			continue;
+
+		for (pos = 0; pos < map->len; pos++)
+			if (map->queues[pos] == index)
+				break;
+
+		if (pos < map->len) {
+			if (map->len > 1) {
+				map->queues[pos] = map->queues[--map->len];
+			} else {
+				RCU_INIT_POINTER(dev_maps->cpu_map[i], NULL);
+				kfree_rcu(map, rcu);
+				map = NULL;
+			}
+		}
+		if (map)
+			nonempty = 1;
+	}
+
+	if (!nonempty) {
+		RCU_INIT_POINTER(dev->xps_maps, NULL);
+		kfree_rcu(dev_maps, rcu);
+	}
+
+out_no_maps:
+	mutex_unlock(&xps_map_mutex);
+}
+
+int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask, u16 index)
+{
+	int i, cpu, pos, map_len, alloc_len, need_set;
+	struct xps_map *map, *new_map;
+	struct xps_dev_maps *dev_maps, *new_dev_maps;
+	int nonempty = 0;
+	int numa_node_id = -2;
+	int maps_sz = max_t(unsigned int, XPS_DEV_MAPS_SIZE, L1_CACHE_BYTES);
+
+	new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
+	if (!new_dev_maps)
+		return -ENOMEM;
+
+	mutex_lock(&xps_map_mutex);
+
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	for_each_possible_cpu(cpu) {
+		map = dev_maps ?
+			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
+		new_map = map;
+		if (map) {
+			for (pos = 0; pos < map->len; pos++)
+				if (map->queues[pos] == index)
+					break;
+			map_len = map->len;
+			alloc_len = map->alloc_len;
+		} else
+			pos = map_len = alloc_len = 0;
+
+		need_set = cpumask_test_cpu(cpu, mask) && cpu_online(cpu);
+#ifdef CONFIG_NUMA
+		if (need_set) {
+			if (numa_node_id == -2)
+				numa_node_id = cpu_to_node(cpu);
+			else if (numa_node_id != cpu_to_node(cpu))
+				numa_node_id = -1;
+		}
+#endif
+		if (need_set && pos >= map_len) {
+			/* Need to add queue to this CPU's map */
+			if (map_len >= alloc_len) {
+				alloc_len = alloc_len ?
+				    2 * alloc_len : XPS_MIN_MAP_ALLOC;
+				new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len),
+						       GFP_KERNEL,
+						       cpu_to_node(cpu));
+				if (!new_map)
+					goto error;
+				new_map->alloc_len = alloc_len;
+				for (i = 0; i < map_len; i++)
+					new_map->queues[i] = map->queues[i];
+				new_map->len = map_len;
+			}
+			new_map->queues[new_map->len++] = index;
+		} else if (!need_set && pos < map_len) {
+			/* Need to remove queue from this CPU's map */
+			if (map_len > 1)
+				new_map->queues[pos] =
+				    new_map->queues[--new_map->len];
+			else
+				new_map = NULL;
+		}
+		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], new_map);
+	}
+
+	/* Cleanup old maps */
+	for_each_possible_cpu(cpu) {
+		map = dev_maps ?
+			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
+		if (map && xmap_dereference(new_dev_maps->cpu_map[cpu]) != map)
+			kfree_rcu(map, rcu);
+		if (new_dev_maps->cpu_map[cpu])
+			nonempty = 1;
+	}
+
+	if (nonempty) {
+		rcu_assign_pointer(dev->xps_maps, new_dev_maps);
+	} else {
+		kfree(new_dev_maps);
+		RCU_INIT_POINTER(dev->xps_maps, NULL);
+	}
+
+	if (dev_maps)
+		kfree_rcu(dev_maps, rcu);
+
+	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
+				     (numa_node_id >= 0) ? numa_node_id :
+				     NUMA_NO_NODE);
+
+	mutex_unlock(&xps_map_mutex);
+
+	return 0;
+error:
+	mutex_unlock(&xps_map_mutex);
+
+	if (new_dev_maps)
+		for_each_possible_cpu(i)
+			kfree(rcu_dereference_protected(
+				new_dev_maps->cpu_map[i],
+				1));
+	kfree(new_dev_maps);
+	return -ENOMEM;
+}
+EXPORT_SYMBOL(netif_set_xps_queue);
+
+#endif
 /*
  * Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
  * greater then real_num_tx_queues stale skbs on the qdisc must be flushed.
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 29c884a..5ad489d 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1002,54 +1002,14 @@ static ssize_t show_xps_map(struct netdev_queue *queue,
 	return len;
 }
 
-static DEFINE_MUTEX(xps_map_mutex);
-#define xmap_dereference(P)		\
-	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
-
 static void xps_queue_release(struct netdev_queue *queue)
 {
 	struct net_device *dev = queue->dev;
-	struct xps_dev_maps *dev_maps;
-	struct xps_map *map;
 	unsigned long index;
-	int i, pos, nonempty = 0;
 
 	index = get_netdev_queue_index(queue);
 
-	mutex_lock(&xps_map_mutex);
-	dev_maps = xmap_dereference(dev->xps_maps);
-
-	if (dev_maps) {
-		for_each_possible_cpu(i) {
-			map = xmap_dereference(dev_maps->cpu_map[i]);
-			if (!map)
-				continue;
-
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-
-			if (pos < map->len) {
-				if (map->len > 1)
-					map->queues[pos] =
-					    map->queues[--map->len];
-				else {
-					RCU_INIT_POINTER(dev_maps->cpu_map[i],
-					    NULL);
-					kfree_rcu(map, rcu);
-					map = NULL;
-				}
-			}
-			if (map)
-				nonempty = 1;
-		}
-
-		if (!nonempty) {
-			RCU_INIT_POINTER(dev->xps_maps, NULL);
-			kfree_rcu(dev_maps, rcu);
-		}
-	}
-	mutex_unlock(&xps_map_mutex);
+	netif_reset_xps_queue(dev, index);
 }
 
 static ssize_t store_xps_map(struct netdev_queue *queue,
@@ -1057,13 +1017,9 @@ static ssize_t store_xps_map(struct netdev_queue *queue,
 		      const char *buf, size_t len)
 {
 	struct net_device *dev = queue->dev;
-	cpumask_var_t mask;
-	int err, i, cpu, pos, map_len, alloc_len, need_set;
 	unsigned long index;
-	struct xps_map *map, *new_map;
-	struct xps_dev_maps *dev_maps, *new_dev_maps;
-	int nonempty = 0;
-	int numa_node_id = -2;
+	cpumask_var_t mask;
+	int err;
 
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
@@ -1079,105 +1035,11 @@ static ssize_t store_xps_map(struct netdev_queue *queue,
 		return err;
 	}
 
-	new_dev_maps = kzalloc(max_t(unsigned int,
-	    XPS_DEV_MAPS_SIZE, L1_CACHE_BYTES), GFP_KERNEL);
-	if (!new_dev_maps) {
-		free_cpumask_var(mask);
-		return -ENOMEM;
-	}
-
-	mutex_lock(&xps_map_mutex);
-
-	dev_maps = xmap_dereference(dev->xps_maps);
-
-	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		new_map = map;
-		if (map) {
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-			map_len = map->len;
-			alloc_len = map->alloc_len;
-		} else
-			pos = map_len = alloc_len = 0;
-
-		need_set = cpumask_test_cpu(cpu, mask) && cpu_online(cpu);
-#ifdef CONFIG_NUMA
-		if (need_set) {
-			if (numa_node_id == -2)
-				numa_node_id = cpu_to_node(cpu);
-			else if (numa_node_id != cpu_to_node(cpu))
-				numa_node_id = -1;
-		}
-#endif
-		if (need_set && pos >= map_len) {
-			/* Need to add queue to this CPU's map */
-			if (map_len >= alloc_len) {
-				alloc_len = alloc_len ?
-				    2 * alloc_len : XPS_MIN_MAP_ALLOC;
-				new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len),
-						       GFP_KERNEL,
-						       cpu_to_node(cpu));
-				if (!new_map)
-					goto error;
-				new_map->alloc_len = alloc_len;
-				for (i = 0; i < map_len; i++)
-					new_map->queues[i] = map->queues[i];
-				new_map->len = map_len;
-			}
-			new_map->queues[new_map->len++] = index;
-		} else if (!need_set && pos < map_len) {
-			/* Need to remove queue from this CPU's map */
-			if (map_len > 1)
-				new_map->queues[pos] =
-				    new_map->queues[--new_map->len];
-			else
-				new_map = NULL;
-		}
-		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], new_map);
-	}
-
-	/* Cleanup old maps */
-	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		if (map && xmap_dereference(new_dev_maps->cpu_map[cpu]) != map)
-			kfree_rcu(map, rcu);
-		if (new_dev_maps->cpu_map[cpu])
-			nonempty = 1;
-	}
-
-	if (nonempty) {
-		rcu_assign_pointer(dev->xps_maps, new_dev_maps);
-	} else {
-		kfree(new_dev_maps);
-		RCU_INIT_POINTER(dev->xps_maps, NULL);
-	}
-
-	if (dev_maps)
-		kfree_rcu(dev_maps, rcu);
-
-	netdev_queue_numa_node_write(queue, (numa_node_id >= 0) ? numa_node_id :
-					    NUMA_NO_NODE);
-
-	mutex_unlock(&xps_map_mutex);
+	err = netif_set_xps_queue(dev, mask, index);
 
 	free_cpumask_var(mask);
-	return len;
 
-error:
-	mutex_unlock(&xps_map_mutex);
-
-	if (new_dev_maps)
-		for_each_possible_cpu(i)
-			kfree(rcu_dereference_protected(
-				new_dev_maps->cpu_map[i],
-				1));
-	kfree(new_dev_maps);
-	free_cpumask_var(mask);
-	return -ENOMEM;
+	return err ? : len;
 }
 
 static struct netdev_queue_attribute xps_cpus_attribute =


* [PATCH v2 03/10] net: Rewrite netif_reset_xps_queue to allow for better code reuse
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
  2013-01-10 18:56 ` [PATCH v2 01/10] net: Split core bits of netdev_pick_tx into __netdev_pick_tx Alexander Duyck
  2013-01-10 18:57 ` [PATCH v2 02/10] net: Add functions netif_reset_xps_queue and netif_set_xps_queue Alexander Duyck
@ 2013-01-10 18:57 ` Alexander Duyck
  2013-01-10 18:57 ` [PATCH v2 04/10] net: Rewrite netif_set_xps_queues to address several issues Alexander Duyck
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:57 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This patch does a minor refactor on netif_reset_xps_queue to address a few
items I noticed.

The first is that queue removal was duplicated in both
netif_reset_xps_queue and netif_set_xps_queue.  Since there is no need to
have the code in two places, I am pulling it out into a separate helper,
remove_xps_queue, and a later patch will reuse it in netif_set_xps_queue.

The second item this change addresses is that the Tx queues were not
getting their numa_node value cleared as part of the XPS queue reset.
This patch resolves that by resetting the numa_node value to NUMA_NO_NODE
whenever a dev_maps value is present.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/core/dev.c |   56 +++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 76126fb..fccee52 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1862,45 +1862,55 @@ static DEFINE_MUTEX(xps_map_mutex);
 #define xmap_dereference(P)		\
 	rcu_dereference_protected((P), lockdep_is_held(&xps_map_mutex))
 
-void netif_reset_xps_queue(struct net_device *dev, u16 index)
+static struct xps_map *remove_xps_queue(struct xps_dev_maps *dev_maps,
+					int cpu, u16 index)
 {
-	struct xps_dev_maps *dev_maps;
-	struct xps_map *map;
-	int i, pos, nonempty = 0;
-
-	mutex_lock(&xps_map_mutex);
-	dev_maps = xmap_dereference(dev->xps_maps);
-
-	if (!dev_maps)
-		goto out_no_maps;
+	struct xps_map *map = NULL;
+	int pos;
 
-	for_each_possible_cpu(i) {
-		map = xmap_dereference(dev_maps->cpu_map[i]);
-		if (!map)
-			continue;
-
-		for (pos = 0; pos < map->len; pos++)
-			if (map->queues[pos] == index)
-				break;
+	if (dev_maps)
+		map = xmap_dereference(dev_maps->cpu_map[cpu]);
 
-		if (pos < map->len) {
+	for (pos = 0; map && pos < map->len; pos++) {
+		if (map->queues[pos] == index) {
 			if (map->len > 1) {
 				map->queues[pos] = map->queues[--map->len];
 			} else {
-				RCU_INIT_POINTER(dev_maps->cpu_map[i], NULL);
+				RCU_INIT_POINTER(dev_maps->cpu_map[cpu], NULL);
 				kfree_rcu(map, rcu);
 				map = NULL;
 			}
+			break;
 		}
-		if (map)
-			nonempty = 1;
 	}
 
-	if (!nonempty) {
+	return map;
+}
+
+void netif_reset_xps_queue(struct net_device *dev, u16 index)
+{
+	struct xps_dev_maps *dev_maps;
+	int cpu;
+	bool active = false;
+
+	mutex_lock(&xps_map_mutex);
+	dev_maps = xmap_dereference(dev->xps_maps);
+
+	if (!dev_maps)
+		goto out_no_maps;
+
+	for_each_possible_cpu(cpu) {
+		if (remove_xps_queue(dev_maps, cpu, index))
+			active = true;
+	}
+
+	if (!active) {
 		RCU_INIT_POINTER(dev->xps_maps, NULL);
 		kfree_rcu(dev_maps, rcu);
 	}
 
+	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
+				     NUMA_NO_NODE);
 out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 }


* [PATCH v2 04/10] net: Rewrite netif_set_xps_queues to address several issues
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (2 preceding siblings ...)
  2013-01-10 18:57 ` [PATCH v2 03/10] net: Rewrite netif_reset_xps_queue to allow for better code reuse Alexander Duyck
@ 2013-01-10 18:57 ` Alexander Duyck
  2013-01-10 18:57 ` [PATCH v2 05/10] net: Add support for XPS without sysfs being defined Alexander Duyck
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:57 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This change is meant to address several issues I found within the
netif_set_xps_queue function.

If the allocation of one of the maps to be assigned to new_dev_maps failed,
we could end up with the device map in an inconsistent state since we had
already worked through a number of CPUs and removed or added the queue.  To
address that I split the process into several steps, the first of which is
the allocation of updated maps for any CPUs that will need a larger map to
store the queue.  By doing this we can fail gracefully without actually
altering the contents of the current device map.

The second issue I found was that we were always allocating a new device
map even if we were not adding any queues.  I have updated the code so that
we only allocate a new device map if we are actually adding queues;
otherwise we skip straight to the removal process.

The last change I made was to reuse the code from remove_xps_queue to remove
the queue from the CPU.  By making this change we can be consistent in how
we go about adding and removing the queues from the CPUs.
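
The approach follows the usual RCU publish/retire pattern; below is a
generic, simplified sketch of that idea (not the XPS map code itself, and
all names in it are hypothetical):

	#include <linux/mutex.h>
	#include <linux/rcupdate.h>
	#include <linux/slab.h>

	struct my_map {
		struct rcu_head rcu;
		int value;
	};

	static DEFINE_MUTEX(my_lock);

	/* caller holds my_lock; the replacement is built fully before being
	 * published so a failed allocation never exposes a partial update
	 */
	static int my_replace(struct my_map __rcu **slot, int value)
	{
		struct my_map *old, *new;

		new = kzalloc(sizeof(*new), GFP_KERNEL);
		if (!new)
			return -ENOMEM;	/* old map is left untouched */
		new->value = value;

		old = rcu_dereference_protected(*slot,
						lockdep_is_held(&my_lock));
		rcu_assign_pointer(*slot, new);	/* publish the new map */
		if (old)
			kfree_rcu(old, rcu);	/* retire after a grace period */
		return 0;
	}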

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/core/dev.c |  183 ++++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 117 insertions(+), 66 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index fccee52..715d8d4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1915,107 +1915,158 @@ out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 }
 
+static struct xps_map *expand_xps_map(struct xps_map *map,
+				      int cpu, u16 index)
+{
+	struct xps_map *new_map;
+	int alloc_len = XPS_MIN_MAP_ALLOC;
+	int i, pos;
+
+	for (pos = 0; map && pos < map->len; pos++) {
+		if (map->queues[pos] != index)
+			continue;
+		return map;
+	}
+
+	/* Need to add queue to this CPU's existing map */
+	if (map) {
+		if (pos < map->alloc_len)
+			return map;
+
+		alloc_len = map->alloc_len * 2;
+	}
+
+	/* Need to allocate new map to store queue on this CPU's map */
+	new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len), GFP_KERNEL,
+			       cpu_to_node(cpu));
+	if (!new_map)
+		return NULL;
+
+	for (i = 0; i < pos; i++)
+		new_map->queues[i] = map->queues[i];
+	new_map->alloc_len = alloc_len;
+	new_map->len = pos;
+
+	return new_map;
+}
+
 int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask, u16 index)
 {
-	int i, cpu, pos, map_len, alloc_len, need_set;
+	struct xps_dev_maps *dev_maps, *new_dev_maps = NULL;
 	struct xps_map *map, *new_map;
-	struct xps_dev_maps *dev_maps, *new_dev_maps;
-	int nonempty = 0;
-	int numa_node_id = -2;
 	int maps_sz = max_t(unsigned int, XPS_DEV_MAPS_SIZE, L1_CACHE_BYTES);
-
-	new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
-	if (!new_dev_maps)
-		return -ENOMEM;
+	int cpu, numa_node_id = -2;
+	bool active = false;
 
 	mutex_lock(&xps_map_mutex);
 
 	dev_maps = xmap_dereference(dev->xps_maps);
 
+	/* allocate memory for queue storage */
+	for_each_online_cpu(cpu) {
+		if (!cpumask_test_cpu(cpu, mask))
+			continue;
+
+		if (!new_dev_maps)
+			new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
+		if (!new_dev_maps)
+			return -ENOMEM;
+
+		map = dev_maps ? xmap_dereference(dev_maps->cpu_map[cpu]) :
+				 NULL;
+
+		map = expand_xps_map(map, cpu, index);
+		if (!map)
+			goto error;
+
+		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], map);
+	}
+
+	if (!new_dev_maps)
+		goto out_no_new_maps;
+
 	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		new_map = map;
-		if (map) {
-			for (pos = 0; pos < map->len; pos++)
-				if (map->queues[pos] == index)
-					break;
-			map_len = map->len;
-			alloc_len = map->alloc_len;
-		} else
-			pos = map_len = alloc_len = 0;
+		if (cpumask_test_cpu(cpu, mask) && cpu_online(cpu)) {
+			/* add queue to CPU maps */
+			int pos = 0;
 
-		need_set = cpumask_test_cpu(cpu, mask) && cpu_online(cpu);
+			map = xmap_dereference(new_dev_maps->cpu_map[cpu]);
+			while ((pos < map->len) && (map->queues[pos] != index))
+				pos++;
+
+			if (pos == map->len)
+				map->queues[map->len++] = index;
 #ifdef CONFIG_NUMA
-		if (need_set) {
 			if (numa_node_id == -2)
 				numa_node_id = cpu_to_node(cpu);
 			else if (numa_node_id != cpu_to_node(cpu))
 				numa_node_id = -1;
-		}
 #endif
-		if (need_set && pos >= map_len) {
-			/* Need to add queue to this CPU's map */
-			if (map_len >= alloc_len) {
-				alloc_len = alloc_len ?
-				    2 * alloc_len : XPS_MIN_MAP_ALLOC;
-				new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len),
-						       GFP_KERNEL,
-						       cpu_to_node(cpu));
-				if (!new_map)
-					goto error;
-				new_map->alloc_len = alloc_len;
-				for (i = 0; i < map_len; i++)
-					new_map->queues[i] = map->queues[i];
-				new_map->len = map_len;
-			}
-			new_map->queues[new_map->len++] = index;
-		} else if (!need_set && pos < map_len) {
-			/* Need to remove queue from this CPU's map */
-			if (map_len > 1)
-				new_map->queues[pos] =
-				    new_map->queues[--new_map->len];
-			else
-				new_map = NULL;
+		} else if (dev_maps) {
+			/* fill in the new device map from the old device map */
+			map = xmap_dereference(dev_maps->cpu_map[cpu]);
+			RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], map);
 		}
-		RCU_INIT_POINTER(new_dev_maps->cpu_map[cpu], new_map);
+
 	}
 
+	rcu_assign_pointer(dev->xps_maps, new_dev_maps);
+
 	/* Cleanup old maps */
-	for_each_possible_cpu(cpu) {
-		map = dev_maps ?
-			xmap_dereference(dev_maps->cpu_map[cpu]) : NULL;
-		if (map && xmap_dereference(new_dev_maps->cpu_map[cpu]) != map)
-			kfree_rcu(map, rcu);
-		if (new_dev_maps->cpu_map[cpu])
-			nonempty = 1;
-	}
+	if (dev_maps) {
+		for_each_possible_cpu(cpu) {
+			new_map = xmap_dereference(new_dev_maps->cpu_map[cpu]);
+			map = xmap_dereference(dev_maps->cpu_map[cpu]);
+			if (map && map != new_map)
+				kfree_rcu(map, rcu);
+		}
 
-	if (nonempty) {
-		rcu_assign_pointer(dev->xps_maps, new_dev_maps);
-	} else {
-		kfree(new_dev_maps);
-		RCU_INIT_POINTER(dev->xps_maps, NULL);
+		kfree_rcu(dev_maps, rcu);
 	}
 
-	if (dev_maps)
-		kfree_rcu(dev_maps, rcu);
+	dev_maps = new_dev_maps;
+	active = true;
 
+out_no_new_maps:
+	/* update Tx queue numa node */
 	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
 				     (numa_node_id >= 0) ? numa_node_id :
 				     NUMA_NO_NODE);
 
+	if (!dev_maps)
+		goto out_no_maps;
+
+	/* removes queue from unused CPUs */
+	for_each_possible_cpu(cpu) {
+		if (cpumask_test_cpu(cpu, mask) && cpu_online(cpu))
+			continue;
+
+		if (remove_xps_queue(dev_maps, cpu, index))
+			active = true;
+	}
+
+	/* free map if not active */
+	if (!active) {
+		RCU_INIT_POINTER(dev->xps_maps, NULL);
+		kfree_rcu(dev_maps, rcu);
+	}
+
+out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 
 	return 0;
 error:
+	/* remove any maps that we added */
+	for_each_possible_cpu(cpu) {
+		new_map = xmap_dereference(new_dev_maps->cpu_map[cpu]);
+		map = dev_maps ? xmap_dereference(dev_maps->cpu_map[cpu]) :
+				 NULL;
+		if (new_map && new_map != map)
+			kfree(new_map);
+	}
+
 	mutex_unlock(&xps_map_mutex);
 
-	if (new_dev_maps)
-		for_each_possible_cpu(i)
-			kfree(rcu_dereference_protected(
-				new_dev_maps->cpu_map[i],
-				1));
 	kfree(new_dev_maps);
 	return -ENOMEM;
 }


* [PATCH v2 05/10] net: Add support for XPS without sysfs being defined
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (3 preceding siblings ...)
  2013-01-10 18:57 ` [PATCH v2 04/10] net: Rewrite netif_set_xps_queues to address several issues Alexander Duyck
@ 2013-01-10 18:57 ` Alexander Duyck
  2013-01-10 18:57 ` [RFC PATCH v2 06/10] ixgbe: Define FCoE and Flow director limits much sooner to allow for changes Alexander Duyck
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:57 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This patch makes it possible to support transmit packet steering (XPS)
without sysfs being enabled.  The reason for this change is to allow a
driver to make use of XPS even while the sysfs portion of the interface is
not present.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/netdevice.h |    1 -
 net/Kconfig               |    2 +-
 net/core/dev.c            |   26 ++++++++++++++++++++------
 net/core/net-sysfs.c      |   14 --------------
 4 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 59fe9da..aa7ad8a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2104,7 +2104,6 @@ static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 }
 
 #ifdef CONFIG_XPS
-extern void netif_reset_xps_queue(struct net_device *dev, u16 index);
 extern int netif_set_xps_queue(struct net_device *dev, struct cpumask *mask,
 			       u16 index);
 #else
diff --git a/net/Kconfig b/net/Kconfig
index 30b48f5..3cc5be0 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -232,7 +232,7 @@ config RFS_ACCEL
 
 config XPS
 	boolean
-	depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
+	depends on SMP && USE_GENERIC_SMP_HELPERS
 	default y
 
 config NETPRIO_CGROUP
diff --git a/net/core/dev.c b/net/core/dev.c
index 715d8d4..095d22f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1887,10 +1887,10 @@ static struct xps_map *remove_xps_queue(struct xps_dev_maps *dev_maps,
 	return map;
 }
 
-void netif_reset_xps_queue(struct net_device *dev, u16 index)
+static void netif_reset_xps_queues_gt(struct net_device *dev, u16 index)
 {
 	struct xps_dev_maps *dev_maps;
-	int cpu;
+	int cpu, i;
 	bool active = false;
 
 	mutex_lock(&xps_map_mutex);
@@ -1900,7 +1900,11 @@ void netif_reset_xps_queue(struct net_device *dev, u16 index)
 		goto out_no_maps;
 
 	for_each_possible_cpu(cpu) {
-		if (remove_xps_queue(dev_maps, cpu, index))
+		for (i = index; i < dev->num_tx_queues; i++) {
+			if (!remove_xps_queue(dev_maps, cpu, i))
+				break;
+		}
+		if (i == dev->num_tx_queues)
 			active = true;
 	}
 
@@ -1909,8 +1913,10 @@ void netif_reset_xps_queue(struct net_device *dev, u16 index)
 		kfree_rcu(dev_maps, rcu);
 	}
 
-	netdev_queue_numa_node_write(netdev_get_tx_queue(dev, index),
-				     NUMA_NO_NODE);
+	for (i = index; i < dev->num_tx_queues; i++)
+		netdev_queue_numa_node_write(netdev_get_tx_queue(dev, i),
+					     NUMA_NO_NODE);
+
 out_no_maps:
 	mutex_unlock(&xps_map_mutex);
 }
@@ -2096,8 +2102,12 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
 		if (dev->num_tc)
 			netif_setup_tc(dev, txq);
 
-		if (txq < dev->real_num_tx_queues)
+		if (txq < dev->real_num_tx_queues) {
 			qdisc_reset_all_tx_gt(dev, txq);
+#ifdef CONFIG_XPS
+			netif_reset_xps_queues_gt(dev, txq);
+#endif
+		}
 	}
 
 	dev->real_num_tx_queues = txq;
@@ -5899,6 +5909,10 @@ static void rollback_registered_many(struct list_head *head)
 
 		/* Remove entries from kobject tree */
 		netdev_unregister_kobject(dev);
+#ifdef CONFIG_XPS
+		/* Remove XPS queueing entries */
+		netif_reset_xps_queues_gt(dev, 0);
+#endif
 	}
 
 	synchronize_net();
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 5ad489d..a5b89a6 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1002,16 +1002,6 @@ static ssize_t show_xps_map(struct netdev_queue *queue,
 	return len;
 }
 
-static void xps_queue_release(struct netdev_queue *queue)
-{
-	struct net_device *dev = queue->dev;
-	unsigned long index;
-
-	index = get_netdev_queue_index(queue);
-
-	netif_reset_xps_queue(dev, index);
-}
-
 static ssize_t store_xps_map(struct netdev_queue *queue,
 		      struct netdev_queue_attribute *attribute,
 		      const char *buf, size_t len)
@@ -1058,10 +1048,6 @@ static void netdev_queue_release(struct kobject *kobj)
 {
 	struct netdev_queue *queue = to_netdev_queue(kobj);
 
-#ifdef CONFIG_XPS
-	xps_queue_release(queue);
-#endif
-
 	memset(kobj, 0, sizeof(*kobj));
 	dev_put(queue->dev);
 }


* [RFC PATCH v2 06/10] ixgbe: Define FCoE and Flow director limits much sooner to allow for changes
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (4 preceding siblings ...)
  2013-01-10 18:57 ` [PATCH v2 05/10] net: Add support for XPS without sysfs being defined Alexander Duyck
@ 2013-01-10 18:57 ` Alexander Duyck
  2013-01-10 18:58 ` [RFC PATCH v2 07/10] ixgbe: Add function for setting XPS queue mapping Alexander Duyck
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:57 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

Instead of adjusting the FCoE and Flow director limits based on the number
of CPUs we can define them much sooner.  This allows the user to come
through later and adjust them once we have updated the code to support the
set_channels ethtool operation.

I am still allowing for FCoE and RSS queues to be separated if the number
of queues is less than the number of CPUs.  This essentially treats the two
groupings as if they were two separate traffic classes.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  |    7 +------
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   12 ++++++++----
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index 8c74f73..4b1eeb6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -386,7 +386,6 @@ static bool ixgbe_set_dcb_sriov_queues(struct ixgbe_adapter *adapter)
 		fcoe = &adapter->ring_feature[RING_F_FCOE];
 
 		/* limit ourselves based on feature limits */
-		fcoe_i = min_t(u16, fcoe_i, num_online_cpus());
 		fcoe_i = min_t(u16, fcoe_i, fcoe->limit);
 
 		if (fcoe_i) {
@@ -562,9 +561,6 @@ static bool ixgbe_set_sriov_queues(struct ixgbe_adapter *adapter)
 		fcoe_i = min_t(u16, fcoe_i, fcoe->limit);
 
 		if (vmdq_i > 1 && fcoe_i) {
-			/* reserve no more than number of CPUs */
-			fcoe_i = min_t(u16, fcoe_i, num_online_cpus());
-
 			/* alloc queues for FCoE separately */
 			fcoe->indices = fcoe_i;
 			fcoe->offset = vmdq_i * rss_i;
@@ -623,8 +619,7 @@ static bool ixgbe_set_rss_queues(struct ixgbe_adapter *adapter)
 	if (rss_i > 1 && adapter->atr_sample_rate) {
 		f = &adapter->ring_feature[RING_F_FDIR];
 
-		f->indices = min_t(u16, num_online_cpus(), f->limit);
-		rss_i = max_t(u16, rss_i, f->indices);
+		rss_i = f->indices = f->limit;
 
 		if (!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE))
 			adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 20d6764..810f2b8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4466,7 +4466,7 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
-	unsigned int rss;
+	unsigned int rss, fdir;
 	u32 fwsm;
 #ifdef CONFIG_IXGBE_DCB
 	int j;
@@ -4502,8 +4502,8 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter)
 			adapter->flags2 |= IXGBE_FLAG2_TEMP_SENSOR_CAPABLE;
 		/* Flow Director hash filters enabled */
 		adapter->atr_sample_rate = 20;
-		adapter->ring_feature[RING_F_FDIR].limit =
-							 IXGBE_MAX_FDIR_INDICES;
+		fdir = min_t(int, IXGBE_MAX_FDIR_INDICES, num_online_cpus());
+		adapter->ring_feature[RING_F_FDIR].limit = fdir;
 		adapter->fdir_pballoc = IXGBE_FDIR_PBALLOC_64K;
 #ifdef IXGBE_FCOE
 		adapter->flags |= IXGBE_FLAG_FCOE_CAPABLE;
@@ -7410,13 +7410,17 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 #ifdef IXGBE_FCOE
 	if (adapter->flags & IXGBE_FLAG_FCOE_CAPABLE) {
+		unsigned int fcoe_l;
+
 		if (hw->mac.ops.get_device_caps) {
 			hw->mac.ops.get_device_caps(hw, &device_caps);
 			if (device_caps & IXGBE_DEVICE_CAPS_FCOE_OFFLOADS)
 				adapter->flags &= ~IXGBE_FLAG_FCOE_CAPABLE;
 		}
 
-		adapter->ring_feature[RING_F_FCOE].limit = IXGBE_FCRETA_SIZE;
+
+		fcoe_l = min_t(int, IXGBE_FCRETA_SIZE, num_online_cpus());
+		adapter->ring_feature[RING_F_FCOE].limit = fcoe_l;
 
 		netdev->features |= NETIF_F_FSO |
 				    NETIF_F_FCOE_CRC;


* [RFC PATCH v2 07/10] ixgbe: Add function for setting XPS queue mapping
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (5 preceding siblings ...)
  2013-01-10 18:57 ` [RFC PATCH v2 06/10] ixgbe: Define FCoE and Flow director limits much sooner to allow for changes Alexander Duyck
@ 2013-01-10 18:58 ` Alexander Duyck
  2013-01-10 18:58 ` [RFC PATCH v2 08/10] ixgbe: Update ixgbe driver to use __netdev_pick_tx in ixgbe_select_queue Alexander Duyck
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:58 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This change adds support for ixgbe to configure the XPS queue mapping on
load.  The result of this change is that on open we will now reset the
number of Tx queues and then set the default XPS configuration based on
whether ATR is enabled or disabled.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c  |   14 +++++++++-----
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   10 ++++++++++
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 8e78676..590f042 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -195,6 +195,7 @@ struct ixgbe_rx_queue_stats {
 
 enum ixgbe_ring_state_t {
 	__IXGBE_TX_FDIR_INIT_DONE,
+	__IXGBE_TX_XPS_INIT_DONE,
 	__IXGBE_TX_DETECT_HANG,
 	__IXGBE_HANG_CHECK_ARMED,
 	__IXGBE_RX_RSC_ENABLED,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index 4b1eeb6..ab01662 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -771,19 +771,23 @@ static int ixgbe_alloc_q_vector(struct ixgbe_adapter *adapter,
 {
 	struct ixgbe_q_vector *q_vector;
 	struct ixgbe_ring *ring;
-	int node = -1;
+	int node = NUMA_NO_NODE;
 	int cpu = -1;
 	int ring_count, size;
+	u8 tcs = netdev_get_num_tc(adapter->netdev);
 
 	ring_count = txr_count + rxr_count;
 	size = sizeof(struct ixgbe_q_vector) +
 	       (sizeof(struct ixgbe_ring) * ring_count);
 
 	/* customize cpu for Flow Director mapping */
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
-		if (cpu_online(v_idx)) {
-			cpu = v_idx;
-			node = cpu_to_node(cpu);
+	if ((tcs <= 1) && !(adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)) {
+		u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
+		if (rss_i > 1 && adapter->atr_sample_rate) {
+			if (cpu_online(v_idx)) {
+				cpu = v_idx;
+				node = cpu_to_node(cpu);
+			}
 		}
 	}
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 810f2b8..8ca9794 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2814,6 +2814,16 @@ void ixgbe_configure_tx_ring(struct ixgbe_adapter *adapter,
 		ring->atr_sample_rate = 0;
 	}
 
+	/* initialize XPS */
+	if (!test_and_set_bit(__IXGBE_TX_XPS_INIT_DONE, &ring->state)) {
+		struct ixgbe_q_vector *q_vector = ring->q_vector;
+
+		if (q_vector)
+			netif_set_xps_queue(adapter->netdev,
+					    &q_vector->affinity_mask,
+					    ring->queue_index);
+	}
+
 	clear_bit(__IXGBE_HANG_CHECK_ARMED, &ring->state);
 
 	/* enable queue */


* [RFC PATCH v2 08/10] ixgbe: Update ixgbe driver to use __netdev_pick_tx in ixgbe_select_queue
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (6 preceding siblings ...)
  2013-01-10 18:58 ` [RFC PATCH v2 07/10] ixgbe: Add function for setting XPS queue mapping Alexander Duyck
@ 2013-01-10 18:58 ` Alexander Duyck
  2013-01-10 18:58 ` [RFC PATCH v2 09/10] ixgbe: Add support for displaying the number of Tx/Rx channels Alexander Duyck
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:58 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This change updates the ixgbe driver to use __netdev_pick_tx instead of
its own logic for selecting a queue.  The main result of this change is
that ixgbe can now fully support XPS, and for configurations without FCoE
enabled we no longer need our own ndo_select_queue.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   48 ++++++++++++++-----------
 1 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 8ca9794..4c0bcd8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6363,38 +6363,40 @@ static inline int ixgbe_maybe_stop_tx(struct ixgbe_ring *tx_ring, u16 size)
 	return __ixgbe_maybe_stop_tx(tx_ring, size);
 }
 
+#ifdef IXGBE_FCOE
 static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb)
 {
-	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
-					       smp_processor_id();
-#ifdef IXGBE_FCOE
-	__be16 protocol = vlan_get_protocol(skb);
+	struct ixgbe_adapter *adapter;
+	struct ixgbe_ring_feature *f;
+	int txq;
 
-	if (((protocol == htons(ETH_P_FCOE)) ||
-	    (protocol == htons(ETH_P_FIP))) &&
-	    (adapter->flags & IXGBE_FLAG_FCOE_ENABLED)) {
-		struct ixgbe_ring_feature *f;
+	/*
+	 * only execute the code below if protocol is FCoE
+	 * or FIP and we have FCoE enabled on the adapter
+	 */
+	switch (vlan_get_protocol(skb)) {
+	case __constant_htons(ETH_P_FCOE):
+	case __constant_htons(ETH_P_FIP):
+		adapter = netdev_priv(dev);
 
-		f = &adapter->ring_feature[RING_F_FCOE];
+		if (adapter->flags & IXGBE_FLAG_FCOE_ENABLED)
+			break;
+	default:
+		return __netdev_pick_tx(dev, skb);
+	}
 
-		while (txq >= f->indices)
-			txq -= f->indices;
-		txq += adapter->ring_feature[RING_F_FCOE].offset;
+	f = &adapter->ring_feature[RING_F_FCOE];
 
-		return txq;
-	}
-#endif
+	txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
+					   smp_processor_id();
 
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
-		while (unlikely(txq >= dev->real_num_tx_queues))
-			txq -= dev->real_num_tx_queues;
-		return txq;
-	}
+	while (txq >= f->indices)
+		txq -= f->indices;
 
-	return skb_tx_hash(dev, skb);
+	return txq + f->offset;
 }
 
+#endif
 netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 			  struct ixgbe_adapter *adapter,
 			  struct ixgbe_ring *tx_ring)
@@ -7092,7 +7094,9 @@ static const struct net_device_ops ixgbe_netdev_ops = {
 	.ndo_open		= ixgbe_open,
 	.ndo_stop		= ixgbe_close,
 	.ndo_start_xmit		= ixgbe_xmit_frame,
+#ifdef IXGBE_FCOE
 	.ndo_select_queue	= ixgbe_select_queue,
+#endif
 	.ndo_set_rx_mode	= ixgbe_set_rx_mode,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_set_mac_address	= ixgbe_set_mac,


* [RFC PATCH v2 09/10] ixgbe: Add support for displaying the number of Tx/Rx channels
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (7 preceding siblings ...)
  2013-01-10 18:58 ` [RFC PATCH v2 08/10] ixgbe: Update ixgbe driver to use __netdev_pick_tx in ixgbe_select_queue Alexander Duyck
@ 2013-01-10 18:58 ` Alexander Duyck
  2013-01-10 18:58 ` [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation Alexander Duyck
  2013-01-11  6:47 ` [PATCH v2 00/10] Make XPS usable within ixgbe David Miller
  10 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:58 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This patch adds support for the ethtool get_channels operation.

Since the ixgbe driver has to support DCB as well as the other modes the
assumption I made here is that the number of channels in DCB modes refers
to the number of queues per traffic class, not the number of queues total.
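
For reference, this is what "ethtool -l <dev>" queries from user space; a
rough stand-alone sketch of the raw ETHTOOL_GCHANNELS ioctl (the interface
name is a placeholder):

	#include <stdio.h>
	#include <string.h>
	#include <sys/ioctl.h>
	#include <sys/socket.h>
	#include <net/if.h>
	#include <linux/ethtool.h>
	#include <linux/sockios.h>

	int main(void)
	{
		struct ethtool_channels ch = { .cmd = ETHTOOL_GCHANNELS };
		struct ifreq ifr;
		int fd = socket(AF_INET, SOCK_DGRAM, 0);

		memset(&ifr, 0, sizeof(ifr));
		strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);	/* placeholder */
		ifr.ifr_data = (void *)&ch;

		if (fd < 0 || ioctl(fd, SIOCETHTOOL, &ifr) < 0) {
			perror("ETHTOOL_GCHANNELS");
			return 1;
		}

		printf("combined %u/%u, other %u\n",
		       ch.combined_count, ch.max_combined, ch.other_count);
		return 0;
	}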

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |   72 ++++++++++++++++++++++
 1 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 3268584..688effc 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -2704,6 +2704,77 @@ static int ixgbe_get_ts_info(struct net_device *dev,
 	return 0;
 }
 
+static unsigned int ixgbe_max_channels(struct ixgbe_adapter *adapter)
+{
+	unsigned int max_combined;
+	u8 tcs = netdev_get_num_tc(adapter->netdev);
+
+	if (!(adapter->flags & IXGBE_FLAG_MSIX_ENABLED)) {
+		/* We only support one q_vector without MSI-X */
+		max_combined = 1;
+	} else if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
+		/* SR-IOV currently only allows one queue on the PF */
+		max_combined = 1;
+	} else if (tcs > 1) {
+		/* For DCB report channels per traffic class */
+		if (adapter->hw.mac.type == ixgbe_mac_82598EB) {
+			/* 8 TC w/ 4 queues per TC */
+			max_combined = 4;
+		} else if (tcs > 4) {
+			/* 8 TC w/ 8 queues per TC */
+			max_combined = 8;
+		} else {
+			/* 4 TC w/ 16 queues per TC */
+			max_combined = 16;
+		}
+	} else if (adapter->atr_sample_rate) {
+		/* support up to 64 queues with ATR */
+		max_combined = IXGBE_MAX_FDIR_INDICES;
+	} else {
+		/* support up to 16 queues with RSS */
+		max_combined = IXGBE_MAX_RSS_INDICES;
+	}
+
+	return max_combined;
+}
+
+static void ixgbe_get_channels(struct net_device *dev,
+			       struct ethtool_channels *ch)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+
+	/* report maximum channels */
+	ch->max_combined = ixgbe_max_channels(adapter);
+
+	/* report info for other vector */
+	if (adapter->flags & IXGBE_FLAG_MSIX_ENABLED) {
+		ch->max_other = NON_Q_VECTORS;
+		ch->other_count = NON_Q_VECTORS;
+	}
+
+	/* record RSS queues */
+	ch->combined_count = adapter->ring_feature[RING_F_RSS].indices;
+
+	/* nothing else to report if RSS is disabled */
+	if (ch->combined_count == 1)
+		return;
+
+	/* we do not support ATR queueing if SR-IOV is enabled */
+	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
+		return;
+
+	/* same thing goes for being DCB enabled */
+	if (netdev_get_num_tc(dev) > 1)
+		return;
+
+	/* if ATR is disabled we can exit */
+	if (!adapter->atr_sample_rate)
+		return;
+
+	/* report flow director queues as maximum channels */
+	ch->combined_count = adapter->ring_feature[RING_F_FDIR].indices;
+}
+
 static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_settings           = ixgbe_get_settings,
 	.set_settings           = ixgbe_set_settings,
@@ -2733,6 +2804,7 @@ static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_rxnfc		= ixgbe_get_rxnfc,
 	.set_rxnfc		= ixgbe_set_rxnfc,
 	.get_ts_info		= ixgbe_get_ts_info,
+	.get_channels		= ixgbe_get_channels,
 };
 
 void ixgbe_set_ethtool_ops(struct net_device *netdev)


* [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (8 preceding siblings ...)
  2013-01-10 18:58 ` [RFC PATCH v2 09/10] ixgbe: Add support for displaying the number of Tx/Rx channels Alexander Duyck
@ 2013-01-10 18:58 ` Alexander Duyck
  2013-01-16 16:19   ` Ben Hutchings
  2013-01-11  6:47 ` [PATCH v2 00/10] Make XPS usable within ixgbe David Miller
  10 siblings, 1 reply; 18+ messages in thread
From: Alexander Duyck @ 2013-01-10 18:58 UTC (permalink / raw)
  To: netdev; +Cc: bhutchings, therbert, ycai, eric.dumazet, davem

This change adds support for the ethtool set_channels operation.

Since the ixgbe driver has to support DCB as well as the other modes the
assumption I made here is that the number of channels in DCB modes refers
to the number of queues per traffic class, not the number of queues total.
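
The user space side mirrors the query sketch under patch 09; below is a
hedged example of requesting a combined channel count via ETHTOOL_SCHANNELS
(the helper and its arguments are placeholders):

	#include <string.h>
	#include <sys/ioctl.h>
	#include <net/if.h>
	#include <linux/ethtool.h>
	#include <linux/sockios.h>

	/* hypothetical helper; fd is an already-open AF_INET datagram socket */
	static int set_combined_channels(int fd, const char *ifname,
					 unsigned int count)
	{
		struct ethtool_channels ch = { .cmd = ETHTOOL_SCHANNELS };
		struct ifreq ifr;

		/* ixgbe only accepts combined channels, so rx_count and
		 * tx_count are left at zero; other_count is ignored by this
		 * version of the patch
		 */
		ch.combined_count = count;

		memset(&ifr, 0, sizeof(ifr));
		strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
		ifr.ifr_data = (void *)&ch;

		return ioctl(fd, SIOCETHTOOL, &ifr);
	}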

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |    2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |   40 ++++++++++++++++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |    7 +++-
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 590f042..a431628 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -695,8 +695,8 @@ extern bool ixgbe_verify_lesm_fw_enabled_82599(struct ixgbe_hw *hw);
 extern void ixgbe_set_rx_mode(struct net_device *netdev);
 #ifdef CONFIG_IXGBE_DCB
 extern void ixgbe_set_rx_drop_en(struct ixgbe_adapter *adapter);
-extern int ixgbe_setup_tc(struct net_device *dev, u8 tc);
 #endif
+extern int ixgbe_setup_tc(struct net_device *dev, u8 tc);
 extern void ixgbe_tx_ctxtdesc(struct ixgbe_ring *, u32, u32, u32, u32);
 extern void ixgbe_do_reset(struct net_device *netdev);
 #ifdef CONFIG_IXGBE_HWMON
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 688effc..31998a3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -2775,6 +2775,45 @@ static void ixgbe_get_channels(struct net_device *dev,
 	ch->combined_count = adapter->ring_feature[RING_F_FDIR].indices;
 }
 
+static int ixgbe_set_channels(struct net_device *dev,
+			      struct ethtool_channels *ch)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+	unsigned int count = ch->combined_count;
+
+	/* verify they are not requesting separate vectors */
+	if (ch->rx_count || ch->tx_count)
+		return -EINVAL;
+
+	/* ignore other_count since it is not changeable */
+
+	/* verify we have at least one channel requested */
+	if (!count)
+		return -EINVAL;
+
+	/* verify the number of channels does not exceed hardware limits */
+	if (count > ixgbe_max_channels(adapter))
+		return -EINVAL;
+
+	/* update feature limits from largest to smallest supported values */
+	adapter->ring_feature[RING_F_FDIR].limit = count;
+
+	/* cap RSS limit at 16 */
+	if (count > IXGBE_MAX_RSS_INDICES)
+		count = IXGBE_MAX_RSS_INDICES;
+	adapter->ring_feature[RING_F_RSS].limit = count;
+
+#ifdef IXGBE_FCOE
+	/* cap FCoE limit at 8 */
+	if (count > IXGBE_FCRETA_SIZE)
+		count = IXGBE_FCRETA_SIZE;
+	adapter->ring_feature[RING_F_FCOE].limit = count;
+
+#endif
+	/* use setup TC to update any traffic class queue mapping */
+	return ixgbe_setup_tc(dev, netdev_get_num_tc(dev));
+}
+
 static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.get_settings           = ixgbe_get_settings,
 	.set_settings           = ixgbe_set_settings,
@@ -2805,6 +2844,7 @@ static const struct ethtool_ops ixgbe_ethtool_ops = {
 	.set_rxnfc		= ixgbe_set_rxnfc,
 	.get_ts_info		= ixgbe_get_ts_info,
 	.get_channels		= ixgbe_get_channels,
+	.set_channels		= ixgbe_set_channels,
 };
 
 void ixgbe_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 4c0bcd8..798bbf6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6796,6 +6796,7 @@ static void ixgbe_set_prio_tc_map(struct ixgbe_adapter *adapter)
 	}
 }
 
+#endif /* CONFIG_IXGBE_DCB */
 /**
  * ixgbe_setup_tc - configure net_device for multiple traffic classes
  *
@@ -6821,6 +6822,7 @@ int ixgbe_setup_tc(struct net_device *dev, u8 tc)
 		ixgbe_close(dev);
 	ixgbe_clear_interrupt_scheme(adapter);
 
+#ifdef CONFIG_IXGBE_DCB
 	if (tc) {
 		netdev_set_num_tc(dev, tc);
 		ixgbe_set_prio_tc_map(adapter);
@@ -6843,15 +6845,16 @@ int ixgbe_setup_tc(struct net_device *dev, u8 tc)
 		adapter->dcb_cfg.pfc_mode_enable = false;
 	}
 
-	ixgbe_init_interrupt_scheme(adapter);
 	ixgbe_validate_rtr(adapter, tc);
+
+#endif /* CONFIG_IXGBE_DCB */
+	ixgbe_init_interrupt_scheme(adapter);
 	if (netif_running(dev))
 		ixgbe_open(dev);
 
 	return 0;
 }
 
-#endif /* CONFIG_IXGBE_DCB */
 void ixgbe_do_reset(struct net_device *netdev)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);


* Re: [PATCH v2 00/10] Make XPS usable within ixgbe
  2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
                   ` (9 preceding siblings ...)
  2013-01-10 18:58 ` [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation Alexander Duyck
@ 2013-01-11  6:47 ` David Miller
  10 siblings, 0 replies; 18+ messages in thread
From: David Miller @ 2013-01-11  6:47 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev, bhutchings, therbert, ycai, eric.dumazet

From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Thu, 10 Jan 2013 10:56:03 -0800

> The following patch series makes it so that the ixgbe driver can support
> ATR even when the number of queues is less than the number of CPUs.  To do
> this I have updated the kernel to let drivers set their own XPS
> configuration, which required moving the XPS map handling out of the
> sysfs-specific code and into the core dev code.
> 
> It has been almost 6 months since the last time I released these patches.  I
> have been meaning to get back to them but I just haven't had a chance to get
> this work done until recently.
> 
> Only the core net patches are meant for acceptance at this time.  I will
> submit the ixgbe patches through Jeff Kirsher's tree.  For this reason I have
> tagged the ixgbe related patches as RFC.

Patches 1-5 look fine, all applied, thanks.


* Re: [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-10 18:58 ` [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation Alexander Duyck
@ 2013-01-16 16:19   ` Ben Hutchings
  2013-01-16 16:30     ` Waskiewicz Jr, Peter P
  2013-01-17  0:35     ` Alexander Duyck
  0 siblings, 2 replies; 18+ messages in thread
From: Ben Hutchings @ 2013-01-16 16:19 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, therbert, ycai, eric.dumazet, davem

On Thu, 2013-01-10 at 10:58 -0800, Alexander Duyck wrote:
> This change adds support for the ethtool set_channels operation.
> 
> Since the ixgbe driver has to support DCB as well as the other modes the
> assumption I made here is that the number of channels in DCB modes refers
> to the number of queues per traffic class, not the number of queues total.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

In DCB mode are there separate IRQs for the different classes?

[...]
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> @@ -2775,6 +2775,45 @@ static void ixgbe_get_channels(struct net_device *dev,
>  	ch->combined_count = adapter->ring_feature[RING_F_FDIR].indices;
>  }
>  
> +static int ixgbe_set_channels(struct net_device *dev,
> +			      struct ethtool_channels *ch)
> +{
> +	struct ixgbe_adapter *adapter = netdev_priv(dev);
> +	unsigned int count = ch->combined_count;
> +
> +	/* verify they are not requesting separate vectors */
> +	if (ch->rx_count || ch->tx_count)
> +		return -EINVAL;
> +
> +	/* ignore other_count since it is not changeable */
[...]

Please do return an error if the command specifies a change to
other_count.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-16 16:19   ` Ben Hutchings
@ 2013-01-16 16:30     ` Waskiewicz Jr, Peter P
  2013-01-23 21:20       ` Ben Hutchings
  2013-01-17  0:35     ` Alexander Duyck
  1 sibling, 1 reply; 18+ messages in thread
From: Waskiewicz Jr, Peter P @ 2013-01-16 16:30 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Duyck, Alexander H, netdev, therbert, ycai, eric.dumazet, davem

On Wed, 2013-01-16 at 16:19 +0000, Ben Hutchings wrote:
> On Thu, 2013-01-10 at 10:58 -0800, Alexander Duyck wrote:
> > This change adds support for the ethtool set_channels operation.
> > 
> > Since the ixgbe driver has to support DCB as well as the other modes the
> > assumption I made here is that the number of channels in DCB modes refers
> > to the number of queues per traffic class, not the number of queues total.
> > 
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> In DCB mode are there separate IRQs for the different classes?

Yes.  The Rx packet buffer is split into multiple packet buffers, one
for each online class.  After that, it's just queues assigned to the
packet buffers, and interrupts assigned however you want them to be.
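
Roughly, assuming the queues for each class are carved into contiguous
blocks (illustration only, not the exact hardware layout):

/* ring index for queue q of traffic class tc, with each class owning a
 * contiguous block of queues_per_tc rings; each ring then gets its own
 * MSI-X vector, so the classes land on separate IRQs
 */
static unsigned int example_ring_index(unsigned int tc,
				       unsigned int q,
				       unsigned int queues_per_tc)
{
	return tc * queues_per_tc + q;
}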

Cheers,
-PJ

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-16 16:19   ` Ben Hutchings
  2013-01-16 16:30     ` Waskiewicz Jr, Peter P
@ 2013-01-17  0:35     ` Alexander Duyck
  1 sibling, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2013-01-17  0:35 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, therbert, ycai, eric.dumazet, davem

On 01/16/2013 08:19 AM, Ben Hutchings wrote:
> On Thu, 2013-01-10 at 10:58 -0800, Alexander Duyck wrote:
>> This change adds support for the ethtool set_channels operation.
>>
>> Since the ixgbe driver has to support DCB as well as the other modes the
>> assumption I made here is that the number of channels in DCB modes refers
>> to the number of queues per traffic class, not the number of queues total.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> In DCB mode are there separate IRQs for the different classes?
> 
> [...]
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
>> @@ -2775,6 +2775,45 @@ static void ixgbe_get_channels(struct net_device *dev,
>>  	ch->combined_count = adapter->ring_feature[RING_F_FDIR].indices;
>>  }
>>  
>> +static int ixgbe_set_channels(struct net_device *dev,
>> +			      struct ethtool_channels *ch)
>> +{
>> +	struct ixgbe_adapter *adapter = netdev_priv(dev);
>> +	unsigned int count = ch->combined_count;
>> +
>> +	/* verify they are not requesting separate vectors */
>> +	if (ch->rx_count || ch->tx_count)
>> +		return -EINVAL;
>> +
>> +	/* ignore other_count since it is not changeable */
> [...]
> 
> Please do return an error if the command specifies a change to
> other_count.
> 
> Ben.
> 

I will update the patch to return an error if other_count is not equal
to NON_Q_VECTORS.
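
Roughly, this additional check next to the existing rx_count/tx_count
test (sketch only; NON_Q_VECTORS is the driver's count of non-queue
vectors):

	/* verify other_count has not been changed */
	if (ch->other_count != NON_Q_VECTORS)
		return -EINVAL;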

Thanks,

Alex

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-16 16:30     ` Waskiewicz Jr, Peter P
@ 2013-01-23 21:20       ` Ben Hutchings
  2013-01-23 22:31         ` Alexander Duyck
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Hutchings @ 2013-01-23 21:20 UTC (permalink / raw)
  To: Waskiewicz Jr, Peter P
  Cc: Duyck, Alexander H, netdev, therbert, ycai, eric.dumazet, davem

On Wed, 2013-01-16 at 16:30 +0000, Waskiewicz Jr, Peter P wrote:
> On Wed, 2013-01-16 at 16:19 +0000, Ben Hutchings wrote:
> > On Thu, 2013-01-10 at 10:58 -0800, Alexander Duyck wrote:
> > > This change adds support for the ethtool set_channels operation.
> > > 
> > > Since the ixgbe driver has to support DCB as well as the other modes the
> > > assumption I made here is that the number of channels in DCB modes refers
> > > to the number of queues per traffic class, not the number of queues total.
> > > 
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> > 
> > In DCB mode are there separate IRQs for the different classes?
> 
> Yes.  The Rx packet buffer is split into multiple packet buffers, one
> for each online class.  After that, it's just queues assigned to the
> packet buffers, and interrupts assigned however you want them to be.

Right, I think we've been through this before.  And I can see how it
would be more useful for users to specify the number of RX queues per
priority level.  But that's not what was specified...

I'm afraid the 'channels' ethtool operations have turned into a mess...
I can't see how to get to a reasonable generic definition of what they
should do.
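
For reference, the structure behind these operations (as defined in the
ethtool UAPI header) only knows about RX-only, TX-only, 'other' and
combined channel counts; there is no notion of a traffic class in it:

struct ethtool_channels {
	__u32	cmd;
	__u32	max_rx;
	__u32	max_tx;
	__u32	max_other;
	__u32	max_combined;
	__u32	rx_count;
	__u32	tx_count;
	__u32	other_count;
	__u32	combined_count;
};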

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-23 21:20       ` Ben Hutchings
@ 2013-01-23 22:31         ` Alexander Duyck
  2013-01-23 23:48           ` John Fastabend
  0 siblings, 1 reply; 18+ messages in thread
From: Alexander Duyck @ 2013-01-23 22:31 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Waskiewicz Jr, Peter P, netdev, therbert, ycai, eric.dumazet, davem

On 01/23/2013 01:20 PM, Ben Hutchings wrote:
> On Wed, 2013-01-16 at 16:30 +0000, Waskiewicz Jr, Peter P wrote:
>> On Wed, 2013-01-16 at 16:19 +0000, Ben Hutchings wrote:
>>> On Thu, 2013-01-10 at 10:58 -0800, Alexander Duyck wrote:
>>>> This change adds support for the ethtool set_channels operation.
>>>>
>>>> Since the ixgbe driver has to support DCB as well as the other modes the
>>>> assumption I made here is that the number of channels in DCB modes refers
>>>> to the number of queues per traffic class, not the number of queues total.
>>>>
>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>> In DCB mode are there separate IRQs for the different classes?
>> Yes.  The Rx packet buffer is split into multiple packet buffers, one
>> for each online class.  After that, it's just queues assigned to the
>> packet buffers, and interrupts assigned however you want them to be.
> Right, I think we've been through this before.  And I can see how it
> would be more useful for users to specify number of RX queues per
> priority level.  But that's not what was specified...
>
> I'm afraid the 'channels' ethtool operations have turned into a mess...
> I can't see how to get to a reasonable generic definition of what they
> should do.
>
> Ben.

Actually it looks like most of the drivers (I looked at bnx2, bnx2x, tg3,
and qlcnic) are using the set_channels call in a similar way.  What they
end up doing is taking the value and plugging it into the TSS/RSS fields
in their private structures.  From what I can tell, bnx2x_setup_tc may do
exactly the same thing we are currently doing for DCB, since the
BNX2X_NUM_ETH_QUEUES value set via set_channels determines the number of
queues used per traffic class.  I would say the usage is actually pretty
consistent between bnx2x and ixgbe based on this, even if it isn't
exactly correct.

For now I would say all of the drivers are using the set_channels
operation to specify the number of Tx/Rx queues, or queue pairs, per
traffic class.  So for non-DCB NICs this means setting exactly that
number of queues, while for DCB-capable NICs it means num_tcs times the
specified number of queues.
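
In other words, as a rough sketch of that interpretation (not actual
driver code):

/* total queue pairs resulting from a 'combined' request under the
 * per-traffic-class interpretation described above
 */
static unsigned int example_total_queues(unsigned int combined_count,
					 unsigned int num_tc)
{
	/* non-DCB devices behave as if there were a single class */
	if (!num_tc)
		num_tc = 1;

	return combined_count * num_tc;
}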

Thanks,

Alex

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation
  2013-01-23 22:31         ` Alexander Duyck
@ 2013-01-23 23:48           ` John Fastabend
  0 siblings, 0 replies; 18+ messages in thread
From: John Fastabend @ 2013-01-23 23:48 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Ben Hutchings, Waskiewicz Jr, Peter P, netdev, therbert, ycai,
	eric.dumazet, davem

On 1/23/2013 2:31 PM, Alexander Duyck wrote:
> On 01/23/2013 01:20 PM, Ben Hutchings wrote:
>> On Wed, 2013-01-16 at 16:30 +0000, Waskiewicz Jr, Peter P wrote:
>>> On Wed, 2013-01-16 at 16:19 +0000, Ben Hutchings wrote:
>>>> On Thu, 2013-01-10 at 10:58 -0800, Alexander Duyck wrote:
>>>>> This change adds support for the ethtool set_channels operation.
>>>>>
>>>>> Since the ixgbe driver has to support DCB as well as the other modes the
>>>>> assumption I made here is that the number of channels in DCB modes refers
>>>>> to the number of queues per traffic class, not the number of queues total.
>>>>>
>>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>>> In DCB mode are there separate IRQs for the different classes?
>>> Yes.  The Rx packet buffer is split into multiple packet buffers, one
>>> for each online class.  After that, it's just queues assigned to the
>>> packet buffers, and interrupts assigned however you want them to be.
>> Right, I think we've been through this before.  And I can see how it
>> would be more useful for users to specify number of RX queues per
>> priority level.  But that's not what was specified...
>>
>> I'm afraid the 'channels' ethtool operations have turned into a mess...
>> I can't see how to get to a reasonable generic definition of what they
>> should do.
>>
>> Ben.
>
> Actually it looks like most of the drivers (I looked at bnx, bnx2x, tg3,
> and qlcnic) are using the set_queues call in a similar way.  What they
> end up doing is using the value and plugging it into their TSS/RSS
> fields in their private structures.  From what I can tell in
> bnx2x_setup_tc they may do exactly the same thing we are currently doing
> for DCB since they use the BNX2X_NUM_ETH_QUEUES value that they set in
> their set_channels call to set the number of queues they use per traffic
> class.  I would say the usage is actually pretty consistent between
> bnx2x and ixgbe based on this, even if it isn't exactly correct.
>
> For now I would say all of the drivers are using the set_channels
> operation to specify the number of Tx/Rx queues, or queue pairs per
> traffic class.  So for non-DCB NICs this means it is setting exactly
> that number of queues, and for DCB capable nics it means num_tcs times
> the specified number of queues.
>
> Thanks,
>
> Alex
>

Would it perhaps be cleaner to let set_channels set the absolute
number of tx/rx queue pairs regardless of the number of TCs?  Then
you could use the mqprio interface to divide those queues into
classes.

struct tc_mqprio_qopt {
         __u8    num_tc;
         __u8    prio_tc_map[TC_QOPT_BITMASK + 1];
         __u8    hw;
         __u16   count[TC_QOPT_MAX_QUEUE];
         __u16   offset[TC_QOPT_MAX_QUEUE];
};

The ndo_setup_tc op could pass the tc_mqprio_qopt structure to the
driver instead of just the number of TCs.  This would allow changing
the number of queues per class depending on the expected traffic type,
and set_channels would then be consistent regardless of the number of
TCs.
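
A rough sketch of what I have in mind (hypothetical; the current
ndo_setup_tc only takes the number of TCs), using the existing
netdev_set_num_tc()/netdev_set_tc_queue() helpers:

static int example_setup_tc(struct net_device *dev,
			    const struct tc_mqprio_qopt *qopt)
{
	unsigned int tc;

	if (netdev_set_num_tc(dev, qopt->num_tc))
		return -EINVAL;

	for (tc = 0; tc < qopt->num_tc; tc++) {
		/* reject classes that reach past the available queues */
		if (qopt->offset[tc] + qopt->count[tc] >
		    dev->real_num_tx_queues)
			return -EINVAL;

		netdev_set_tc_queue(dev, tc, qopt->count[tc],
				    qopt->offset[tc]);
	}

	return 0;
}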

Thanks,
John

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-01-23 23:48 UTC | newest]

Thread overview: 18+ messages
2013-01-10 18:56 [PATCH v2 00/10] Make XPS usable within ixgbe Alexander Duyck
2013-01-10 18:56 ` [PATCH v2 01/10] net: Split core bits of netdev_pick_tx into __netdev_pick_tx Alexander Duyck
2013-01-10 18:57 ` [PATCH v2 02/10] net: Add functions netif_reset_xps_queue and netif_set_xps_queue Alexander Duyck
2013-01-10 18:57 ` [PATCH v2 03/10] net: Rewrite netif_reset_xps_queue to allow for better code reuse Alexander Duyck
2013-01-10 18:57 ` [PATCH v2 04/10] net: Rewrite netif_set_xps_queues to address several issues Alexander Duyck
2013-01-10 18:57 ` [PATCH v2 05/10] net: Add support for XPS without sysfs being defined Alexander Duyck
2013-01-10 18:57 ` [RFC PATCH v2 06/10] ixgbe: Define FCoE and Flow director limits much sooner to allow for changes Alexander Duyck
2013-01-10 18:58 ` [RFC PATCH v2 07/10] ixgbe: Add function for setting XPS queue mapping Alexander Duyck
2013-01-10 18:58 ` [RFC PATCH v2 08/10] ixgbe: Update ixgbe driver to use __netdev_pick_tx in ixgbe_select_queue Alexander Duyck
2013-01-10 18:58 ` [RFC PATCH v2 09/10] ixgbe: Add support for displaying the number of Tx/Rx channels Alexander Duyck
2013-01-10 18:58 ` [RFC PATCH v2 10/10] ixgbe: Add support for set_channels ethtool operation Alexander Duyck
2013-01-16 16:19   ` Ben Hutchings
2013-01-16 16:30     ` Waskiewicz Jr, Peter P
2013-01-23 21:20       ` Ben Hutchings
2013-01-23 22:31         ` Alexander Duyck
2013-01-23 23:48           ` John Fastabend
2013-01-17  0:35     ` Alexander Duyck
2013-01-11  6:47 ` [PATCH v2 00/10] Make XPS usable within ixgbe David Miller
