* [RFC PATCH 0/2] net: setup XPS mapping for each online CPU
@ 2018-03-15 15:08 ` Paolo Abeni
  0 siblings, 0 replies; 23+ messages in thread
From: Paolo Abeni @ 2018-03-15 15:08 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jeff Kirsher, Eric Dumazet, intel-wired-lan,
	Alexander Duyck

Currently, most MQ netdevices set up the default XPS configuration by mapping
the first real_num_tx_queues queues 1:1 to CPUs; no mapping is created for
CPUs with an id greater than real_num_tx_queues, if any.

As a result, the xmit path for unconnected sockets on such cores experiences
significant overhead in netdev_pick_tx(), which needs to dissect each packet
and compute its hash.

Such a scenario is easily triggered, e.g. by a DNS server under heavy load, as
the user-space process is moved away from the CPUs serving the softirqs (note:
this is beneficial for overall DNS server performance).

This series introduces a helper to easily set up an XPS mapping for all
online CPUs, and uses it in the ixgbe driver, demonstrating a significant
performance improvement in the above scenario.

Paolo Abeni (2):
  net: introduce netif_set_xps()
  ixgbe: setup XPS via netif_set_xps()

 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 13 ++----
 include/linux/netdevice.h                        |  6 +++
 net/core/dev.c                                   | 58 ++++++++++++++++++------
 5 files changed, 55 insertions(+), 25 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH 1/2] net: introduce netif_set_xps()
  2018-03-15 15:08 ` [Intel-wired-lan] " Paolo Abeni
@ 2018-03-15 15:08   ` Paolo Abeni
  -1 siblings, 0 replies; 23+ messages in thread
From: Paolo Abeni @ 2018-03-15 15:08 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jeff Kirsher, Eric Dumazet, intel-wired-lan,
	Alexander Duyck

netif_set_xps() configures XPS on the given netdevice so that an
XPS mapping exists for each online CPU. Also, factor out an
unlocked version of netif_set_xps_queue() to allow configuring
all the netdev queues while acquiring the xps lock only once.

Netdevices can leverage this helper by replacing all the per-queue
calls to netif_set_xps_queue() with a single netif_set_xps() call.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/linux/netdevice.h |  6 +++++
 net/core/dev.c            | 58 +++++++++++++++++++++++++++++++++++------------
 2 files changed, 50 insertions(+), 14 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5fbb9f1da7fd..95727ccf0865 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3193,6 +3193,7 @@ static inline void netif_wake_subqueue(struct net_device *dev, u16 queue_index)
 #ifdef CONFIG_XPS
 int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 			u16 index);
+int netif_set_xps(struct net_device *dev);
 #else
 static inline int netif_set_xps_queue(struct net_device *dev,
 				      const struct cpumask *mask,
@@ -3200,6 +3201,11 @@ static inline int netif_set_xps_queue(struct net_device *dev,
 {
 	return 0;
 }
+
+static inline int netif_set_xps(struct net_device *dev)
+{
+	return 0;
+}
 #endif
 
 u16 __skb_tx_hash(const struct net_device *dev, struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index 12a9aad0b057..5a8d3d9ef9b4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2177,8 +2177,8 @@ static struct xps_map *expand_xps_map(struct xps_map *map,
 	return new_map;
 }
 
-int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
-			u16 index)
+int __netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
+			  u16 index)
 {
 	struct xps_dev_maps *dev_maps, *new_dev_maps = NULL;
 	int i, cpu, tci, numa_node_id = -2;
@@ -2197,18 +2197,14 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 	if (maps_sz < L1_CACHE_BYTES)
 		maps_sz = L1_CACHE_BYTES;
 
-	mutex_lock(&xps_map_mutex);
-
 	dev_maps = xmap_dereference(dev->xps_maps);
 
 	/* allocate memory for queue storage */
 	for_each_cpu_and(cpu, cpu_online_mask, mask) {
 		if (!new_dev_maps)
 			new_dev_maps = kzalloc(maps_sz, GFP_KERNEL);
-		if (!new_dev_maps) {
-			mutex_unlock(&xps_map_mutex);
+		if (!new_dev_maps)
 			return -ENOMEM;
-		}
 
 		tci = cpu * num_tc + tc;
 		map = dev_maps ? xmap_dereference(dev_maps->cpu_map[tci]) :
@@ -2295,7 +2291,7 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 				     NUMA_NO_NODE);
 
 	if (!dev_maps)
-		goto out_no_maps;
+		return 0;
 
 	/* removes queue from unused CPUs */
 	for_each_possible_cpu(cpu) {
@@ -2312,11 +2308,8 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 		RCU_INIT_POINTER(dev->xps_maps, NULL);
 		kfree_rcu(dev_maps, rcu);
 	}
-
-out_no_maps:
-	mutex_unlock(&xps_map_mutex);
-
 	return 0;
+
 error:
 	/* remove any maps that we added */
 	for_each_possible_cpu(cpu) {
@@ -2330,13 +2323,50 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 		}
 	}
 
-	mutex_unlock(&xps_map_mutex);
-
 	kfree(new_dev_maps);
 	return -ENOMEM;
 }
+
+int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
+			u16 index)
+{
+	int ret;
+
+	mutex_lock(&xps_map_mutex);
+	ret = __netif_set_xps_queue(dev, mask, index);
+	mutex_unlock(&xps_map_mutex);
+	return ret;
+}
 EXPORT_SYMBOL(netif_set_xps_queue);
 
+int netif_set_xps(struct net_device *dev)
+{
+	cpumask_var_t queuemask;
+	int cpu, queue, err = 0;
+
+	if (!alloc_cpumask_var(&queuemask, GFP_KERNEL))
+		return -ENOMEM;
+
+	mutex_lock(&xps_map_mutex);
+	for (queue = 0; queue < dev->real_num_tx_queues; ++queue) {
+		cpumask_clear(queuemask);
+		for (cpu = queue; cpu < nr_cpu_ids;
+		     cpu += dev->real_num_tx_queues)
+			cpumask_set_cpu(cpu, queuemask);
+
+		err = __netif_set_xps_queue(dev, queuemask, queue);
+		if (err)
+			goto out;
+	}
+
+out:
+	mutex_unlock(&xps_map_mutex);
+
+	free_cpumask_var(queuemask);
+	return err;
+}
+EXPORT_SYMBOL(netif_set_xps);
+
 #endif
 void netdev_reset_tc(struct net_device *dev)
 {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 2/2] ixgbe: setup XPS via netif_set_xps()
  2018-03-15 15:08 ` [Intel-wired-lan] " Paolo Abeni
@ 2018-03-15 15:08   ` Paolo Abeni
  -1 siblings, 0 replies; 23+ messages in thread
From: Paolo Abeni @ 2018-03-15 15:08 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jeff Kirsher, Eric Dumazet, intel-wired-lan,
	Alexander Duyck

Before this commit, ixgbe with the default settings lacks an XPS mapping
for CPUs with an id greater than the number of tx queues.

As a consequence, the xmit path for such CPUs experiences a significant
cost in __netdev_pick_tx(), mainly due to skb_tx_hash(), as reported by
the perf tool:

7.55%--netdev_pick_tx
        |
        --6.92%--__netdev_pick_tx
                  |
                  --6.35%--__skb_tx_hash
                            |
                            --5.94%--__skb_get_hash
                                      |
                                      --3.22%--__skb_flow_dissect

in the following scenario:

ethtool -L em1 combined 1
taskset 2 netperf -H 192.168.1.1 -t UDP_STREAM -- -m 1
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992       1   10.00     11497225      0       9.20

After this commit the perf tool reports:

0.85%--__netdev_pick_tx

and netperf reports:

MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992       1   10.00     12736058      0      10.19

roughly a +10% improvement in xmit throughput.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 13 +++----------
 3 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index c1e3a0039ea5..04aaecce81d2 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -293,7 +293,6 @@ enum ixgbe_ring_state_t {
 	__IXGBE_RX_CSUM_UDP_ZERO_ERR,
 	__IXGBE_RX_FCOE,
 	__IXGBE_TX_FDIR_INIT_DONE,
-	__IXGBE_TX_XPS_INIT_DONE,
 	__IXGBE_TX_DETECT_HANG,
 	__IXGBE_HANG_CHECK_ARMED,
 	__IXGBE_TX_XDP_RING,
@@ -827,6 +826,7 @@ enum ixgbe_state_t {
 	__IXGBE_PTP_RUNNING,
 	__IXGBE_PTP_TX_IN_PROGRESS,
 	__IXGBE_RESET_REQUESTED,
+	__IXGBE_TX_XPS_INIT_DONE,
 };
 
 struct ixgbe_cb {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index c0e6ab42e0e1..da4e6416e8eb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -3224,6 +3224,7 @@ static int ixgbe_set_channels(struct net_device *dev,
 
 #endif
 	/* use setup TC to update any traffic class queue mapping */
+	clear_bit(__IXGBE_TX_XPS_INIT_DONE, &adapter->state);
 	return ixgbe_setup_tc(dev, adapter->hw_tcs);
 }
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 85369423452d..5bd45fc737fa 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3566,16 +3566,6 @@ void ixgbe_configure_tx_ring(struct ixgbe_adapter *adapter,
 		ring->atr_sample_rate = 0;
 	}
 
-	/* initialize XPS */
-	if (!test_and_set_bit(__IXGBE_TX_XPS_INIT_DONE, &ring->state)) {
-		struct ixgbe_q_vector *q_vector = ring->q_vector;
-
-		if (q_vector)
-			netif_set_xps_queue(ring->netdev,
-					    &q_vector->affinity_mask,
-					    ring->queue_index);
-	}
-
 	clear_bit(__IXGBE_HANG_CHECK_ARMED, &ring->state);
 
 	/* reinitialize tx_buffer_info */
@@ -5626,6 +5616,9 @@ static void ixgbe_up_complete(struct ixgbe_adapter *adapter)
 	int err;
 	u32 ctrl_ext;
 
+	if (!test_and_set_bit(__IXGBE_TX_XPS_INIT_DONE, &adapter->state))
+		netif_set_xps(adapter->netdev);
+
 	ixgbe_get_hw_control(adapter);
 	ixgbe_setup_gpie(adapter);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/2] net: setup XPS mapping for each online CPU
  2018-03-15 15:08 ` [Intel-wired-lan] " Paolo Abeni
@ 2018-03-15 15:31   ` Eric Dumazet
  -1 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2018-03-15 15:31 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David Miller, Jeff Kirsher, intel-wired-lan, Alexander Duyck

On Thu, Mar 15, 2018 at 8:08 AM Paolo Abeni <pabeni@redhat.com> wrote:

> Currently, most MQ netdevice setup the default XPS configuration mapping 1-1
> the first real_num_tx_queues queues and CPUs and no mapping is created for
> the CPUs with id greater then real_num_tx_queues, if any.
>
> As a result, the xmit path for unconnected sockets on such cores experiences a
> relevant overhead in netdev_pick_tx(), which needs to dissect each packet and
> compute its hash.
>
> Such scenario is easily triggered. e.g. from DNS server under relevant load, as
> the user-space process is moved away from the CPUs serving the softirqs (note:
> this is beneficial for the overall DNS server performances).
>
> This series introduces an helper to easily setup up XPS mapping for all the
> online CPUs, and use it in the ixgbe driver, demonstrating a relevant
> performance improvement in the above scenario.
>
> Paolo Abeni (2):
>    net: introduce netif_set_xps()
>    ixgbe: setup XPS via netif_set_xps()


Resent, not HTML this time, sorry for duplication.

I truly believe XPS should not be set up by devices.

XPS is policy, and policy does belong to user space.

User space knows that CPUs 2 and 3 (pure examples) are reserved for storage
interrupts, not NIC ones.

Note that if XPS is not setup, MQ queue selection is just fine by default ;)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/2] net: setup XPS mapping for each online CPU
  2018-03-15 15:31   ` [Intel-wired-lan] " Eric Dumazet
@ 2018-03-15 15:51     ` Paolo Abeni
  -1 siblings, 0 replies; 23+ messages in thread
From: Paolo Abeni @ 2018-03-15 15:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, David Miller, Jeff Kirsher, intel-wired-lan, Alexander Duyck

Hi,

On Thu, 2018-03-15 at 15:31 +0000, Eric Dumazet wrote:
> On Thu, Mar 15, 2018 at 8:08 AM Paolo Abeni <pabeni@redhat.com> wrote:
> 
> > Currently, most MQ netdevice setup the default XPS configuration mapping 1-1
> > the first real_num_tx_queues queues and CPUs and no mapping is created for
> > the CPUs with id greater then real_num_tx_queues, if any.
> >
> > As a result, the xmit path for unconnected sockets on such cores experiences a
> > relevant overhead in netdev_pick_tx(), which needs to dissect each packet and
> > compute its hash.
> >
> > Such scenario is easily triggered. e.g. from DNS server under relevant load, as
> > the user-space process is moved away from the CPUs serving the softirqs (note:
> > this is beneficial for the overall DNS server performances).
> >
> > This series introduces an helper to easily setup up XPS mapping for all the
> > online CPUs, and use it in the ixgbe driver, demonstrating a relevant
> > performance improvement in the above scenario.
> >
> > Paolo Abeni (2):
> >    net: introduce netif_set_xps()
> >    ixgbe: setup XPS via netif_set_xps()
> 
> Resent, not HTML this time, sorry for duplication.
> 
> I truly believe XPS should not be setup by devices.
> 
> XPS is policy, and policy does belong to user space.

Thank you for your comments!

As a general principle, I agree policies should be in user-space, but I
also think that the kernel should provide a reasonable default. Many MQ
devices already configure XPS, and their default is, AFAICS, sub-optimal.

> Note that if XPS is not setup, MQ queue selection is just fine by default ;

I'm sorry, I do not follow. AFAICS, with unconnected sockets and without XPS
we always hit the netdev_pick_tx()/skb_tx_hash()/skb_flow_dissect()
overhead in the xmit path.

Cheers,

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/2] net: setup XPS mapping for each online CPU
  2018-03-15 15:51     ` [Intel-wired-lan] " Paolo Abeni
@ 2018-03-15 15:59       ` Eric Dumazet
  -1 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2018-03-15 15:59 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David Miller, Jeff Kirsher, intel-wired-lan, Alexander Duyck

On Thu, Mar 15, 2018 at 8:51 AM Paolo Abeni <pabeni@redhat.com> wrote:

> Hi,
>
> On Thu, 2018-03-15 at 15:31 +0000, Eric Dumazet wrote:
> > On Thu, Mar 15, 2018 at 8:08 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > > Currently, most MQ netdevice setup the default XPS configuration mapping 1-1
> > > the first real_num_tx_queues queues and CPUs and no mapping is created for
> > > the CPUs with id greater then real_num_tx_queues, if any.
> > > As a result, the xmit path for unconnected sockets on such cores experiences a
> > > relevant overhead in netdev_pick_tx(), which needs to dissect each packet and
> > > compute its hash.
> > > Such scenario is easily triggered. e.g. from DNS server under relevant load, as
> > > the user-space process is moved away from the CPUs serving the softirqs (note:
> > > this is beneficial for the overall DNS server performances).
> > > This series introduces an helper to easily setup up XPS mapping for all the
> > > online CPUs, and use it in the ixgbe driver, demonstrating a relevant
> > > performance improvement in the above scenario.
> > > Paolo Abeni (2):
> > >    net: introduce netif_set_xps()
> > >    ixgbe: setup XPS via netif_set_xps()
> >
> > Resent, not HTML this time, sorry for duplication.
> >
> > I truly believe XPS should not be setup by devices.
> >
> > XPS is policy, and policy does belong to user space.
>
> Thank you for your comments!
>
> As general principle, I agree policies should be in user-space, but I
> also think that the kernel should provide a reasonable default. Many MQ
> devices already configure XPS and their default is AFAICS sub-optimal.
>
> > Note that if XPS is not setup, MQ queue selection is just fine by default ;)
>
> I'm sorry, I do not follow. AFAICS with unconnected sockets without XPS
> we always hit the netdev_pick_tx()/skb_tx_hash()/skb_flow_dissect()
> overhead in xmit path.


Then fix this if you want, instead of fixing one NIC only, or enforcing
XPS on all NICs.

For unconnected sockets, picking the TX queue based on the current CPU is
good; we do not have to enforce ordering as much as possible.

(pfifo_fast can no longer enforce it anyway)



* Re: [Intel-wired-lan] [RFC PATCH 2/2] ixgbe: setup XPS via netif_set_xps()
  2018-03-15 15:08   ` [Intel-wired-lan] " Paolo Abeni
@ 2018-03-15 16:43     ` Alexander Duyck
  -1 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2018-03-15 16:43 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Netdev, Eric Dumazet, intel-wired-lan, David S. Miller

On Thu, Mar 15, 2018 at 8:08 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> Before this commit, ixgbe with the default setting lacks XPS mapping
> for CPUs id greater than the number of tx queues.
>
> As a consequence the xmit path for such CPUs experience a relevant cost
> in __netdev_pick_tx, mainly due to skb_tx_hash(), as reported by the perf
> tool:
>
> 7.55%--netdev_pick_tx
>         |
>         --6.92%--__netdev_pick_tx
>                   |
>                   --6.35%--__skb_tx_hash
>                             |
>                             --5.94%--__skb_get_hash
>                                       |
>                                       --3.22%--__skb_flow_dissect
>
> in the following  scenario:
>
> ethtool -L em1 combined 1
> taskset 2 netperf -H 192.168.1.1 -t UDP_STREAM -- -m 1
> MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 212992       1   10.00     11497225      0       9.20
>
> After this commit the perf tool reports:
>
> 0.85%--__netdev_pick_tx
>
> and netperf reports:
>
> MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 212992       1   10.00     12736058      0      10.19
>
> roughly +10% in xmit tput.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

I think we shouldn't be configuring XPS if the number of Tx or Rx queues
is less than the number of CPUs, or if ATR is not enabled.

Really, the XPS bits are only supposed to be used with the ATR
functionality enabled. If we don't have enough queues for a 1:1
mapping we should probably not be programming XPS, since ATR isn't
going to function right anyway.

- Alex



* Re: [Intel-wired-lan] [RFC PATCH 2/2] ixgbe: setup XPS via netif_set_xps()
  2018-03-15 16:43     ` Alexander Duyck
@ 2018-03-15 17:05       ` Paolo Abeni
  -1 siblings, 0 replies; 23+ messages in thread
From: Paolo Abeni @ 2018-03-15 17:05 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Netdev, Eric Dumazet, intel-wired-lan, David S. Miller

Hi, 

On Thu, 2018-03-15 at 09:43 -0700, Alexander Duyck wrote:
> On Thu, Mar 15, 2018 at 8:08 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > Before this commit, ixgbe with the default setting lacks XPS mapping
> > for CPUs id greater than the number of tx queues.
> > 
> > As a consequence the xmit path for such CPUs experience a relevant cost
> > in __netdev_pick_tx, mainly due to skb_tx_hash(), as reported by the perf
> > tool:
> > 
> > 7.55%--netdev_pick_tx
> >         |
> >         --6.92%--__netdev_pick_tx
> >                   |
> >                   --6.35%--__skb_tx_hash
> >                             |
> >                             --5.94%--__skb_get_hash
> >                                       |
> >                                       --3.22%--__skb_flow_dissect
> > 
> > in the following  scenario:
> > 
> > ethtool -L em1 combined 1
> > taskset 2 netperf -H 192.168.1.1 -t UDP_STREAM -- -m 1
> > MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
> > Socket  Message  Elapsed      Messages
> > Size    Size     Time         Okay Errors   Throughput
> > bytes   bytes    secs            #      #   10^6bits/sec
> > 
> > 212992       1   10.00     11497225      0       9.20
> > 
> > After this commit the perf tool reports:
> > 
> > 0.85%--__netdev_pick_tx
> > 
> > and netperf reports:
> > 
> > MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
> > Socket  Message  Elapsed      Messages
> > Size    Size     Time         Okay Errors   Throughput
> > bytes   bytes    secs            #      #   10^6bits/sec
> > 
> > 212992       1   10.00     12736058      0      10.19
> > 
> > roughly +10% in xmit tput.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> 
> I think we shouldn't be configuring XPS if number of Tx or Rx queues
> is less than the number of CPUs, or ATR is not enabled.

Thank you for the feedback!

Please note that currently the ixgbe driver enables XPS regardless
of the above considerations.

> Really the XPS bits are only really supposed to be used with the ATR
> functionality enabled. If we don't have enough queues for a 1:1
> mapping we should probably not be programming XPS since ATR isn't
> going to function right anyway.

uhm... I don't know the details of ATR, but apparently it is for TCP
only, while the use case I'm referring to is plain (no tunnel)
unconnected UDP traffic. Am I missing something?

thanks,

Paolo



* Re: [RFC PATCH 0/2] net:setup XPS mapping for each online CPU
  2018-03-15 15:59       ` [Intel-wired-lan] " Eric Dumazet
@ 2018-03-15 17:20         ` Paolo Abeni
  -1 siblings, 0 replies; 23+ messages in thread
From: Paolo Abeni @ 2018-03-15 17:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, David Miller, Jeff Kirsher, intel-wired-lan, Alexander Duyck

Hi,

On Thu, 2018-03-15 at 15:59 +0000, Eric Dumazet wrote:
> On Thu, Mar 15, 2018 at 8:51 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > I'm sorry, I do not follow. AFAICS with unconnected sockets without XPS
> > we always hit the netdev_pick_tx()/skb_tx_hash()/skb_flow_dissect()
> > overhead in xmit path.
> 
> Then fix this if you want, instead of fixing one NIC only, or by enforcing
> XPS by all NIC.
> 
> For unconnected sockets, picking the TX queue based on current cpu is good,
> we do not have to enforce ordering as much as possible.
> 
> (pfifo_fast no longer can enforce it anyway)

Thank you for the prompt reply. 

I'm double checking to avoid misinterpretation on my side: are you
suggesting plugging CPU-based selection logic for unconnected sockets
into netdev_pick_tx(), or cooking patches like 2/2 for all the relevant
NICs?

Thanks!

Paolo



* Re: [Intel-wired-lan] [RFC PATCH 2/2] ixgbe: setup XPS via netif_set_xps()
  2018-03-15 17:05       ` Paolo Abeni
@ 2018-03-15 17:22         ` Alexander Duyck
  -1 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2018-03-15 17:22 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Netdev, Eric Dumazet, intel-wired-lan, David S. Miller

On Thu, Mar 15, 2018 at 10:05 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> Hi,
>
> On Thu, 2018-03-15 at 09:43 -0700, Alexander Duyck wrote:
>> On Thu, Mar 15, 2018 at 8:08 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>> > Before this commit, ixgbe with the default setting lacks XPS mapping
>> > for CPUs id greater than the number of tx queues.
>> >
>> > As a consequence the xmit path for such CPUs experience a relevant cost
>> > in __netdev_pick_tx, mainly due to skb_tx_hash(), as reported by the perf
>> > tool:
>> >
>> > 7.55%--netdev_pick_tx
>> >         |
>> >         --6.92%--__netdev_pick_tx
>> >                   |
>> >                   --6.35%--__skb_tx_hash
>> >                             |
>> >                             --5.94%--__skb_get_hash
>> >                                       |
>> >                                       --3.22%--__skb_flow_dissect
>> >
>> > in the following  scenario:
>> >
>> > ethtool -L em1 combined 1
>> > taskset 2 netperf -H 192.168.1.1 -t UDP_STREAM -- -m 1
>> > MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
>> > Socket  Message  Elapsed      Messages
>> > Size    Size     Time         Okay Errors   Throughput
>> > bytes   bytes    secs            #      #   10^6bits/sec
>> >
>> > 212992       1   10.00     11497225      0       9.20
>> >
>> > After this commit the perf tool reports:
>> >
>> > 0.85%--__netdev_pick_tx
>> >
>> > and netperf reports:
>> >
>> > MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
>> > Socket  Message  Elapsed      Messages
>> > Size    Size     Time         Okay Errors   Throughput
>> > bytes   bytes    secs            #      #   10^6bits/sec
>> >
>> > 212992       1   10.00     12736058      0      10.19
>> >
>> > roughly +10% in xmit tput.
>> >
>> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>
>> I think we shouldn't be configuring XPS if number of Tx or Rx queues
>> is less than the number of CPUs, or ATR is not enabled.
>
> Thank you for the feedback!
>
> Please note the currently the ixgbe driver is enabling XPS regardless
> of the above considerations.
>
>> Really the XPS bits are only really supposed to be used with the ATR
>> functionality enabled. If we don't have enough queues for a 1:1
>> mapping we should probably not be programming XPS since ATR isn't
>> going to function right anyway.
>
> uhm... I don't know the details of ATR, but apparently it is for TCP
> only, while the use-case I'm referring to is plain (no tunnel)
> unconnected UDP traffic. Am I missing something?

No. Basically the ATR/XPS bits are an overreach. The original code had
the driver just using the incoming CPU to select a Tx queue via
ndo_select_queue. I pushed it out to XPS in order to try to get the
drivers to avoid using ndo_select_queue and to provide the user with a
way to at least manually disable the Tx side of it.

For now I would say we should not have the driver configuring the XPS
map if it cannot assume a 1:1 mapping. As-is, there are a number of
features where having this functionality enabled doesn't make sense.
In those cases we leave cpu as -1 in ixgbe_alloc_q_vector, and leave
the affinity mask as all 0s. It might make sense to just update the
code there in the case of ixgbe so that we don't update the XPS map or
the q_vector->affinity_mask if we cannot achieve a 1:1 mapping. As-is,
I would say the code is probably in need of updates, since
ixgbe_alloc_q_vector doesn't handle the case where we might have a
non-linear CPU layout.

ATR is a feature that has been on my list of things to fix sometime in
the near future, but I haven't had the time as I have been pulled into
too many other efforts. Ideally we should be moving away from ATR and
instead looking at doing something like supporting ndo_rx_flow_steer.

Thanks.

- Alex



* Re: [Intel-wired-lan] [RFC PATCH 0/2] net:setup XPS mapping for each online CPU
  2018-03-15 17:20         ` [Intel-wired-lan] " Paolo Abeni
@ 2018-03-15 17:47           ` Alexander Duyck
  -1 siblings, 0 replies; 23+ messages in thread
From: Alexander Duyck @ 2018-03-15 17:47 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Eric Dumazet, netdev, intel-wired-lan, David Miller

On Thu, Mar 15, 2018 at 10:20 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> Hi,
>
> On Thu, 2018-03-15 at 15:59 +0000, Eric Dumazet wrote:
>> On Thu, Mar 15, 2018 at 8:51 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> > I'm sorry, I do not follow. AFAICS with unconnected sockets without XPS
>> > we always hit the netdev_pick_tx()/skb_tx_hash()/skb_flow_dissect()
>> > overhead in xmit path.
>>
>> Then fix this if you want, instead of fixing one NIC only, or by enforcing
>> XPS by all NIC.
>>
>> For unconnected sockets, picking the TX queue based on current cpu is good,
>> we do not have to enforce ordering as much as possible.
>>
>> (pfifo_fast no longer can enforce it anyway)
>
> Thank you for the prompt reply.
>
> I'm double checking to avoid misinterpretation on my side: are you
> suggesting to plug a CPU-based selection logic for unconnected sockets
> in netdev_pick_tx() or to cook patches like 2/2 for all the relevant
> NICs?
>
> Thanks!
>
> Paolo

We just need to watch out for any possible side effects. For example,
using XPS on a virtualization host has been problematic, as you end up
with the traffic getting reordered if the VM jumps from CPU to CPU.

Thanks.

- Alex



end of thread, other threads:[~2018-03-15 17:47 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-15 15:08 [RFC PATCH 0/2] net:setup XPS mapping for each online CPU Paolo Abeni
2018-03-15 15:08 ` [RFC PATCH 1/2] net: introduce netif_set_xps() Paolo Abeni
2018-03-15 15:08 ` [RFC PATCH 2/2] ixgbe: setup XPS via netif_set_xps() Paolo Abeni
2018-03-15 16:43   ` Alexander Duyck
2018-03-15 17:05     ` Paolo Abeni
2018-03-15 17:22       ` Alexander Duyck
2018-03-15 15:30 ` [Intel-wired-lan] [RFC PATCH 0/2] net:setup XPS mapping for each online CPU Eric Dumazet
2018-03-15 15:31 ` Eric Dumazet
2018-03-15 15:51   ` Paolo Abeni
2018-03-15 15:59     ` Eric Dumazet
2018-03-15 17:20       ` Paolo Abeni
2018-03-15 17:47         ` Alexander Duyck
