* [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28
@ 2018-06-28 21:50 Saeed Mahameed
  2018-06-28 21:50 ` [net-next 01/12] net/mlx5e: Add UDP GSO support Saeed Mahameed
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed

Hi Dave,

The following pull request includes updates for mlx5e netdevice driver.
For more information please see tag log below.

Please pull and let me know if there's any problem.

Thanks,
Saeed.

---

The following changes since commit 7861552cedd81a164c0d5d1c89fe2cb45a3ed41b:

  netlink: Return extack message if attribute validation fails (2018-06-28 16:18:04 +0900)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5e-updates-2018-06-28

for you to fetch changes up to ed56c5193ad89d1097cdbdc87abeb062e03a06eb:

  net/mlx5e: Update NIC HW stats on demand only (2018-06-28 14:44:38 -0700)

----------------------------------------------------------------
mlx5e-updates-2018-06-28

mlx5e netdevice driver updates:

- Boris Pismenny added support for UDP GSO in the first two patches.
  Impressive performance numbers are included in the commit message:
  line rate with roughly half the CPU utilization compared to no
  offload or no GSO at all.

- From Tariq Toukan:
  - Convert large order kzalloc allocations to kvzalloc.
  - Add performance diagnostic statistics to several places in the data path.

- From Saeed and Eran:
  - Update NIC HW stats on demand only. This eliminates the background
    thread that was needed to update some HW statistics in the driver
    cache, so that error and drop counters from HW can be reported in
    ndo_get_stats.

----------------------------------------------------------------
Boris Pismenny (2):
      net/mlx5e: Add UDP GSO support
      net/mlx5e: Add UDP GSO remaining counter

Saeed Mahameed (1):
      net/mlx5e: Update NIC HW stats on demand only

Tariq Toukan (9):
      net/mlx5e: Convert large order kzalloc allocations to kvzalloc
      net/mlx5e: RX, Use existing WQ local variable
      net/mlx5e: Add TX completions statistics
      net/mlx5e: Add XDP_TX completions statistics
      net/mlx5e: Add NAPI statistics
      net/mlx5e: Add a counter for congested UMRs
      net/mlx5e: Add channel events counter
      net/mlx5e: Add counter for MPWQE filler strides
      net/mlx5e: Add counter for total num of NOP operations

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   1 -
 .../mellanox/mlx5/core/en_accel/en_accel.h         |  11 ++-
 .../ethernet/mellanox/mlx5/core/en_accel/rxtx.c    | 109 +++++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/en_accel/rxtx.h    |  14 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  57 ++++++-----
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  11 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c |  34 ++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h |  25 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    |  17 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  |   9 +-
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       |   5 +
 13 files changed, 252 insertions(+), 48 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h


* [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-29 22:19   ` Willem de Bruijn
  2018-06-28 21:50 ` [net-next 02/12] net/mlx5e: Add UDP GSO remaining counter Saeed Mahameed
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Or Gerlitz, Boris Pismenny, Yossi Kuperman, Saeed Mahameed

From: Boris Pismenny <borisp@mellanox.com>

This patch enables UDP GSO support. We enable this by using two WQEs:
the first is a UDP LSO WQE for all segments of equal length, and the
second is for the last segment, in case it has a different length.
Due to a HW limitation, we must adjust the packet length fields before
sending.
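
For context, here is a minimal user space sketch (not part of this
patch) that generates such UDP GSO traffic by setting the generic
UDP_SEGMENT socket option; the destination address, port, buffer size
and the 1472-byte segment size are arbitrary assumptions used only for
illustration:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103	/* per-socket GSO segment size, Linux >= 4.18 */
#endif

int main(void)
{
	char buf[1472 * 16] = { 0 };	/* 16 MSS-sized segments in one send */
	struct sockaddr_in dst = {
		.sin_family = AF_INET,
		.sin_port   = htons(9000),	/* assumed port */
		.sin_addr   = { .s_addr = inet_addr("192.168.1.2") }, /* assumed peer */
	};
	int gso_size = 1472;	/* payload bytes per segment */
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return 1;
	/* Ask the stack to segment this socket's sends into gso_size chunks */
	if (setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT,
		       &gso_size, sizeof(gso_size)) < 0)
		perror("UDP_SEGMENT");	/* option unsupported on older kernels */
	if (sendto(fd, buf, sizeof(buf), 0,
		   (struct sockaddr *)&dst, sizeof(dst)) < 0)
		perror("sendto");
	close(fd);
	return 0;
}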

We measured performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz
machines connected back-to-back with ConnectX-4 Lx (40Gbps) NICs.
We compared single-stream UDP, UDP GSO, and UDP GSO with offload.
Performance:
                | MSS (bytes) | Throughput (Gbps) | CPU utilization (%)
UDP GSO offload | 1472        | 35.6              | 8
UDP GSO         | 1472        | 25.5              | 17
UDP             | 1472        | 10.2              | 17
UDP GSO offload | 1024        | 35.6              | 8
UDP GSO         | 1024        | 19.2              | 17
UDP             | 1024        | 5.7               | 17
UDP GSO offload | 512         | 33.8              | 16
UDP GSO         | 512         | 10.4              | 17
UDP             | 512         | 3.5               | 17

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   4 +-
 .../mellanox/mlx5/core/en_accel/en_accel.h    |  11 +-
 .../mellanox/mlx5/core/en_accel/rxtx.c        | 108 ++++++++++++++++++
 .../mellanox/mlx5/core/en_accel/rxtx.h        |  14 +++
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   3 +
 .../net/ethernet/mellanox/mlx5/core/en_tx.c   |   8 +-
 6 files changed, 139 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 9efbf193ad5a..d923f2f58608 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -14,8 +14,8 @@ mlx5_core-$(CONFIG_MLX5_FPGA) += fpga/cmd.o fpga/core.o fpga/conn.o fpga/sdk.o \
 		fpga/ipsec.o fpga/tls.o
 
 mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
-		en_tx.o en_rx.o en_dim.o en_txrx.o en_stats.o vxlan.o \
-		en_arfs.o en_fs_ethtool.o en_selftest.o en/port.o
+		en_tx.o en_rx.o en_dim.o en_txrx.o en_accel/rxtx.o en_stats.o  \
+		vxlan.o en_arfs.o en_fs_ethtool.o en_selftest.o en/port.o
 
 mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
index f20074dbef32..39a5d13ba459 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
@@ -34,12 +34,11 @@
 #ifndef __MLX5E_EN_ACCEL_H__
 #define __MLX5E_EN_ACCEL_H__
 
-#ifdef CONFIG_MLX5_ACCEL
-
 #include <linux/skbuff.h>
 #include <linux/netdevice.h>
 #include "en_accel/ipsec_rxtx.h"
 #include "en_accel/tls_rxtx.h"
+#include "en_accel/rxtx.h"
 #include "en.h"
 
 static inline struct sk_buff *mlx5e_accel_handle_tx(struct sk_buff *skb,
@@ -64,9 +63,13 @@ static inline struct sk_buff *mlx5e_accel_handle_tx(struct sk_buff *skb,
 	}
 #endif
 
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) {
+		skb = mlx5e_udp_gso_handle_tx_skb(dev, sq, skb, wqe, pi);
+		if (unlikely(!skb))
+			return NULL;
+	}
+
 	return skb;
 }
 
-#endif /* CONFIG_MLX5_ACCEL */
-
 #endif /* __MLX5E_EN_ACCEL_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
new file mode 100644
index 000000000000..4bb1f3b12b96
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
@@ -0,0 +1,108 @@
+#include "en_accel/rxtx.h"
+
+static void mlx5e_udp_gso_prepare_last_skb(struct sk_buff *skb,
+					   struct sk_buff *nskb,
+					   int remaining)
+{
+	int bytes_needed = remaining, remaining_headlen, remaining_page_offset;
+	int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
+	int payload_len = remaining + sizeof(struct udphdr);
+	int k = 0, i, j;
+
+	skb_copy_bits(skb, 0, nskb->data, headlen);
+	nskb->dev = skb->dev;
+	skb_reset_mac_header(nskb);
+	skb_set_network_header(nskb, skb_network_offset(skb));
+	skb_set_transport_header(nskb, skb_transport_offset(skb));
+	skb_set_tail_pointer(nskb, headlen);
+
+	/* How many frags do we need? */
+	for (i = skb_shinfo(skb)->nr_frags - 1; i >= 0; i--) {
+		bytes_needed -= skb_frag_size(&skb_shinfo(skb)->frags[i]);
+		k++;
+		if (bytes_needed <= 0)
+			break;
+	}
+
+	/* Fill the first frag and split it if necessary */
+	j = skb_shinfo(skb)->nr_frags - k;
+	remaining_page_offset = -bytes_needed;
+	skb_fill_page_desc(nskb, 0,
+			   skb_shinfo(skb)->frags[j].page.p,
+			   skb_shinfo(skb)->frags[j].page_offset + remaining_page_offset,
+			   skb_shinfo(skb)->frags[j].size - remaining_page_offset);
+
+	skb_frag_ref(skb, j);
+
+	/* Fill the rest of the frags */
+	for (i = 1; i < k; i++) {
+		j = skb_shinfo(skb)->nr_frags - k + i;
+
+		skb_fill_page_desc(nskb, i,
+				   skb_shinfo(skb)->frags[j].page.p,
+				   skb_shinfo(skb)->frags[j].page_offset,
+				   skb_shinfo(skb)->frags[j].size);
+		skb_frag_ref(skb, j);
+	}
+	skb_shinfo(nskb)->nr_frags = k;
+
+	remaining_headlen = remaining - skb->data_len;
+
+	/* headlen contains remaining data? */
+	if (remaining_headlen > 0)
+		skb_copy_bits(skb, skb->len - remaining, nskb->data + headlen,
+			      remaining_headlen);
+	nskb->len = remaining + headlen;
+	nskb->data_len =  payload_len - sizeof(struct udphdr) +
+		max_t(int, 0, remaining_headlen);
+	nskb->protocol = skb->protocol;
+	if (nskb->protocol == htons(ETH_P_IP)) {
+		ip_hdr(nskb)->id = htons(ntohs(ip_hdr(nskb)->id) +
+					 skb_shinfo(skb)->gso_segs);
+		ip_hdr(nskb)->tot_len =
+			htons(payload_len + sizeof(struct iphdr));
+	} else {
+		ipv6_hdr(nskb)->payload_len = htons(payload_len);
+	}
+	udp_hdr(nskb)->len = htons(payload_len);
+	skb_shinfo(nskb)->gso_size = 0;
+	nskb->ip_summed = skb->ip_summed;
+	nskb->csum_start = skb->csum_start;
+	nskb->csum_offset = skb->csum_offset;
+	nskb->queue_mapping = skb->queue_mapping;
+}
+
+/* might send skbs and update wqe and pi */
+struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
+					    struct mlx5e_txqsq *sq,
+					    struct sk_buff *skb,
+					    struct mlx5e_tx_wqe **wqe,
+					    u16 *pi)
+{
+	int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);
+	int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
+	int remaining = (skb->len - headlen) % skb_shinfo(skb)->gso_size;
+	struct sk_buff *nskb;
+
+	if (skb->protocol == htons(ETH_P_IP))
+		ip_hdr(skb)->tot_len = htons(payload_len + sizeof(struct iphdr));
+	else
+		ipv6_hdr(skb)->payload_len = htons(payload_len);
+	udp_hdr(skb)->len = htons(payload_len);
+	if (!remaining)
+		return skb;
+
+	nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
+	if (unlikely(!nskb)) {
+		sq->stats->dropped++;
+		return NULL;
+	}
+
+	mlx5e_udp_gso_prepare_last_skb(skb, nskb, remaining);
+
+	skb_shinfo(skb)->gso_segs--;
+	pskb_trim(skb, skb->len - remaining);
+	mlx5e_sq_xmit(sq, skb, *wqe, *pi);
+	mlx5e_sq_fetch_wqe(sq, wqe, pi);
+	return nskb;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h
new file mode 100644
index 000000000000..ed42699a78b3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h
@@ -0,0 +1,14 @@
+
+#ifndef __MLX5E_EN_ACCEL_RX_TX_H__
+#define __MLX5E_EN_ACCEL_RX_TX_H__
+
+#include <linux/skbuff.h>
+#include "en.h"
+
+struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
+					    struct mlx5e_txqsq *sq,
+					    struct sk_buff *skb,
+					    struct mlx5e_tx_wqe **wqe,
+					    u16 *pi);
+
+#endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 134f20a182b5..e2ef68b1daa2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4592,6 +4592,9 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 	netdev->features         |= NETIF_F_HIGHDMA;
 	netdev->features         |= NETIF_F_HW_VLAN_STAG_FILTER;
 
+	netdev->features         |= NETIF_F_GSO_UDP_L4;
+	netdev->hw_features      |= NETIF_F_GSO_UDP_L4;
+
 	netdev->priv_flags       |= IFF_UNICAST_FLT;
 
 	mlx5e_set_netdev_dev_addr(netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f29deb44bf3b..f450d9ca31fb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -228,7 +228,10 @@ mlx5e_tx_get_gso_ihs(struct mlx5e_txqsq *sq, struct sk_buff *skb)
 		stats->tso_inner_packets++;
 		stats->tso_inner_bytes += skb->len - ihs;
 	} else {
-		ihs = skb_transport_offset(skb) + tcp_hdrlen(skb);
+		if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4)
+			ihs = skb_transport_offset(skb) + sizeof(struct udphdr);
+		else
+			ihs = skb_transport_offset(skb) + tcp_hdrlen(skb);
 		stats->tso_packets++;
 		stats->tso_bytes += skb->len - ihs;
 	}
@@ -443,12 +446,11 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
 	sq = priv->txq2sq[skb_get_queue_mapping(skb)];
 	mlx5e_sq_fetch_wqe(sq, &wqe, &pi);
 
-#ifdef CONFIG_MLX5_ACCEL
 	/* might send skbs and update wqe and pi */
 	skb = mlx5e_accel_handle_tx(skb, sq, dev, &wqe, &pi);
 	if (unlikely(!skb))
 		return NETDEV_TX_OK;
-#endif
+
 	return mlx5e_sq_xmit(sq, skb, wqe, pi);
 }
 
-- 
2.17.0


* [net-next 02/12] net/mlx5e: Add UDP GSO remaining counter
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
  2018-06-28 21:50 ` [net-next 01/12] net/mlx5e: Add UDP GSO support Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:50 ` [net-next 03/12] net/mlx5e: Convert large order kzalloc allocations to kvzalloc Saeed Mahameed
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Boris Pismenny, Saeed Mahameed

From: Boris Pismenny <borisp@mellanox.com>

This patch adds a counter for TX UDP GSO packets that contain a segment
whose length is not aligned to the MSS (a remaining segment).

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c      | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h      | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
index 4bb1f3b12b96..7b7ec3998e84 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
@@ -92,6 +92,7 @@ struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
 	if (!remaining)
 		return skb;
 
+	sq->stats->udp_seg_rem++;
 	nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
 	if (unlikely(!nskb)) {
 		sq->stats->dropped++;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 1646859974ce..7e7155b4e0f0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -68,6 +68,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xmit_more) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_recover) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_wake) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_udp_seg_rem) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_wqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler) },
@@ -159,6 +160,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 			s->tx_added_vlan_packets += sq_stats->added_vlan_packets;
 			s->tx_queue_stopped	+= sq_stats->stopped;
 			s->tx_queue_wake	+= sq_stats->wake;
+			s->tx_udp_seg_rem	+= sq_stats->udp_seg_rem;
 			s->tx_queue_dropped	+= sq_stats->dropped;
 			s->tx_cqe_err		+= sq_stats->cqe_err;
 			s->tx_recover		+= sq_stats->recover;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 643153bb3607..d416bb86e747 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -79,6 +79,7 @@ struct mlx5e_sw_stats {
 	u64 tx_xmit_more;
 	u64 tx_recover;
 	u64 tx_queue_wake;
+	u64 tx_udp_seg_rem;
 	u64 tx_cqe_err;
 	u64 rx_wqe_err;
 	u64 rx_mpwqe_filler;
@@ -196,6 +197,7 @@ struct mlx5e_sq_stats {
 	u64 csum_partial_inner;
 	u64 added_vlan_packets;
 	u64 nop;
+	u64 udp_seg_rem;
 #ifdef CONFIG_MLX5_EN_TLS
 	u64 tls_ooo;
 	u64 tls_resync_bytes;
-- 
2.17.0


* [net-next 03/12] net/mlx5e: Convert large order kzalloc allocations to kvzalloc
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
  2018-06-28 21:50 ` [net-next 01/12] net/mlx5e: Add UDP GSO support Saeed Mahameed
  2018-06-28 21:50 ` [net-next 02/12] net/mlx5e: Add UDP GSO remaining counter Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:50 ` [net-next 04/12] net/mlx5e: RX, Use existing WQ local variable Saeed Mahameed
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Replace calls to kzalloc_node with kvzalloc_node, as it falls back
to lower-order pages if the higher-order allocation attempts fail.
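
For intuition, kvzalloc_node() roughly behaves like the simplified
sketch below (an illustration only, not the actual mm implementation;
callers must pair it with kvfree(), as done in this patch):

#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *example_kvzalloc_node(size_t size, gfp_t flags, int node)
{
	void *ptr;

	/* Opportunistic physically-contiguous attempt: don't warn or
	 * retry hard if the higher-order allocation cannot be satisfied.
	 */
	ptr = kmalloc_node(size, flags | __GFP_ZERO | __GFP_NOWARN |
			   __GFP_NORETRY, node);
	if (ptr)
		return ptr;

	/* Fall back to vmalloc space, built from order-0 pages only. */
	return vzalloc_node(size, node);
}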

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 44 +++++++++----------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e2ef68b1daa2..42ef8c818544 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -352,8 +352,8 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
 {
 	int wq_sz = mlx5_wq_ll_get_size(&rq->mpwqe.wq);
 
-	rq->mpwqe.info = kcalloc_node(wq_sz, sizeof(*rq->mpwqe.info),
-				      GFP_KERNEL, cpu_to_node(c->cpu));
+	rq->mpwqe.info = kvzalloc_node(wq_sz * sizeof(*rq->mpwqe.info),
+				       GFP_KERNEL, cpu_to_node(c->cpu));
 	if (!rq->mpwqe.info)
 		return -ENOMEM;
 
@@ -670,7 +670,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 err_free:
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
-		kfree(rq->mpwqe.info);
+		kvfree(rq->mpwqe.info);
 		mlx5_core_destroy_mkey(mdev, &rq->umr_mkey);
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
@@ -702,7 +702,7 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
 
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
-		kfree(rq->mpwqe.info);
+		kvfree(rq->mpwqe.info);
 		mlx5_core_destroy_mkey(rq->mdev, &rq->umr_mkey);
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
@@ -965,15 +965,15 @@ static void mlx5e_close_rq(struct mlx5e_rq *rq)
 
 static void mlx5e_free_xdpsq_db(struct mlx5e_xdpsq *sq)
 {
-	kfree(sq->db.di);
+	kvfree(sq->db.di);
 }
 
 static int mlx5e_alloc_xdpsq_db(struct mlx5e_xdpsq *sq, int numa)
 {
 	int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
 
-	sq->db.di = kcalloc_node(wq_sz, sizeof(*sq->db.di),
-				     GFP_KERNEL, numa);
+	sq->db.di = kvzalloc_node(sizeof(*sq->db.di) * wq_sz,
+				  GFP_KERNEL, numa);
 	if (!sq->db.di) {
 		mlx5e_free_xdpsq_db(sq);
 		return -ENOMEM;
@@ -1024,15 +1024,15 @@ static void mlx5e_free_xdpsq(struct mlx5e_xdpsq *sq)
 
 static void mlx5e_free_icosq_db(struct mlx5e_icosq *sq)
 {
-	kfree(sq->db.ico_wqe);
+	kvfree(sq->db.ico_wqe);
 }
 
 static int mlx5e_alloc_icosq_db(struct mlx5e_icosq *sq, int numa)
 {
 	u8 wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
 
-	sq->db.ico_wqe = kcalloc_node(wq_sz, sizeof(*sq->db.ico_wqe),
-				      GFP_KERNEL, numa);
+	sq->db.ico_wqe = kvzalloc_node(sizeof(*sq->db.ico_wqe) * wq_sz,
+				       GFP_KERNEL, numa);
 	if (!sq->db.ico_wqe)
 		return -ENOMEM;
 
@@ -1077,8 +1077,8 @@ static void mlx5e_free_icosq(struct mlx5e_icosq *sq)
 
 static void mlx5e_free_txqsq_db(struct mlx5e_txqsq *sq)
 {
-	kfree(sq->db.wqe_info);
-	kfree(sq->db.dma_fifo);
+	kvfree(sq->db.wqe_info);
+	kvfree(sq->db.dma_fifo);
 }
 
 static int mlx5e_alloc_txqsq_db(struct mlx5e_txqsq *sq, int numa)
@@ -1086,10 +1086,10 @@ static int mlx5e_alloc_txqsq_db(struct mlx5e_txqsq *sq, int numa)
 	int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
 	int df_sz = wq_sz * MLX5_SEND_WQEBB_NUM_DS;
 
-	sq->db.dma_fifo = kcalloc_node(df_sz, sizeof(*sq->db.dma_fifo),
-					   GFP_KERNEL, numa);
-	sq->db.wqe_info = kcalloc_node(wq_sz, sizeof(*sq->db.wqe_info),
-					   GFP_KERNEL, numa);
+	sq->db.dma_fifo = kvzalloc_node(df_sz * sizeof(*sq->db.dma_fifo),
+					GFP_KERNEL, numa);
+	sq->db.wqe_info = kvzalloc_node(wq_sz * sizeof(*sq->db.wqe_info),
+					GFP_KERNEL, numa);
 	if (!sq->db.dma_fifo || !sq->db.wqe_info) {
 		mlx5e_free_txqsq_db(sq);
 		return -ENOMEM;
@@ -1893,7 +1893,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 	int err;
 	int eqn;
 
-	c = kzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
+	c = kvzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
 	if (!c)
 		return -ENOMEM;
 
@@ -1979,7 +1979,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 
 err_napi_del:
 	netif_napi_del(&c->napi);
-	kfree(c);
+	kvfree(c);
 
 	return err;
 }
@@ -2018,7 +2018,7 @@ static void mlx5e_close_channel(struct mlx5e_channel *c)
 	mlx5e_close_cq(&c->icosq.cq);
 	netif_napi_del(&c->napi);
 
-	kfree(c);
+	kvfree(c);
 }
 
 #define DEFAULT_FRAG_SIZE (2048)
@@ -2276,7 +2276,7 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
 	chs->num = chs->params.num_channels;
 
 	chs->c = kcalloc(chs->num, sizeof(struct mlx5e_channel *), GFP_KERNEL);
-	cparam = kzalloc(sizeof(struct mlx5e_channel_param), GFP_KERNEL);
+	cparam = kvzalloc(sizeof(struct mlx5e_channel_param), GFP_KERNEL);
 	if (!chs->c || !cparam)
 		goto err_free;
 
@@ -2287,7 +2287,7 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
 			goto err_close_channels;
 	}
 
-	kfree(cparam);
+	kvfree(cparam);
 	return 0;
 
 err_close_channels:
@@ -2296,7 +2296,7 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
 
 err_free:
 	kfree(chs->c);
-	kfree(cparam);
+	kvfree(cparam);
 	chs->num = 0;
 	return err;
 }
-- 
2.17.0


* [net-next 04/12] net/mlx5e: RX, Use existing WQ local variable
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (2 preceding siblings ...)
  2018-06-28 21:50 ` [net-next 03/12] net/mlx5e: Convert large order kzalloc allocations to kvzalloc Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:50 ` [net-next 05/12] net/mlx5e: Add TX completions statistics Saeed Mahameed
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Local variable 'wq' already points to &sq->wq, use it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index d3a1dd20e41d..a2d91eaa99c4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -487,7 +487,7 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 
 	sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_UMR;
 	sq->pc += MLX5E_UMR_WQEBBS;
-	mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, &umr_wqe->ctrl);
+	mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &umr_wqe->ctrl);
 
 	return 0;
 
-- 
2.17.0


* [net-next 05/12] net/mlx5e: Add TX completions statistics
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (3 preceding siblings ...)
  2018-06-28 21:50 ` [net-next 04/12] net/mlx5e: RX, Use existing WQ local variable Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:50 ` [net-next 06/12] net/mlx5e: Add XDP_TX " Saeed Mahameed
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Add per-ring and global ethtool counters for TX completions.
This helps us monitor and analyze TX flow performance.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 4 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    | 9 +++++++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 7e7155b4e0f0..d35361b1b3fe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -67,6 +67,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_dropped) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xmit_more) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_recover) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqes) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_wake) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_udp_seg_rem) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
@@ -172,6 +173,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 			s->tx_tls_ooo		+= sq_stats->tls_ooo;
 			s->tx_tls_resync_bytes	+= sq_stats->tls_resync_bytes;
 #endif
+			s->tx_cqes		+= sq_stats->cqes;
 		}
 	}
 
@@ -1142,6 +1144,7 @@ static const struct counter_desc sq_stats_desc[] = {
 	{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, dropped) },
 	{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, xmit_more) },
 	{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, recover) },
+	{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, cqes) },
 	{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, wake) },
 	{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, cqe_err) },
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index d416bb86e747..8f2dfe56fdef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -78,6 +78,7 @@ struct mlx5e_sw_stats {
 	u64 tx_queue_dropped;
 	u64 tx_xmit_more;
 	u64 tx_recover;
+	u64 tx_cqes;
 	u64 tx_queue_wake;
 	u64 tx_udp_seg_rem;
 	u64 tx_cqe_err;
@@ -208,7 +209,8 @@ struct mlx5e_sq_stats {
 	u64 dropped;
 	u64 recover;
 	/* dirtied @completion */
-	u64 wake ____cacheline_aligned_in_smp;
+	u64 cqes ____cacheline_aligned_in_smp;
+	u64 wake;
 	u64 cqe_err;
 };
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f450d9ca31fb..f0739dae7b56 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -468,6 +468,7 @@ static void mlx5e_dump_error_cqe(struct mlx5e_txqsq *sq,
 
 bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 {
+	struct mlx5e_sq_stats *stats;
 	struct mlx5e_txqsq *sq;
 	struct mlx5_cqe64 *cqe;
 	u32 dma_fifo_cc;
@@ -485,6 +486,8 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 	if (!cqe)
 		return false;
 
+	stats = sq->stats;
+
 	npkts = 0;
 	nbytes = 0;
 
@@ -513,7 +516,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 				queue_work(cq->channel->priv->wq,
 					   &sq->recover.recover_work);
 			}
-			sq->stats->cqe_err++;
+			stats->cqe_err++;
 		}
 
 		do {
@@ -558,6 +561,8 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 
 	} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
 
+	stats->cqes += i;
+
 	mlx5_cqwq_update_db_record(&cq->wq);
 
 	/* ensure cq space is freed before enabling more cqes */
@@ -573,7 +578,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
 				   MLX5E_SQ_STOP_ROOM) &&
 	    !test_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) {
 		netif_tx_wake_queue(sq->txq);
-		sq->stats->wake++;
+		stats->wake++;
 	}
 
 	return (i == MLX5E_TX_CQ_POLL_BUDGET);
-- 
2.17.0


* [net-next 06/12] net/mlx5e: Add XDP_TX completions statistics
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (4 preceding siblings ...)
  2018-06-28 21:50 ` [net-next 05/12] net/mlx5e: Add TX completions statistics Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:50 ` [net-next 07/12] net/mlx5e: Add NAPI statistics Saeed Mahameed
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Add per-ring and global ethtool counters for XDP_TX completions.
This helps us monitor and analyze XDP_TX flow performance.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index a2d91eaa99c4..733c5d2c99f2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1383,6 +1383,8 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq)
 		} while (!last_wqe);
 	} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
 
+	rq->stats->xdp_tx_cqe += i;
+
 	mlx5_cqwq_update_db_record(&cq->wq);
 
 	/* ensure cq space is freed before enabling more cqes */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index d35361b1b3fe..98aefa6eb266 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -59,6 +59,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_unnecessary_inner) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_drop) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_cqe) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_csum_none) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_csum_partial) },
@@ -135,6 +136,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_csum_unnecessary_inner += rq_stats->csum_unnecessary_inner;
 		s->rx_xdp_drop += rq_stats->xdp_drop;
 		s->rx_xdp_tx += rq_stats->xdp_tx;
+		s->rx_xdp_tx_cqe  += rq_stats->xdp_tx_cqe;
 		s->rx_xdp_tx_full += rq_stats->xdp_tx_full;
 		s->rx_wqe_err   += rq_stats->wqe_err;
 		s->rx_mpwqe_filler += rq_stats->mpwqe_filler;
@@ -1111,6 +1113,7 @@ static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, csum_none) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_drop) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx_cqe) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx_full) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_packets) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 8f2dfe56fdef..b598a21bb4d6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -70,6 +70,7 @@ struct mlx5e_sw_stats {
 	u64 rx_csum_unnecessary_inner;
 	u64 rx_xdp_drop;
 	u64 rx_xdp_tx;
+	u64 rx_xdp_tx_cqe;
 	u64 rx_xdp_tx_full;
 	u64 tx_csum_none;
 	u64 tx_csum_partial;
@@ -171,6 +172,7 @@ struct mlx5e_rq_stats {
 	u64 removed_vlan_packets;
 	u64 xdp_drop;
 	u64 xdp_tx;
+	u64 xdp_tx_cqe;
 	u64 xdp_tx_full;
 	u64 wqe_err;
 	u64 mpwqe_filler;
-- 
2.17.0


* [net-next 07/12] net/mlx5e: Add NAPI statistics
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (5 preceding siblings ...)
  2018-06-28 21:50 ` [net-next 06/12] net/mlx5e: Add XDP_TX " Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:50 ` [net-next 08/12] net/mlx5e: Add a counter for congested UMRs Saeed Mahameed
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Add per-channel and global ethtool counters for NAPI.
This helps us monitor and analyze performance in general.

- ch[i]_poll:
  the number of times the channel's NAPI poll was invoked.

- ch[i]_arm:
  the number of times the channel's NAPI poll completed
  and armed the completion queues.

- ch[i]_aff_change:
  the number of times the channel's NAPI poll explicitly
  stopped execution on a cpu due to a change in affinity.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 9 +++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  | 6 ++++++
 3 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 98aefa6eb266..ec7784189dc2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -83,6 +83,9 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_arm) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_aff_change) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_eq_rearm) },
 };
 
@@ -149,6 +152,9 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_cache_empty += rq_stats->cache_empty;
 		s->rx_cache_busy  += rq_stats->cache_busy;
 		s->rx_cache_waive += rq_stats->cache_waive;
+		s->ch_poll        += ch_stats->poll;
+		s->ch_arm         += ch_stats->arm;
+		s->ch_aff_change  += ch_stats->aff_change;
 		s->ch_eq_rearm += ch_stats->eq_rearm;
 
 		for (j = 0; j < priv->max_opened_tc; j++) {
@@ -1153,6 +1159,9 @@ static const struct counter_desc sq_stats_desc[] = {
 };
 
 static const struct counter_desc ch_stats_desc[] = {
+	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, poll) },
+	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, arm) },
+	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, aff_change) },
 	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, eq_rearm) },
 };
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index b598a21bb4d6..0cd08b9f46ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -94,6 +94,9 @@ struct mlx5e_sw_stats {
 	u64 rx_cache_empty;
 	u64 rx_cache_busy;
 	u64 rx_cache_waive;
+	u64 ch_poll;
+	u64 ch_arm;
+	u64 ch_aff_change;
 	u64 ch_eq_rearm;
 
 #ifdef CONFIG_MLX5_EN_TLS
@@ -217,6 +220,9 @@ struct mlx5e_sq_stats {
 };
 
 struct mlx5e_ch_stats {
+	u64 poll;
+	u64 arm;
+	u64 aff_change;
 	u64 eq_rearm;
 };
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 1b17f682693b..9f6e97883cbc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -74,10 +74,13 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct mlx5e_channel *c = container_of(napi, struct mlx5e_channel,
 					       napi);
+	struct mlx5e_ch_stats *ch_stats = c->stats;
 	bool busy = false;
 	int work_done = 0;
 	int i;
 
+	ch_stats->poll++;
+
 	for (i = 0; i < c->num_tc; i++)
 		busy |= mlx5e_poll_tx_cq(&c->sq[i].cq, budget);
 
@@ -94,6 +97,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 	if (busy) {
 		if (likely(mlx5e_channel_no_affinity_change(c)))
 			return budget;
+		ch_stats->aff_change++;
 		if (budget && work_done == budget)
 			work_done--;
 	}
@@ -101,6 +105,8 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
 	if (unlikely(!napi_complete_done(napi, work_done)))
 		return work_done;
 
+	ch_stats->arm++;
+
 	for (i = 0; i < c->num_tc; i++) {
 		mlx5e_handle_tx_dim(&c->sq[i]);
 		mlx5e_cq_arm(&c->sq[i].cq);
-- 
2.17.0


* [net-next 08/12] net/mlx5e: Add a counter for congested UMRs
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (6 preceding siblings ...)
  2018-06-28 21:50 ` [net-next 07/12] net/mlx5e: Add NAPI statistics Saeed Mahameed
@ 2018-06-28 21:50 ` Saeed Mahameed
  2018-06-28 21:51 ` [net-next 09/12] net/mlx5e: Add channel events counter Saeed Mahameed
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Add per-ring and global ethtool counters for congested UMR requests.
These events indicate congestion in the UMR handlers in HW.

Such an event is counted when there is an outstanding UMR post while
the SW has, in the meantime, consumed at least two additional MPWQEs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       | 5 +++++
 4 files changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 733c5d2c99f2..6f20ce76c11c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -601,6 +601,8 @@ bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
 
 	if (!rq->mpwqe.umr_in_progress)
 		mlx5e_alloc_rx_mpwqe(rq, wq->head);
+	else
+		rq->stats->congst_umr += mlx5_wq_ll_missing(wq) > 2;
 
 	return false;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index ec7784189dc2..dd3b5a028a97 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -83,6 +83,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_arm) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_aff_change) },
@@ -152,6 +153,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_cache_empty += rq_stats->cache_empty;
 		s->rx_cache_busy  += rq_stats->cache_busy;
 		s->rx_cache_waive += rq_stats->cache_waive;
+		s->rx_congst_umr  += rq_stats->congst_umr;
 		s->ch_poll        += ch_stats->poll;
 		s->ch_arm         += ch_stats->arm;
 		s->ch_aff_change  += ch_stats->aff_change;
@@ -1135,6 +1137,7 @@ static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_empty) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_busy) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_waive) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, congst_umr) },
 };
 
 static const struct counter_desc sq_stats_desc[] = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 0cd08b9f46ff..4e54cb86fece 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -94,6 +94,7 @@ struct mlx5e_sw_stats {
 	u64 rx_cache_empty;
 	u64 rx_cache_busy;
 	u64 rx_cache_waive;
+	u64 rx_congst_umr;
 	u64 ch_poll;
 	u64 ch_arm;
 	u64 ch_aff_change;
@@ -188,6 +189,7 @@ struct mlx5e_rq_stats {
 	u64 cache_empty;
 	u64 cache_busy;
 	u64 cache_waive;
+	u64 congst_umr;
 };
 
 struct mlx5e_sq_stats {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index 0b47126815b6..2bd4c3184eba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -229,6 +229,11 @@ static inline int mlx5_wq_ll_is_empty(struct mlx5_wq_ll *wq)
 	return !wq->cur_sz;
 }
 
+static inline int mlx5_wq_ll_missing(struct mlx5_wq_ll *wq)
+{
+	return wq->fbc.sz_m1 - wq->cur_sz;
+}
+
 static inline void *mlx5_wq_ll_get_wqe(struct mlx5_wq_ll *wq, u16 ix)
 {
 	return mlx5_frag_buf_get_wqe(&wq->fbc, ix);
-- 
2.17.0


* [net-next 09/12] net/mlx5e: Add channel events counter
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (7 preceding siblings ...)
  2018-06-28 21:50 ` [net-next 08/12] net/mlx5e: Add a counter for congested UMRs Saeed Mahameed
@ 2018-06-28 21:51 ` Saeed Mahameed
  2018-06-28 21:51 ` [net-next 10/12] net/mlx5e: Add counter for MPWQE filler strides Saeed Mahameed
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Add per-channel and global ethtool counters for channel events.
Each event indicates an interrupt on one of the channel's
completion queues.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c  | 3 ++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index dd3b5a028a97..0c18b20c2c18 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -84,6 +84,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_events) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_arm) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_aff_change) },
@@ -154,6 +155,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_cache_busy  += rq_stats->cache_busy;
 		s->rx_cache_waive += rq_stats->cache_waive;
 		s->rx_congst_umr  += rq_stats->congst_umr;
+		s->ch_events      += ch_stats->events;
 		s->ch_poll        += ch_stats->poll;
 		s->ch_arm         += ch_stats->arm;
 		s->ch_aff_change  += ch_stats->aff_change;
@@ -1162,6 +1164,7 @@ static const struct counter_desc sq_stats_desc[] = {
 };
 
 static const struct counter_desc ch_stats_desc[] = {
+	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, events) },
 	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, poll) },
 	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, arm) },
 	{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, aff_change) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 4e54cb86fece..70a05298851e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -95,6 +95,7 @@ struct mlx5e_sw_stats {
 	u64 rx_cache_busy;
 	u64 rx_cache_waive;
 	u64 rx_congst_umr;
+	u64 ch_events;
 	u64 ch_poll;
 	u64 ch_arm;
 	u64 ch_aff_change;
@@ -222,6 +223,7 @@ struct mlx5e_sq_stats {
 };
 
 struct mlx5e_ch_stats {
+	u64 events;
 	u64 poll;
 	u64 arm;
 	u64 aff_change;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 9f6e97883cbc..4e1f99a98d5d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -124,8 +124,9 @@ void mlx5e_completion_event(struct mlx5_core_cq *mcq)
 {
 	struct mlx5e_cq *cq = container_of(mcq, struct mlx5e_cq, mcq);
 
-	cq->event_ctr++;
 	napi_schedule(cq->napi);
+	cq->event_ctr++;
+	cq->channel->stats->events++;
 }
 
 void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event)
-- 
2.17.0


* [net-next 10/12] net/mlx5e: Add counter for MPWQE filler strides
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (8 preceding siblings ...)
  2018-06-28 21:51 ` [net-next 09/12] net/mlx5e: Add channel events counter Saeed Mahameed
@ 2018-06-28 21:51 ` Saeed Mahameed
  2018-06-28 21:51 ` [net-next 11/12] net/mlx5e: Add counter for total num of NOP operations Saeed Mahameed
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

Add an ethtool counter to indicate the number of strides consumed
by filler CQEs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 5 ++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 9 ++++++---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 6 ++++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 6f20ce76c11c..f763a6aebc2d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1263,7 +1263,10 @@ void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 	}
 
 	if (unlikely(mpwrq_is_filler_cqe(cqe))) {
-		rq->stats->mpwqe_filler++;
+		struct mlx5e_rq_stats *stats = rq->stats;
+
+		stats->mpwqe_filler_cqes++;
+		stats->mpwqe_filler_strides += cstrides;
 		goto mpwrq_cqe_out;
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 0c18b20c2c18..76107ed3a651 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -73,7 +73,8 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_udp_seg_rem) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_wqe_err) },
-	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler_cqes) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler_strides) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_buff_alloc_err) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_blks) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_pkts) },
@@ -144,7 +145,8 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_xdp_tx_cqe  += rq_stats->xdp_tx_cqe;
 		s->rx_xdp_tx_full += rq_stats->xdp_tx_full;
 		s->rx_wqe_err   += rq_stats->wqe_err;
-		s->rx_mpwqe_filler += rq_stats->mpwqe_filler;
+		s->rx_mpwqe_filler_cqes    += rq_stats->mpwqe_filler_cqes;
+		s->rx_mpwqe_filler_strides += rq_stats->mpwqe_filler_strides;
 		s->rx_buff_alloc_err += rq_stats->buff_alloc_err;
 		s->rx_cqe_compress_blks += rq_stats->cqe_compress_blks;
 		s->rx_cqe_compress_pkts += rq_stats->cqe_compress_pkts;
@@ -1129,7 +1131,8 @@ static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, removed_vlan_packets) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, wqe_err) },
-	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_cqes) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_strides) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, buff_alloc_err) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_blks) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 70a05298851e..1d641b012afa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -84,7 +84,8 @@ struct mlx5e_sw_stats {
 	u64 tx_udp_seg_rem;
 	u64 tx_cqe_err;
 	u64 rx_wqe_err;
-	u64 rx_mpwqe_filler;
+	u64 rx_mpwqe_filler_cqes;
+	u64 rx_mpwqe_filler_strides;
 	u64 rx_buff_alloc_err;
 	u64 rx_cqe_compress_blks;
 	u64 rx_cqe_compress_pkts;
@@ -180,7 +181,8 @@ struct mlx5e_rq_stats {
 	u64 xdp_tx_cqe;
 	u64 xdp_tx_full;
 	u64 wqe_err;
-	u64 mpwqe_filler;
+	u64 mpwqe_filler_cqes;
+	u64 mpwqe_filler_strides;
 	u64 buff_alloc_err;
 	u64 cqe_compress_blks;
 	u64 cqe_compress_pkts;
-- 
2.17.0


* [net-next 11/12] net/mlx5e: Add counter for total num of NOP operations
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (9 preceding siblings ...)
  2018-06-28 21:51 ` [net-next 10/12] net/mlx5e: Add counter for MPWQE filler strides Saeed Mahameed
@ 2018-06-28 21:51 ` Saeed Mahameed
  2018-06-28 21:51 ` [net-next 12/12] net/mlx5e: Update NIC HW stats on demand only Saeed Mahameed
  2018-06-29 14:56 ` [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 David Miller
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed

From: Tariq Toukan <tariqt@mellanox.com>

A per-ring counter for NOP operations already exists.
Here I add a counter that sums them up.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 2 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 76107ed3a651..c0507fada0be 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -44,6 +44,7 @@ static const struct counter_desc sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tso_inner_packets) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tso_inner_bytes) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_added_vlan_packets) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_nop) },
 
 #ifdef CONFIG_MLX5_EN_TLS
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tls_ooo) },
@@ -173,6 +174,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 			s->tx_tso_inner_packets	+= sq_stats->tso_inner_packets;
 			s->tx_tso_inner_bytes	+= sq_stats->tso_inner_bytes;
 			s->tx_added_vlan_packets += sq_stats->added_vlan_packets;
+			s->tx_nop               += sq_stats->nop;
 			s->tx_queue_stopped	+= sq_stats->stopped;
 			s->tx_queue_wake	+= sq_stats->wake;
 			s->tx_udp_seg_rem	+= sq_stats->udp_seg_rem;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 1d641b012afa..fc3f66003edd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -61,6 +61,7 @@ struct mlx5e_sw_stats {
 	u64 tx_tso_inner_packets;
 	u64 tx_tso_inner_bytes;
 	u64 tx_added_vlan_packets;
+	u64 tx_nop;
 	u64 rx_lro_packets;
 	u64 rx_lro_bytes;
 	u64 rx_removed_vlan_packets;
-- 
2.17.0


* [net-next 12/12] net/mlx5e: Update NIC HW stats on demand only
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (10 preceding siblings ...)
  2018-06-28 21:51 ` [net-next 11/12] net/mlx5e: Add counter for total num of NOP operations Saeed Mahameed
@ 2018-06-28 21:51 ` Saeed Mahameed
  2018-06-29 14:56 ` [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 David Miller
  12 siblings, 0 replies; 22+ messages in thread
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed

Disable the periodic stats update background thread and update the
stats in the background on demand, when ndo_get_stats is called.

Having a background thread running in the driver all the time is bad
for power consumption. Normally, a user space daemon will query the
stats once every specific interval anyway, so ideally the periodic
polling and its interval are better left to user space.
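
As an illustration only (not part of this patch), a trivial user space
poller along the lines of the sketch below is the kind of consumer this
change targets: each read of the sysfs statistics files ends up in the
driver's get-stats ndo (mlx5e_get_stats), which now kicks the HW stats
update on demand. The "eth0" interface name and the 1-second interval
are arbitrary assumptions:

#include <stdio.h>
#include <unistd.h>

static unsigned long long read_counter(const char *path)
{
	unsigned long long val = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%llu", &val) != 1)
			val = 0;
		fclose(f);
	}
	return val;
}

int main(void)
{
	for (;;) {
		unsigned long long rx_pkts =
			read_counter("/sys/class/net/eth0/statistics/rx_packets");
		unsigned long long rx_drop =
			read_counter("/sys/class/net/eth0/statistics/rx_dropped");

		printf("rx_packets=%llu rx_dropped=%llu\n", rx_pkts, rx_drop);
		sleep(1);
	}
	return 0;
}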

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  1 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++-----
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c  |  3 +++
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index eb9eb7aa953a..e2b7586ed7a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -137,7 +137,6 @@ struct page_pool;
 #define MLX5E_MAX_NUM_CHANNELS         (MLX5E_INDIR_RQT_SIZE >> 1)
 #define MLX5E_MAX_NUM_SQS              (MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC)
 #define MLX5E_TX_CQ_POLL_BUDGET        128
-#define MLX5E_UPDATE_STATS_INTERVAL    200 /* msecs */
 #define MLX5E_SQ_RECOVER_MIN_INTERVAL  500 /* msecs */
 
 #define MLX5E_UMR_WQE_INLINE_SZ \
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 42ef8c818544..ba9ae05efe09 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -270,12 +270,9 @@ void mlx5e_update_stats_work(struct work_struct *work)
 	struct delayed_work *dwork = to_delayed_work(work);
 	struct mlx5e_priv *priv = container_of(dwork, struct mlx5e_priv,
 					       update_stats_work);
+
 	mutex_lock(&priv->state_lock);
-	if (test_bit(MLX5E_STATE_OPENED, &priv->state)) {
-		priv->profile->update_stats(priv);
-		queue_delayed_work(priv->wq, dwork,
-				   msecs_to_jiffies(MLX5E_UPDATE_STATS_INTERVAL));
-	}
+	priv->profile->update_stats(priv);
 	mutex_unlock(&priv->state_lock);
 }
 
@@ -3405,6 +3402,9 @@ mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	struct mlx5e_vport_stats *vstats = &priv->stats.vport;
 	struct mlx5e_pport_stats *pstats = &priv->stats.pport;
 
+	/* update HW stats in background for next time */
+	queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
+
 	if (mlx5e_is_uplink_rep(priv)) {
 		stats->rx_packets = PPORT_802_3_GET(pstats, a_frames_received_ok);
 		stats->rx_bytes   = PPORT_802_3_GET(pstats, a_octets_received_ok);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 3f2fe95e01d9..7db7552b7e3f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -893,6 +893,9 @@ mlx5e_rep_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
 
+	/* update HW stats in background for next time */
+	queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
+
 	memcpy(stats, &priv->stats.vf_vport, sizeof(*stats));
 }
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28
  2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
                   ` (11 preceding siblings ...)
  2018-06-28 21:51 ` [net-next 12/12] net/mlx5e: Update NIC HW stats on demand only Saeed Mahameed
@ 2018-06-29 14:56 ` David Miller
  12 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2018-06-29 14:56 UTC (permalink / raw)
  To: saeedm; +Cc: netdev, ogerlitz

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Thu, 28 Jun 2018 14:50:51 -0700

> The following pull request includes updates for mlx5e netdevice driver.
> For more information please see tag log below.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks Saeed.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-06-28 21:50 ` [net-next 01/12] net/mlx5e: Add UDP GSO support Saeed Mahameed
@ 2018-06-29 22:19   ` Willem de Bruijn
  2018-06-30 16:06     ` Boris Pismenny
  0 siblings, 1 reply; 22+ messages in thread
From: Willem de Bruijn @ 2018-06-29 22:19 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David Miller, Network Development, ogerlitz, borisp, yossiku,
	Alexander Duyck

On Fri, Jun 29, 2018 at 2:24 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> From: Boris Pismenny <borisp@mellanox.com>
>
> This patch enables UDP GSO support. We enable this by using two WQEs
> the first is a UDP LSO WQE for all segments with equal length, and the
> second is for the last segment in case it has different length.
> Due to HW limitation, before sending, we must adjust the packet length fields.
>
> We measure performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
> machines connected back-to-back with Connectx4-Lx (40Gbps) NICs.
> We compare single stream UDP, UDP GSO and UDP GSO with offload.
> Performance:
>                 | MSS (bytes)   | Throughput (Gbps)     | CPU utilization (%)
> UDP GSO offload | 1472          | 35.6                  | 8%
> UDP GSO         | 1472          | 25.5                  | 17%
> UDP             | 1472          | 10.2                  | 17%
> UDP GSO offload | 1024          | 35.6                  | 8%
> UDP GSO         | 1024          | 19.2                  | 17%
> UDP             | 1024          | 5.7                   | 17%
> UDP GSO offload | 512           | 33.8                  | 16%
> UDP GSO         | 512           | 10.4                  | 17%
> UDP             | 512           | 3.5                   | 17%

Very nice results :)

> +static void mlx5e_udp_gso_prepare_last_skb(struct sk_buff *skb,
> +                                          struct sk_buff *nskb,
> +                                          int remaining)
> +{
> +       int bytes_needed = remaining, remaining_headlen, remaining_page_offset;
> +       int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
> +       int payload_len = remaining + sizeof(struct udphdr);
> +       int k = 0, i, j;
> +
> +       skb_copy_bits(skb, 0, nskb->data, headlen);
> +       nskb->dev = skb->dev;
> +       skb_reset_mac_header(nskb);
> +       skb_set_network_header(nskb, skb_network_offset(skb));
> +       skb_set_transport_header(nskb, skb_transport_offset(skb));
> +       skb_set_tail_pointer(nskb, headlen);
> +
> +       /* How many frags do we need? */
> +       for (i = skb_shinfo(skb)->nr_frags - 1; i >= 0; i--) {
> +               bytes_needed -= skb_frag_size(&skb_shinfo(skb)->frags[i]);
> +               k++;
> +               if (bytes_needed <= 0)
> +                       break;
> +       }
> +
> +       /* Fill the first frag and split it if necessary */
> +       j = skb_shinfo(skb)->nr_frags - k;
> +       remaining_page_offset = -bytes_needed;
> +       skb_fill_page_desc(nskb, 0,
> +                          skb_shinfo(skb)->frags[j].page.p,
> +                          skb_shinfo(skb)->frags[j].page_offset + remaining_page_offset,
> +                          skb_shinfo(skb)->frags[j].size - remaining_page_offset);
> +
> +       skb_frag_ref(skb, j);
> +
> +       /* Fill the rest of the frags */
> +       for (i = 1; i < k; i++) {
> +               j = skb_shinfo(skb)->nr_frags - k + i;
> +
> +               skb_fill_page_desc(nskb, i,
> +                                  skb_shinfo(skb)->frags[j].page.p,
> +                                  skb_shinfo(skb)->frags[j].page_offset,
> +                                  skb_shinfo(skb)->frags[j].size);
> +               skb_frag_ref(skb, j);
> +       }
> +       skb_shinfo(nskb)->nr_frags = k;
> +
> +       remaining_headlen = remaining - skb->data_len;
> +
> +       /* headlen contains remaining data? */
> +       if (remaining_headlen > 0)
> +               skb_copy_bits(skb, skb->len - remaining, nskb->data + headlen,
> +                             remaining_headlen);
> +       nskb->len = remaining + headlen;
> +       nskb->data_len =  payload_len - sizeof(struct udphdr) +
> +               max_t(int, 0, remaining_headlen);
> +       nskb->protocol = skb->protocol;
> +       if (nskb->protocol == htons(ETH_P_IP)) {
> +               ip_hdr(nskb)->id = htons(ntohs(ip_hdr(nskb)->id) +
> +                                        skb_shinfo(skb)->gso_segs);
> +               ip_hdr(nskb)->tot_len =
> +                       htons(payload_len + sizeof(struct iphdr));
> +       } else {
> +               ipv6_hdr(nskb)->payload_len = htons(payload_len);
> +       }
> +       udp_hdr(nskb)->len = htons(payload_len);
> +       skb_shinfo(nskb)->gso_size = 0;
> +       nskb->ip_summed = skb->ip_summed;
> +       nskb->csum_start = skb->csum_start;
> +       nskb->csum_offset = skb->csum_offset;
> +       nskb->queue_mapping = skb->queue_mapping;
> +}
> +
> +/* might send skbs and update wqe and pi */
> +struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
> +                                           struct mlx5e_txqsq *sq,
> +                                           struct sk_buff *skb,
> +                                           struct mlx5e_tx_wqe **wqe,
> +                                           u16 *pi)
> +{
> +       int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);
> +       int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
> +       int remaining = (skb->len - headlen) % skb_shinfo(skb)->gso_size;
> +       struct sk_buff *nskb;
> +
> +       if (skb->protocol == htons(ETH_P_IP))
> +               ip_hdr(skb)->tot_len = htons(payload_len + sizeof(struct iphdr));
> +       else
> +               ipv6_hdr(skb)->payload_len = htons(payload_len);
> +       udp_hdr(skb)->len = htons(payload_len);
> +       if (!remaining)
> +               return skb;
> +
> +       nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
> +       if (unlikely(!nskb)) {
> +               sq->stats->dropped++;
> +               return NULL;
> +       }
> +
> +       mlx5e_udp_gso_prepare_last_skb(skb, nskb, remaining);
> +
> +       skb_shinfo(skb)->gso_segs--;
> +       pskb_trim(skb, skb->len - remaining);
> +       mlx5e_sq_xmit(sq, skb, *wqe, *pi);
> +       mlx5e_sq_fetch_wqe(sq, wqe, pi);
> +       return nskb;
> +}

The device driver seems to be implementing the packet split here
similar to NETIF_F_GSO_PARTIAL. When advertising the right flag, the
stack should be able to do that for you and pass two packets to the
driver.
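
For context, a minimal sketch of what advertising that could look like in
a driver's feature setup. The flag names below are the standard netdev
feature bits; the helper name and its placement are assumptions for
illustration only, not something this patch adds:

#include <linux/netdevice.h>

/* Hypothetical helper, sketch only: advertise UDP GSO through GSO partial
 * so the core splits off the non-MSS-multiple remainder itself and hands
 * the driver an MSS-multiple skb plus a separate trailing segment.
 */
static void advertise_udp_gso_partial(struct net_device *netdev)
{
	netdev->hw_features          |= NETIF_F_GSO_UDP_L4;
	netdev->features             |= NETIF_F_GSO_UDP_L4;
	netdev->gso_partial_features |= NETIF_F_GSO_UDP_L4;
	netdev->hw_features          |= NETIF_F_GSO_PARTIAL;
	netdev->features             |= NETIF_F_GSO_PARTIAL;
}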

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-06-29 22:19   ` Willem de Bruijn
@ 2018-06-30 16:06     ` Boris Pismenny
  2018-06-30 19:40       ` Boris Pismenny
  0 siblings, 1 reply; 22+ messages in thread
From: Boris Pismenny @ 2018-06-30 16:06 UTC (permalink / raw)
  To: Willem de Bruijn, Saeed Mahameed
  Cc: David Miller, Network Development, ogerlitz, yossiku, Alexander Duyck



On 06/30/18 01:19, Willem de Bruijn wrote:
> On Fri, Jun 29, 2018 at 2:24 AM Saeed Mahameed <saeedm@mellanox.com> wrote:
>> From: Boris Pismenny <borisp@mellanox.com>
>>
>> This patch enables UDP GSO support. We enable this by using two WQEs
>> the first is a UDP LSO WQE for all segments with equal length, and the
>> second is for the last segment in case it has different length.
>> Due to HW limitation, before sending, we must adjust the packet length fields.
>>
>> We measure performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
>> machines connected back-to-back with Connectx4-Lx (40Gbps) NICs.
>> We compare single stream UDP, UDP GSO and UDP GSO with offload.
>> Performance:
>>                  | MSS (bytes)   | Throughput (Gbps)     | CPU utilization (%)
>> UDP GSO offload | 1472          | 35.6                  | 8%
>> UDP GSO         | 1472          | 25.5                  | 17%
>> UDP             | 1472          | 10.2                  | 17%
>> UDP GSO offload | 1024          | 35.6                  | 8%
>> UDP GSO         | 1024          | 19.2                  | 17%
>> UDP             | 1024          | 5.7                   | 17%
>> UDP GSO offload | 512           | 33.8                  | 16%
>> UDP GSO         | 512           | 10.4                  | 17%
>> UDP             | 512           | 3.5                   | 17%
> Very nice results :)
Thanks. We expect to have 100Gbps results by netdev0x12.

>> +static void mlx5e_udp_gso_prepare_last_skb(struct sk_buff *skb,
>> +                                          struct sk_buff *nskb,
>> +                                          int remaining)
>> +{
>> +       int bytes_needed = remaining, remaining_headlen, remaining_page_offset;
>> +       int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
>> +       int payload_len = remaining + sizeof(struct udphdr);
>> +       int k = 0, i, j;
>> +
>> +       skb_copy_bits(skb, 0, nskb->data, headlen);
>> +       nskb->dev = skb->dev;
>> +       skb_reset_mac_header(nskb);
>> +       skb_set_network_header(nskb, skb_network_offset(skb));
>> +       skb_set_transport_header(nskb, skb_transport_offset(skb));
>> +       skb_set_tail_pointer(nskb, headlen);
>> +
>> +       /* How many frags do we need? */
>> +       for (i = skb_shinfo(skb)->nr_frags - 1; i >= 0; i--) {
>> +               bytes_needed -= skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> +               k++;
>> +               if (bytes_needed <= 0)
>> +                       break;
>> +       }
>> +
>> +       /* Fill the first frag and split it if necessary */
>> +       j = skb_shinfo(skb)->nr_frags - k;
>> +       remaining_page_offset = -bytes_needed;
>> +       skb_fill_page_desc(nskb, 0,
>> +                          skb_shinfo(skb)->frags[j].page.p,
>> +                          skb_shinfo(skb)->frags[j].page_offset + remaining_page_offset,
>> +                          skb_shinfo(skb)->frags[j].size - remaining_page_offset);
>> +
>> +       skb_frag_ref(skb, j);
>> +
>> +       /* Fill the rest of the frags */
>> +       for (i = 1; i < k; i++) {
>> +               j = skb_shinfo(skb)->nr_frags - k + i;
>> +
>> +               skb_fill_page_desc(nskb, i,
>> +                                  skb_shinfo(skb)->frags[j].page.p,
>> +                                  skb_shinfo(skb)->frags[j].page_offset,
>> +                                  skb_shinfo(skb)->frags[j].size);
>> +               skb_frag_ref(skb, j);
>> +       }
>> +       skb_shinfo(nskb)->nr_frags = k;
>> +
>> +       remaining_headlen = remaining - skb->data_len;
>> +
>> +       /* headlen contains remaining data? */
>> +       if (remaining_headlen > 0)
>> +               skb_copy_bits(skb, skb->len - remaining, nskb->data + headlen,
>> +                             remaining_headlen);
>> +       nskb->len = remaining + headlen;
>> +       nskb->data_len =  payload_len - sizeof(struct udphdr) +
>> +               max_t(int, 0, remaining_headlen);
>> +       nskb->protocol = skb->protocol;
>> +       if (nskb->protocol == htons(ETH_P_IP)) {
>> +               ip_hdr(nskb)->id = htons(ntohs(ip_hdr(nskb)->id) +
>> +                                        skb_shinfo(skb)->gso_segs);
>> +               ip_hdr(nskb)->tot_len =
>> +                       htons(payload_len + sizeof(struct iphdr));
>> +       } else {
>> +               ipv6_hdr(nskb)->payload_len = htons(payload_len);
>> +       }
>> +       udp_hdr(nskb)->len = htons(payload_len);
>> +       skb_shinfo(nskb)->gso_size = 0;
>> +       nskb->ip_summed = skb->ip_summed;
>> +       nskb->csum_start = skb->csum_start;
>> +       nskb->csum_offset = skb->csum_offset;
>> +       nskb->queue_mapping = skb->queue_mapping;
>> +}
>> +
>> +/* might send skbs and update wqe and pi */
>> +struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
>> +                                           struct mlx5e_txqsq *sq,
>> +                                           struct sk_buff *skb,
>> +                                           struct mlx5e_tx_wqe **wqe,
>> +                                           u16 *pi)
>> +{
>> +       int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);
>> +       int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
>> +       int remaining = (skb->len - headlen) % skb_shinfo(skb)->gso_size;
>> +       struct sk_buff *nskb;
>> +
>> +       if (skb->protocol == htons(ETH_P_IP))
>> +               ip_hdr(skb)->tot_len = htons(payload_len + sizeof(struct iphdr));
>> +       else
>> +               ipv6_hdr(skb)->payload_len = htons(payload_len);
>> +       udp_hdr(skb)->len = htons(payload_len);
>> +       if (!remaining)
>> +               return skb;
>> +
>> +       nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
>> +       if (unlikely(!nskb)) {
>> +               sq->stats->dropped++;
>> +               return NULL;
>> +       }
>> +
>> +       mlx5e_udp_gso_prepare_last_skb(skb, nskb, remaining);
>> +
>> +       skb_shinfo(skb)->gso_segs--;
>> +       pskb_trim(skb, skb->len - remaining);
>> +       mlx5e_sq_xmit(sq, skb, *wqe, *pi);
>> +       mlx5e_sq_fetch_wqe(sq, wqe, pi);
>> +       return nskb;
>> +}
> The device driver seems to be implementing the packet split here
> similar to NETIF_F_GSO_PARTIAL. When advertising the right flag, the
> stack should be able to do that for you and pass two packets to the
> driver.

We've totally missed that!
Thank you. I'll fix and resubmit.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-06-30 16:06     ` Boris Pismenny
@ 2018-06-30 19:40       ` Boris Pismenny
       [not found]         ` <CAKgT0UfFMG+i9z_nxB0vkJesDE4CHWbudCh6kJ_1+=Vd1hv=wA@mail.gmail.com>
  0 siblings, 1 reply; 22+ messages in thread
From: Boris Pismenny @ 2018-06-30 19:40 UTC (permalink / raw)
  To: Willem de Bruijn, Saeed Mahameed
  Cc: David Miller, Network Development, ogerlitz, yossiku, Alexander Duyck


On 06/30/18 19:06, Boris Pismenny wrote:
>
>
> On 06/30/18 01:19, Willem de Bruijn wrote:
> [...]
>> The device driver seems to be implementing the packet split here
>> similar to NETIF_F_GSO_PARTIAL. When advertising the right flag, the
>> stack should be able to do that for you and pass two packets to the
>> driver.
>
> We've totally missed that!
> Thank you. I'll fix and resubmit.

I've noticed that we could get cleaner code in our driver if we remove 
these two lines from net/ipv4/udp_offload.c:
if (skb_is_gso(segs))
              mss *= skb_shinfo(segs)->gso_segs;

I think that this is correct in the case of GSO_PARTIAL segmentation, for
the following reasons:
1. After this change the UDP length field is consistent with the IP
header's payload length field. Currently, the IPv4 total length is 1500
while the UDP length field carries the full unsegmented length.
2. AFAIU, in GSO_PARTIAL no tunnel headers should be modified except the
IP ID field, and that includes leaving the UDP length field untouched.
What do you think?
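
To make point 1 concrete, a tiny standalone illustration (plain userspace
C, not kernel code; the MSS, segment count and header sizes are assumed
example values) of how the length fields compare with and without the
"mss *= gso_segs" lines above:

#include <stdio.h>

int main(void)
{
	unsigned int mss = 1472, gso_segs = 10;	/* assumed example values */
	unsigned int udphdr_len = 8, iphdr_len = 20;

	/* today: udph->len is written with the whole unsegmented payload */
	printf("UDP length with the multiply:    %u\n",
	       mss * gso_segs + udphdr_len);
	/* proposed: udph->len matches the per-segment IP total length */
	printf("UDP length without the multiply: %u\n", mss + udphdr_len);
	printf("per-segment IPv4 tot_len:        %u\n",
	       mss + udphdr_len + iphdr_len);
	return 0;
}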

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
       [not found]         ` <CAKgT0UfFMG+i9z_nxB0vkJesDE4CHWbudCh6kJ_1+=Vd1hv=wA@mail.gmail.com>
@ 2018-07-02  1:45           ` Willem de Bruijn
  2018-07-02  5:29             ` Boris Pismenny
  0 siblings, 1 reply; 22+ messages in thread
From: Willem de Bruijn @ 2018-07-02  1:45 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: borisp, David Miller, Network Development, Saeed Mahameed,
	ogerlitz, yossiku

>> I've noticed that we could get cleaner code in our driver if we remove
>> these two lines from net/ipv4/udp_offload.c:
>> if (skb_is_gso(segs))
>>               mss *= skb_shinfo(segs)->gso_segs;
>>
>> I think that this is correct in case of GSO_PARTIAL segmentation for the
>> following reasons:
>> 1. After this change the UDP payload field is consistent with the IP
>> header payload length field. Currently, IPv4 length is 1500 and UDP
>> total length is the full unsegmented length.

How does this simplify the driver? Does it currently have to
change the udph->length field to the mss on the wire, because the
device only splits + replicates the headers + computes the csum?

> I don’t recall that the UDP header length field will match the IP length
> field. I had intentionally left it at the original value used to compute
> the UDP header checksum. That way you could just adjust it by
> cancelling out the length from the partial checksum.

>> 2. AFAIU, in GSO_PARTIAL no tunnel headers should be modified except the
>> IP ID field, including the UDP length field.
>> What do you think?
>
>
> For outer headers this is correct. For inner headers this is not. The inner
> UDP header will need to have the length updated and the checksum
> recomputed as a part of the segmentation.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-07-02  1:45           ` Willem de Bruijn
@ 2018-07-02  5:29             ` Boris Pismenny
  2018-07-02 13:34               ` Willem de Bruijn
  0 siblings, 1 reply; 22+ messages in thread
From: Boris Pismenny @ 2018-07-02  5:29 UTC (permalink / raw)
  To: Willem de Bruijn, Alexander Duyck
  Cc: David Miller, Network Development, Saeed Mahameed, ogerlitz, yossiku



On 7/2/2018 4:45 AM, Willem de Bruijn wrote:
>>> I've noticed that we could get cleaner code in our driver if we remove
>>> these two lines from net/ipv4/udp_offload.c:
>>> if (skb_is_gso(segs))
>>>                mss *= skb_shinfo(segs)->gso_segs;
>>>
>>> I think that this is correct in case of GSO_PARTIAL segmentation for the
>>> following reasons:
>>> 1. After this change the UDP payload field is consistent with the IP
>>> header payload length field. Currently, IPv4 length is 1500 and UDP
>>> total length is the full unsegmented length.
> 
> How does this simplify the driver? Does it currently have to
> change the udph->length field to the mss on the wire, because the
> device only splits + replicates the headers + computes the csum?

Yes, this is the code I have at the moment.

The device's limitation is more subtle than this. It could adjust the 
length, but then the checksum would be wrong.
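
For context, a rough sketch of why (assuming the usual CHECKSUM_PARTIAL
convention, where the stack seeds uh->check with the folded pseudo-header
sum and the device adds in the L4 header plus payload):

	wire_csum = ~fold(pseudo(saddr, daddr, IPPROTO_UDP, len)
			  + sum(udp header + payload))

The length feeds into the pseudo-header seed as well as into the uh->len
word the device sums, so a device that rewrites uh->len to the per-segment
value while reusing the seed computed for the full length ends up with a
checksum that is off by the difference between the two lengths.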

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-07-02  5:29             ` Boris Pismenny
@ 2018-07-02 13:34               ` Willem de Bruijn
  2018-07-02 14:45                 ` Willem de Bruijn
  0 siblings, 1 reply; 22+ messages in thread
From: Willem de Bruijn @ 2018-07-02 13:34 UTC (permalink / raw)
  To: borisp
  Cc: Alexander Duyck, David Miller, Network Development,
	Saeed Mahameed, ogerlitz, yossiku

On Mon, Jul 2, 2018 at 1:30 AM Boris Pismenny <borisp@mellanox.com> wrote:
>
>
>
> On 7/2/2018 4:45 AM, Willem de Bruijn wrote:
> >>> I've noticed that we could get cleaner code in our driver if we remove
> >>> these two lines from net/ipv4/udp_offload.c:
> >>> if (skb_is_gso(segs))
> >>>                mss *= skb_shinfo(segs)->gso_segs;
> >>>
> >>> I think that this is correct in case of GSO_PARTIAL segmentation for the
> >>> following reasons:
> >>> 1. After this change the UDP payload field is consistent with the IP
> >>> header payload length field. Currently, IPv4 length is 1500 and UDP
> >>> total length is the full unsegmented length.
> >
> > How does this simplify the driver? Does it currently have to
> > change the udph->length field to the mss on the wire, because the
> > device only splits + replicates the headers + computes the csum?
>
> Yes, this is the code I have at the moment.
>
> The device's limitation is more subtle than this. It could adjust the
> length, but then the checksum would be wrong.

I see. We do have to keep in mind other devices. Alexander's ixgbe
RFC patch does not have this logic, so that device must update the
field directly.

  https://patchwork.ozlabs.org/patch/908396/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
  2018-07-02 13:34               ` Willem de Bruijn
@ 2018-07-02 14:45                 ` Willem de Bruijn
       [not found]                   ` <CAKgT0UdC2c04JagxW8S==-ymBfDZVd6f=7DcLUGjRqiiZA3BwA@mail.gmail.com>
  0 siblings, 1 reply; 22+ messages in thread
From: Willem de Bruijn @ 2018-07-02 14:45 UTC (permalink / raw)
  To: borisp
  Cc: Alexander Duyck, David Miller, Network Development,
	Saeed Mahameed, ogerlitz, yossiku

On Mon, Jul 2, 2018 at 9:34 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Mon, Jul 2, 2018 at 1:30 AM Boris Pismenny <borisp@mellanox.com> wrote:
> >
> >
> >
> > On 7/2/2018 4:45 AM, Willem de Bruijn wrote:
> > >>> I've noticed that we could get cleaner code in our driver if we remove
> > >>> these two lines from net/ipv4/udp_offload.c:
> > >>> if (skb_is_gso(segs))
> > >>>                mss *= skb_shinfo(segs)->gso_segs;
> > >>>
> > >>> I think that this is correct in case of GSO_PARTIAL segmentation for the
> > >>> following reasons:
> > >>> 1. After this change the UDP payload field is consistent with the IP
> > >>> header payload length field. Currently, IPv4 length is 1500 and UDP
> > >>> total length is the full unsegmented length.
> > >
> > > How does this simplify the driver? Does it currently have to
> > > change the udph->length field to the mss on the wire, because the
> > > device only splits + replicates the headers + computes the csum?
> >
> > Yes, this is the code I have at the moment.
> >
> > The device's limitation is more subtle than this. It could adjust the
> > length, but then the checksum would be wrong.
>
> I see. We do have to keep in mind other devices. Alexander's ixgbe
> RFC patch does not have this logic, so that device must update the
> field directly.
>
>   https://patchwork.ozlabs.org/patch/908396/

To be clear, I think it's fine to remove these two lines if it does not cause
problems for the ixgbe. It's quite possible that that device sets the udp
length field unconditionally, ignoring the previous value. In which case both
devices will work after this change without additional driver logic.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [net-next 01/12] net/mlx5e: Add UDP GSO support
       [not found]                   ` <CAKgT0UdC2c04JagxW8S==-ymBfDZVd6f=7DcLUGjRqiiZA3BwA@mail.gmail.com>
@ 2018-07-02 19:17                     ` Boris Pismenny
  0 siblings, 0 replies; 22+ messages in thread
From: Boris Pismenny @ 2018-07-02 19:17 UTC (permalink / raw)
  To: Alexander Duyck, Willem de Bruijn
  Cc: David Miller, Network Development, Saeed Mahameed, ogerlitz, yossiku



On 7/2/2018 6:32 PM, Alexander Duyck wrote:
> 
> 
> On Mon, Jul 2, 2018 at 7:46 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> 
>     On Mon, Jul 2, 2018 at 9:34 AM Willem de Bruijn
>     <willemdebruijn.kernel@gmail.com> wrote:
>      >
>      > On Mon, Jul 2, 2018 at 1:30 AM Boris Pismenny
>      > <borisp@mellanox.com> wrote:
>      > >
>      > >
>      > >
>      > > On 7/2/2018 4:45 AM, Willem de Bruijn wrote:
>      > > >>> I've noticed that we could get cleaner code in our driver
>     if we remove
>      > > >>> these two lines from net/ipv4/udp_offload.c:
>      > > >>> if (skb_is_gso(segs))
>      > > >>>                mss *= skb_shinfo(segs)->gso_segs;
>      > > >>>
>      > > >>> I think that this is correct in case of GSO_PARTIAL
>     segmentation for the
>      > > >>> following reasons:
>      > > >>> 1. After this change the UDP payload field is consistent
>     with the IP
>      > > >>> header payload length field. Currently, IPv4 length is 1500
>     and UDP
>      > > >>> total length is the full unsegmented length.
>      > > >
>      > > > How does this simplify the driver? Does it currently have to
>      > > > change the udph->length field to the mss on the wire, because the
>      > > > device only splits + replicates the headers + computes the csum?
>      > >
>      > > Yes, this is the code I have at the moment.
>      > >
>      > > The device's limitation is more subtle than this. It could
>     adjust the
>      > > length, but then the checksum would be wrong.
>      >
>      > I see. We do have to keep in mind other devices. Alexander's ixgbe
>      > RFC patch does not have this logic, so that device must update the
>      > field directly.
>      >
>      > https://patchwork.ozlabs.org/patch/908396/
> 
>     To be clear, I think it's fine to remove these two lines if it does
>     not cause
>     problems for the ixgbe. It's quite possible that that device sets
>     the udp
>     length field unconditionally, ignoring the previous value. In which
>     case both
>     devices will work after this change without additional driver logic.
> 
> 
> I would prefer we didn’t modify this. Otherwise we cannot cancel out the 
> length from the partial checksum when the time comes. The ixgbe code was 
> making use of it if I recall.
> 

AFAIU, the ixgbe patch doesn't use this. Instead the length is obtained 
by the following code for both TCP and UDP segmentation:
paylen = skb->len - l4_offset;

Could you please check to see if this is actually required?

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-07-02 19:17 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-28 21:50 [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 Saeed Mahameed
2018-06-28 21:50 ` [net-next 01/12] net/mlx5e: Add UDP GSO support Saeed Mahameed
2018-06-29 22:19   ` Willem de Bruijn
2018-06-30 16:06     ` Boris Pismenny
2018-06-30 19:40       ` Boris Pismenny
     [not found]         ` <CAKgT0UfFMG+i9z_nxB0vkJesDE4CHWbudCh6kJ_1+=Vd1hv=wA@mail.gmail.com>
2018-07-02  1:45           ` Willem de Bruijn
2018-07-02  5:29             ` Boris Pismenny
2018-07-02 13:34               ` Willem de Bruijn
2018-07-02 14:45                 ` Willem de Bruijn
     [not found]                   ` <CAKgT0UdC2c04JagxW8S==-ymBfDZVd6f=7DcLUGjRqiiZA3BwA@mail.gmail.com>
2018-07-02 19:17                     ` Boris Pismenny
2018-06-28 21:50 ` [net-next 02/12] net/mlx5e: Add UDP GSO remaining counter Saeed Mahameed
2018-06-28 21:50 ` [net-next 03/12] net/mlx5e: Convert large order kzalloc allocations to kvzalloc Saeed Mahameed
2018-06-28 21:50 ` [net-next 04/12] net/mlx5e: RX, Use existing WQ local variable Saeed Mahameed
2018-06-28 21:50 ` [net-next 05/12] net/mlx5e: Add TX completions statistics Saeed Mahameed
2018-06-28 21:50 ` [net-next 06/12] net/mlx5e: Add XDP_TX " Saeed Mahameed
2018-06-28 21:50 ` [net-next 07/12] net/mlx5e: Add NAPI statistics Saeed Mahameed
2018-06-28 21:50 ` [net-next 08/12] net/mlx5e: Add a counter for congested UMRs Saeed Mahameed
2018-06-28 21:51 ` [net-next 09/12] net/mlx5e: Add channel events counter Saeed Mahameed
2018-06-28 21:51 ` [net-next 10/12] net/mlx5e: Add counter for MPWQE filler strides Saeed Mahameed
2018-06-28 21:51 ` [net-next 11/12] net/mlx5e: Add counter for total num of NOP operations Saeed Mahameed
2018-06-28 21:51 ` [net-next 12/12] net/mlx5e: Update NIC HW stats on demand only Saeed Mahameed
2018-06-29 14:56 ` [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28 David Miller
