* [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support
@ 2022-12-07 14:12 Eric Dumazet
  2022-12-07 14:12 ` [PATCH v2 net-next 1/3] net/mlx4: rename two constants Eric Dumazet
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Eric Dumazet @ 2022-12-07 14:12 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet, Eric Dumazet

mlx4 uses a bounce buffer in TX whenever the tx descriptors
wrap around the right edge of the ring.

The size of this bounce buffer was hard-coded and can be
increased if/when needed.

v2: round up MLX4_TX_BOUNCE_BUFFER_SIZE to a multiple of TXBB_SIZE (Tariq)

Eric Dumazet (3):
  net/mlx4: rename two constants
  net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS
  net/mlx4: small optimization in mlx4_en_xmit()

 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 18 ++++++++++--------
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 18 +++++++++++++-----
 2 files changed, 23 insertions(+), 13 deletions(-)

-- 
2.39.0.rc0.267.gcb52ba06e7-goog



* [PATCH v2 net-next 1/3] net/mlx4: rename two constants
  2022-12-07 14:12 [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support Eric Dumazet
@ 2022-12-07 14:12 ` Eric Dumazet
  2022-12-07 18:53   ` Tariq Toukan
  2022-12-07 14:12 ` [PATCH v2 net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS Eric Dumazet
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2022-12-07 14:12 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet, Eric Dumazet

MAX_DESC_SIZE is really the size of the bounce buffer used
when reaching the right side of the TX ring buffer, so rename it
to MLX4_TX_BOUNCE_BUFFER_SIZE.

MAX_DESC_TXBBS gets an MLX4_ prefix.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 10 ++++++----
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  4 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 43a4102e9c091758b33aa7377dcb82cab7c43a94..8372aeb392a28cf36a454e1b8a4783bc2b2056eb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -65,7 +65,7 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
 	ring->size = size;
 	ring->size_mask = size - 1;
 	ring->sp_stride = stride;
-	ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS;
+	ring->full_size = ring->size - HEADROOM - MLX4_MAX_DESC_TXBBS;
 
 	tmp = size * sizeof(struct mlx4_en_tx_info);
 	ring->tx_info = kvmalloc_node(tmp, GFP_KERNEL, node);
@@ -77,9 +77,11 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
 	en_dbg(DRV, priv, "Allocated tx_info ring at addr:%p size:%d\n",
 		 ring->tx_info, tmp);
 
-	ring->bounce_buf = kmalloc_node(MAX_DESC_SIZE, GFP_KERNEL, node);
+	ring->bounce_buf = kmalloc_node(MLX4_TX_BOUNCE_BUFFER_SIZE,
+					GFP_KERNEL, node);
 	if (!ring->bounce_buf) {
-		ring->bounce_buf = kmalloc(MAX_DESC_SIZE, GFP_KERNEL);
+		ring->bounce_buf = kmalloc(MLX4_TX_BOUNCE_BUFFER_SIZE,
+					   GFP_KERNEL);
 		if (!ring->bounce_buf) {
 			err = -ENOMEM;
 			goto err_info;
@@ -909,7 +911,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* Align descriptor to TXBB size */
 	desc_size = ALIGN(real_size, TXBB_SIZE);
 	nr_txbb = desc_size >> LOG_TXBB_SIZE;
-	if (unlikely(nr_txbb > MAX_DESC_TXBBS)) {
+	if (unlikely(nr_txbb > MLX4_MAX_DESC_TXBBS)) {
 		if (netif_msg_tx_err(priv))
 			en_warn(priv, "Oversized header or SG list\n");
 		goto tx_drop_count;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index e132ff4c82f2d33045f6c9aeecaaa409a41e0b0d..7cc288db2a64f75ffe64882e3c25b90715e68855 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -90,8 +90,8 @@
 #define MLX4_EN_FILTER_EXPIRY_QUOTA 60
 
 /* Typical TSO descriptor with 16 gather entries is 352 bytes... */
-#define MAX_DESC_SIZE		512
-#define MAX_DESC_TXBBS		(MAX_DESC_SIZE / TXBB_SIZE)
+#define MLX4_TX_BOUNCE_BUFFER_SIZE 512
+#define MLX4_MAX_DESC_TXBBS	   (MLX4_TX_BOUNCE_BUFFER_SIZE / TXBB_SIZE)
 
 /*
  * OS related constants and tunables
-- 
2.39.0.rc0.267.gcb52ba06e7-goog



* [PATCH v2 net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS
  2022-12-07 14:12 [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support Eric Dumazet
  2022-12-07 14:12 ` [PATCH v2 net-next 1/3] net/mlx4: rename two constants Eric Dumazet
@ 2022-12-07 14:12 ` Eric Dumazet
  2022-12-07 18:53   ` Tariq Toukan
  2022-12-07 14:12 ` [PATCH v2 net-next 3/3] net/mlx4: small optimization in mlx4_en_xmit() Eric Dumazet
  2022-12-08 22:30 ` [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support patchwork-bot+netdevbpf
  3 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2022-12-07 14:12 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet,
	Eric Dumazet, Wei Wang

The Google production kernel has increased MAX_SKB_FRAGS to 45
for the BIG-TCP rollout.

Unfortunately, the mlx4 TX bounce buffer is not big enough when
an skb has up to 45 page fragments.

This can happen often with TCP TX zero copy, as one frag usually
holds 4096 bytes of payload (order-0 page).

Tested:
 Kernel built with MAX_SKB_FRAGS=45
 ip link set dev eth0 gso_max_size 185000
 netperf -t TCP_SENDFILE

I made sure that "ethtool -G eth0 tx 64" was properly working,
ring->full_size being set to 15.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Wei Wang <weiwan@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 7cc288db2a64f75ffe64882e3c25b90715e68855..3d4226ddba5e6582e9420d853b6535b806219e55 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -89,8 +89,18 @@
 #define MLX4_EN_FILTER_HASH_SHIFT 4
 #define MLX4_EN_FILTER_EXPIRY_QUOTA 60
 
-/* Typical TSO descriptor with 16 gather entries is 352 bytes... */
-#define MLX4_TX_BOUNCE_BUFFER_SIZE 512
+#define CTRL_SIZE	sizeof(struct mlx4_wqe_ctrl_seg)
+#define DS_SIZE		sizeof(struct mlx4_wqe_data_seg)
+
+/* Maximal size of the bounce buffer:
+ * 256 bytes for LSO headers.
+ * CTRL_SIZE for control desc.
+ * DS_SIZE if skb->head contains some payload.
+ * MAX_SKB_FRAGS frags.
+ */
+#define MLX4_TX_BOUNCE_BUFFER_SIZE \
+	ALIGN(256 + CTRL_SIZE + DS_SIZE + MAX_SKB_FRAGS * DS_SIZE, TXBB_SIZE)
+
 #define MLX4_MAX_DESC_TXBBS	   (MLX4_TX_BOUNCE_BUFFER_SIZE / TXBB_SIZE)
 
 /*
@@ -217,9 +227,7 @@ struct mlx4_en_tx_info {
 
 
 #define MLX4_EN_BIT_DESC_OWN	0x80000000
-#define CTRL_SIZE	sizeof(struct mlx4_wqe_ctrl_seg)
 #define MLX4_EN_MEMTYPE_PAD	0x100
-#define DS_SIZE		sizeof(struct mlx4_wqe_data_seg)
 
 
 struct mlx4_en_tx_desc {
-- 
2.39.0.rc0.267.gcb52ba06e7-goog
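
The sizing in this patch can be checked with a few lines of standalone C. This is only a sketch of the arithmetic, not driver code: the values TXBB_SIZE = 64, CTRL_SIZE = 16, DS_SIZE = 16 and HEADROOM = 2048 / TXBB_SIZE + 1 are assumptions that do not appear in this thread, and the helper names (align_up, bounce_buffer_size, max_desc_txbbs, ring_full_size) are hypothetical.

```c
#include <assert.h>

/* Assumed values, not taken from this thread. */
#define TXBB_SIZE 64u                       /* one TX basic block, in bytes */
#define CTRL_SIZE 16u                       /* assumed sizeof(struct mlx4_wqe_ctrl_seg) */
#define DS_SIZE   16u                       /* assumed sizeof(struct mlx4_wqe_data_seg) */
#define HEADROOM  (2048u / TXBB_SIZE + 1u)  /* assumed driver definition: 33 TXBBs */

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned int align_up(unsigned int x, unsigned int a)
{
	return (x + a - 1u) & ~(a - 1u);
}

/* Mirrors the MLX4_TX_BOUNCE_BUFFER_SIZE formula from the patch:
 * 256 bytes of LSO headers, one control segment, one data segment
 * for skb->head, and one data segment per page fragment.
 */
static unsigned int bounce_buffer_size(unsigned int max_skb_frags)
{
	return align_up(256u + CTRL_SIZE + DS_SIZE + max_skb_frags * DS_SIZE,
			TXBB_SIZE);
}

/* Mirrors MLX4_MAX_DESC_TXBBS. */
static unsigned int max_desc_txbbs(unsigned int max_skb_frags)
{
	return bounce_buffer_size(max_skb_frags) / TXBB_SIZE;
}

/* Mirrors: ring->full_size = ring->size - HEADROOM - MLX4_MAX_DESC_TXBBS */
static unsigned int ring_full_size(unsigned int ring_size,
				   unsigned int max_skb_frags)
{
	return ring_size - HEADROOM - max_desc_txbbs(max_skb_frags);
}

/* Self-check for the MAX_SKB_FRAGS=45 case described in the changelog. */
static void sizing_selfcheck(void)
{
	assert(bounce_buffer_size(45) == 1024); /* 1008 rounded up to 64 */
	assert(ring_full_size(64, 45) == 15);   /* 64 - 33 - 16 */
}
```

Under these assumptions, MAX_SKB_FRAGS=45 gives 256 + 16 + 16 + 45 * 16 = 1008 bytes, rounded up to 1024, which is 16 TXBBs. A 64-entry ring then has full_size = 64 - 33 - 16 = 15, consistent with the value reported in the changelog above. The previous hard-coded 512 bytes would not cover 1008.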



* [PATCH v2 net-next 3/3] net/mlx4: small optimization in mlx4_en_xmit()
  2022-12-07 14:12 [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support Eric Dumazet
  2022-12-07 14:12 ` [PATCH v2 net-next 1/3] net/mlx4: rename two constants Eric Dumazet
  2022-12-07 14:12 ` [PATCH v2 net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS Eric Dumazet
@ 2022-12-07 14:12 ` Eric Dumazet
  2022-12-07 18:54   ` Tariq Toukan
  2022-12-08 22:30 ` [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support patchwork-bot+netdevbpf
  3 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2022-12-07 14:12 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet,
	Eric Dumazet, Wei Wang

The test against MLX4_MAX_DESC_TXBBS only matters if the TX
bounce buffer is going to be used, so move it into that path.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Wei Wang <weiwan@google.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 8372aeb392a28cf36a454e1b8a4783bc2b2056eb..c5758637b7bed67021a9f3e9c5283033f68639a3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -911,11 +911,6 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* Align descriptor to TXBB size */
 	desc_size = ALIGN(real_size, TXBB_SIZE);
 	nr_txbb = desc_size >> LOG_TXBB_SIZE;
-	if (unlikely(nr_txbb > MLX4_MAX_DESC_TXBBS)) {
-		if (netif_msg_tx_err(priv))
-			en_warn(priv, "Oversized header or SG list\n");
-		goto tx_drop_count;
-	}
 
 	bf_ok = ring->bf_enabled;
 	if (skb_vlan_tag_present(skb)) {
@@ -943,6 +938,11 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (likely(index + nr_txbb <= ring->size))
 		tx_desc = ring->buf + (index << LOG_TXBB_SIZE);
 	else {
+		if (unlikely(nr_txbb > MLX4_MAX_DESC_TXBBS)) {
+			if (netif_msg_tx_err(priv))
+				en_warn(priv, "Oversized header or SG list\n");
+			goto tx_drop_count;
+		}
 		tx_desc = (struct mlx4_en_tx_desc *) ring->bounce_buf;
 		bounce = true;
 		bf_ok = false;
-- 
2.39.0.rc0.267.gcb52ba06e7-goog
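
The control flow after this patch can be modelled in isolation. The following is a simplified sketch, not the driver's code: the enum and the pick_tx_target() helper are hypothetical names, and the model keeps only the three outcomes visible in the diff (write into the ring, use the bounce buffer, or drop).

```c
#include <assert.h>

/* Hypothetical outcome labels for this sketch. */
enum tx_target {
	TX_IN_RING,  /* descriptor fits before the right edge of the ring */
	TX_BOUNCE,   /* descriptor wraps: build it in the bounce buffer */
	TX_DROP      /* descriptor wraps and is too large for the bounce buffer */
};

/* Decide where a descriptor of nr_txbb TXBBs starting at index goes.
 * The size check against max_desc_txbbs is only reached on the
 * wrap-around path, which is the point of the patch.
 */
static enum tx_target pick_tx_target(unsigned int index,
				     unsigned int nr_txbb,
				     unsigned int ring_size,
				     unsigned int max_desc_txbbs)
{
	if (index + nr_txbb <= ring_size)
		return TX_IN_RING;

	if (nr_txbb > max_desc_txbbs)
		return TX_DROP;

	return TX_BOUNCE;
}

/* A few cases for a 64-TXBB ring with a 16-TXBB bounce buffer. */
static void tx_target_selfcheck(void)
{
	assert(pick_tx_target(56, 8, 64, 16) == TX_IN_RING); /* ends exactly at the edge */
	assert(pick_tx_target(60, 8, 64, 16) == TX_BOUNCE);  /* wraps, fits in bounce buffer */
	assert(pick_tx_target(60, 20, 64, 16) == TX_DROP);   /* wraps, too large */
}
```

In this model the common, non-wrapping case returns after a single comparison, and the "Oversized header or SG list" drop can only happen when the bounce buffer would actually be needed.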



* Re: [PATCH v2 net-next 1/3] net/mlx4: rename two constants
  2022-12-07 14:12 ` [PATCH v2 net-next 1/3] net/mlx4: rename two constants Eric Dumazet
@ 2022-12-07 18:53   ` Tariq Toukan
  0 siblings, 0 replies; 8+ messages in thread
From: Tariq Toukan @ 2022-12-07 18:53 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet



On 12/7/2022 4:12 PM, Eric Dumazet wrote:
> MAX_DESC_SIZE is really the size of the bounce buffer used
> when reaching the right side of the TX ring buffer, so rename it
> to MLX4_TX_BOUNCE_BUFFER_SIZE.
> 
> MAX_DESC_TXBBS gets an MLX4_ prefix.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Tariq Toukan <tariqt@nvidia.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 10 ++++++----
>   drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  4 ++--
>   2 files changed, 8 insertions(+), 6 deletions(-)
> 

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>

Thanks for your series!


* Re: [PATCH v2 net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS
  2022-12-07 14:12 ` [PATCH v2 net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS Eric Dumazet
@ 2022-12-07 18:53   ` Tariq Toukan
  0 siblings, 0 replies; 8+ messages in thread
From: Tariq Toukan @ 2022-12-07 18:53 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet, Wei Wang



On 12/7/2022 4:12 PM, Eric Dumazet wrote:
> The Google production kernel has increased MAX_SKB_FRAGS to 45
> for the BIG-TCP rollout.
> 
> Unfortunately, the mlx4 TX bounce buffer is not big enough when
> an skb has up to 45 page fragments.
> 
> This can happen often with TCP TX zero copy, as one frag usually
> holds 4096 bytes of payload (order-0 page).
> 
> Tested:
>   Kernel built with MAX_SKB_FRAGS=45
>   ip link set dev eth0 gso_max_size 185000
>   netperf -t TCP_SENDFILE
> 
> I made sure that "ethtool -G eth0 tx 64" was properly working,
> ring->full_size being set to 15.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Wei Wang <weiwan@google.com>
> Cc: Tariq Toukan <tariqt@nvidia.com>

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>



* Re: [PATCH v2 net-next 3/3] net/mlx4: small optimization in mlx4_en_xmit()
  2022-12-07 14:12 ` [PATCH v2 net-next 3/3] net/mlx4: small optimization in mlx4_en_xmit() Eric Dumazet
@ 2022-12-07 18:54   ` Tariq Toukan
  0 siblings, 0 replies; 8+ messages in thread
From: Tariq Toukan @ 2022-12-07 18:54 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Tariq Toukan, Leon Romanovsky, netdev, eric.dumazet, Wei Wang



On 12/7/2022 4:12 PM, Eric Dumazet wrote:
> The test against MLX4_MAX_DESC_TXBBS only matters if the TX
> bounce buffer is going to be used, so move it into that path.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Tariq Toukan <tariqt@nvidia.com>
> Cc: Wei Wang <weiwan@google.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_tx.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>



* Re: [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support
  2022-12-07 14:12 [PATCH v2 net-next 0/3] mlx4: better BIG-TCP support Eric Dumazet
                   ` (2 preceding siblings ...)
  2022-12-07 14:12 ` [PATCH v2 net-next 3/3] net/mlx4: small optimization in mlx4_en_xmit() Eric Dumazet
@ 2022-12-08 22:30 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-12-08 22:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuba, pabeni, tariqt, leonro, netdev, eric.dumazet

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed,  7 Dec 2022 14:12:34 +0000 you wrote:
> mlx4 uses a bounce buffer in TX whenever the tx descriptors
> wrap around the right edge of the ring.
> 
> The size of this bounce buffer was hard-coded and can be
> increased if/when needed.
> 
> v2: round up MLX4_TX_BOUNCE_BUFFER_SIZE to a multiple of TXBB_SIZE (Tariq)
> 
> [...]

Here is the summary with links:
  - [v2,net-next,1/3] net/mlx4: rename two constants
    https://git.kernel.org/netdev/net-next/c/35f31ff0c0b6
  - [v2,net-next,2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS
    https://git.kernel.org/netdev/net-next/c/26782aad00cc
  - [v2,net-next,3/3] net/mlx4: small optimization in mlx4_en_xmit()
    https://git.kernel.org/netdev/net-next/c/0e706f7961a4

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



