* [PATCH net-next 0/2] ethtool: add header/data split indication
From: Jakub Kicinski @ 2022-01-27 18:42 UTC
  To: davem
  Cc: netdev, linux-doc, chenhao288, huangguangbin2, idosch, corbet,
	Jakub Kicinski

TCP zero-copy (ZC) Rx requires data to be placed neatly into pages,
separate from the networking headers. Most devices do not support
this, so to make deployment easy this set adds a way for the driver
to report support for the feature through ethtool.

The larger scope of configuring header/data splitting, or DMA
scatter, seems dauntingly broad, so this set focuses specifically
on the question "is this device usable with TCP ZC Rx?".

The aim is to avoid a litany of conditions on HW platforms, features,
and firmware versions in orchestration systems when the drivers can
easily report their scatter-gather (SG) configuration.

Jakub Kicinski (2):
  ethtool: add header/data split indication
  bnxt: report header-data split state

 Documentation/networking/ethtool-netlink.rst      |  8 ++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  3 +++
 include/linux/ethtool.h                           |  2 ++
 include/uapi/linux/ethtool_netlink.h              |  7 +++++++
 net/ethtool/rings.c                               | 15 ++++++++++-----
 5 files changed, 30 insertions(+), 5 deletions(-)

-- 
2.34.1
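
The question the cover letter poses, "is this device usable with TCP ZC
Rx?", could be answered by a consumer roughly as sketched below. This is
only an illustration: the names are hypothetical, the numeric values
mirror the enum that patch 1/2 adds to ethtool_netlink.h, and treating a
missing or unknown value as "not usable" is an assumed conservative
policy, not something the series mandates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror ETHTOOL_TCP_DATA_SPLIT_* from patch 1/2. */
enum {
	SPLIT_STATE_UNKNOWN = 0,
	SPLIT_STATE_DISABLED,
	SPLIT_STATE_ENABLED,
};

/*
 * Hypothetical helper for an orchestration system.
 * attr_present says whether the RINGS_GET reply carried the
 * TCP data split attribute at all; the kernel side in patch 1/2
 * omits it when the driver reports 0 (unknown).
 */
static bool tcp_zc_rx_usable(bool attr_present, uint8_t value)
{
	/* Assumption: only an explicit "enabled" counts as usable. */
	return attr_present && value == SPLIT_STATE_ENABLED;
}
```

With this policy a driver that does not fill in the new field is handled
the same way as one that reports the split as disabled.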



* [PATCH net-next 1/2] ethtool: add header/data split indication
From: Jakub Kicinski @ 2022-01-27 18:42 UTC
  To: davem
  Cc: netdev, linux-doc, chenhao288, huangguangbin2, idosch, corbet,
	Jakub Kicinski

For applications running on a mix of platforms it's useful
to have a clear indication of whether the host's NIC supports
the geometry requirements of TCP zero-copy. TCP zero-copy Rx
requires data to be neatly placed into memory pages.
Most NICs can't do that.

This patch adds GET support only, since the NICs I work with
either always have the feature enabled or enable it whenever
the MTU is set to jumbo. In other words, I don't need SET,
but adding SET should be trivial. (The only note on SET is
that we will likely want the setting to be "sticky" and use
0 / `unknown` to reset it back to the driver default.)
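
One possible reading of the "sticky" SET note is sketched below. None of
this is in the patch, which implements GET only; the struct and function
names are hypothetical and the numeric values simply mirror the enum the
patch introduces.

```c
#include <stdint.h>

/* Values mirror ETHTOOL_TCP_DATA_SPLIT_* from this patch. */
enum {
	SPLIT_CFG_UNKNOWN = 0,
	SPLIT_CFG_DISABLED,
	SPLIT_CFG_ENABLED,
};

/* Hypothetical per-device state, not part of the patch. */
struct split_cfg {
	uint8_t user_setting;	/* last value requested via SET, 0 if none */
	uint8_t driver_default;	/* what the driver would pick on its own */
};

/* Record a user request; 0 / unknown clears the sticky setting. */
static int split_cfg_set(struct split_cfg *cfg, uint8_t requested)
{
	if (requested > SPLIT_CFG_ENABLED)
		return -1;	/* out of range, reject */
	cfg->user_setting = requested;
	return 0;
}

/* A sticky user setting wins over the driver default until it is reset. */
static uint8_t split_cfg_effective(const struct split_cfg *cfg)
{
	if (cfg->user_setting != SPLIT_CFG_UNKNOWN)
		return cfg->user_setting;
	return cfg->driver_default;
}
```

In this sketch the user's choice survives later changes to the driver
default (for example after an MTU change) and only a SET of 0 hands
control back to the driver.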

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/ethtool-netlink.rst |  8 ++++++++
 include/linux/ethtool.h                      |  2 ++
 include/uapi/linux/ethtool_netlink.h         |  7 +++++++
 net/ethtool/rings.c                          | 15 ++++++++++-----
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 9d98e0511249..cae28af7a476 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -860,8 +860,16 @@ Gets ring sizes like ``ETHTOOL_GRINGPARAM`` ioctl request.
   ``ETHTOOL_A_RINGS_RX_JUMBO``          u32     size of RX jumbo ring
   ``ETHTOOL_A_RINGS_TX``                u32     size of TX ring
   ``ETHTOOL_A_RINGS_RX_BUF_LEN``        u32     size of buffers on the ring
+  ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT``    u8      TCP header / data split
   ====================================  ======  ===========================
 
+``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
+page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
+If enabled the device is configured to place frame headers and data into
+separate buffers. The device configuration must make it possible to receive
+full memory pages of data, for example because MTU is high enough or through
+HW-GRO.
+
 
 RINGS_SET
 =========
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 11efc45de66a..e0853f48b75e 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -70,9 +70,11 @@ enum {
 /**
  * struct kernel_ethtool_ringparam - RX/TX ring configuration
  * @rx_buf_len: Current length of buffers on the rx ring.
+ * @tcp_data_split: Scatter packet headers and data to separate buffers
  */
 struct kernel_ethtool_ringparam {
 	u32	rx_buf_len;
+	u8	tcp_data_split;
 };
 
 /**
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index cca6e474a085..417d4280d7b5 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -318,6 +318,12 @@ enum {
 
 /* RINGS */
 
+enum {
+	ETHTOOL_TCP_DATA_SPLIT_UNKNOWN = 0,
+	ETHTOOL_TCP_DATA_SPLIT_DISABLED,
+	ETHTOOL_TCP_DATA_SPLIT_ENABLED,
+};
+
 enum {
 	ETHTOOL_A_RINGS_UNSPEC,
 	ETHTOOL_A_RINGS_HEADER,				/* nest - _A_HEADER_* */
@@ -330,6 +336,7 @@ enum {
 	ETHTOOL_A_RINGS_RX_JUMBO,			/* u32 */
 	ETHTOOL_A_RINGS_TX,				/* u32 */
 	ETHTOOL_A_RINGS_RX_BUF_LEN,                     /* u32 */
+	ETHTOOL_A_RINGS_TCP_DATA_SPLIT,			/* u8 */
 
 	/* add new constants above here */
 	__ETHTOOL_A_RINGS_CNT,
diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
index c1d5f5e0fdc9..18a5035d3bee 100644
--- a/net/ethtool/rings.c
+++ b/net/ethtool/rings.c
@@ -53,7 +53,8 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
 	       nla_total_size(sizeof(u32)) +	/* _RINGS_RX_MINI */
 	       nla_total_size(sizeof(u32)) +	/* _RINGS_RX_JUMBO */
 	       nla_total_size(sizeof(u32)) +	/* _RINGS_TX */
-	       nla_total_size(sizeof(u32));     /* _RINGS_RX_BUF_LEN */
+	       nla_total_size(sizeof(u32)) +	/* _RINGS_RX_BUF_LEN */
+	       nla_total_size(sizeof(u8));	/* _RINGS_TCP_DATA_SPLIT */
 }
 
 static int rings_fill_reply(struct sk_buff *skb,
@@ -61,9 +62,11 @@ static int rings_fill_reply(struct sk_buff *skb,
 			    const struct ethnl_reply_data *reply_base)
 {
 	const struct rings_reply_data *data = RINGS_REPDATA(reply_base);
-	const struct kernel_ethtool_ringparam *kernel_ringparam = &data->kernel_ringparam;
+	const struct kernel_ethtool_ringparam *kr = &data->kernel_ringparam;
 	const struct ethtool_ringparam *ringparam = &data->ringparam;
 
+	WARN_ON(kr->tcp_data_split > ETHTOOL_TCP_DATA_SPLIT_ENABLED);
+
 	if ((ringparam->rx_max_pending &&
 	     (nla_put_u32(skb, ETHTOOL_A_RINGS_RX_MAX,
 			  ringparam->rx_max_pending) ||
@@ -84,9 +87,11 @@ static int rings_fill_reply(struct sk_buff *skb,
 			  ringparam->tx_max_pending) ||
 	      nla_put_u32(skb, ETHTOOL_A_RINGS_TX,
 			  ringparam->tx_pending)))  ||
-	    (kernel_ringparam->rx_buf_len &&
-	     (nla_put_u32(skb, ETHTOOL_A_RINGS_RX_BUF_LEN,
-			  kernel_ringparam->rx_buf_len))))
+	    (kr->rx_buf_len &&
+	     (nla_put_u32(skb, ETHTOOL_A_RINGS_RX_BUF_LEN, kr->rx_buf_len))) ||
+	    (kr->tcp_data_split &&
+	     (nla_put_u8(skb, ETHTOOL_A_RINGS_TCP_DATA_SPLIT,
+			 kr->tcp_data_split))))
 		return -EMSGSIZE;
 
 	return 0;
-- 
2.34.1
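
On the receiving side, the reply built by rings_fill_reply() above
carries the new u8 only when the driver reported a non-zero state, so a
reader has to treat a missing attribute as "unknown". The sketch below
walks a flat run of netlink attributes by hand to show that. It is an
illustration under assumptions: the names are hypothetical, the buffer
is assumed to start at the first attribute (after the netlink and
generic netlink headers), and the attribute type is passed in by the
caller, who would take ETHTOOL_A_RINGS_TCP_DATA_SPLIT from the installed
uapi header. Real code would more likely use a netlink library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Netlink attribute header: length includes these 4 bytes. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

#define ATTR_HDRLEN	4u
#define ATTR_ALIGN(n)	(((n) + 3u) & ~3u)	/* payload is padded to 4 bytes */
#define ATTR_TYPE_MASK	0x3fffu			/* strip nested / byte-order flags */

/*
 * Return the u8 payload of the first attribute of wanted_type,
 * or 0 (unknown) if the attribute is absent or malformed.
 */
static uint8_t find_u8_attr(const unsigned char *buf, size_t buflen,
			    uint16_t wanted_type)
{
	size_t off = 0;

	while (off <= buflen && buflen - off >= ATTR_HDRLEN) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < ATTR_HDRLEN || h.len > buflen - off)
			break;	/* truncated or corrupt attribute */
		if ((h.type & ATTR_TYPE_MASK) == wanted_type &&
		    h.len >= ATTR_HDRLEN + 1)
			return buf[off + ATTR_HDRLEN];
		off += ATTR_ALIGN(h.len);
	}
	return 0;
}
```

Returning 0 for "not found" lines up with ETHTOOL_TCP_DATA_SPLIT_UNKNOWN
being 0 in the enum, so absence and unknown collapse into one case.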



* [PATCH net-next 2/2] bnxt: report header-data split state
From: Jakub Kicinski @ 2022-01-27 18:43 UTC
  To: davem
  Cc: netdev, linux-doc, chenhao288, huangguangbin2, idosch, corbet,
	Jakub Kicinski, Michael Chan

Aggregation rings imply header-data split.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
--
CC: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 003330e8cd58..5edbee92f5c4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -11,6 +11,7 @@
 #include <linux/ctype.h>
 #include <linux/stringify.h>
 #include <linux/ethtool.h>
+#include <linux/ethtool_netlink.h>
 #include <linux/linkmode.h>
 #include <linux/interrupt.h>
 #include <linux/pci.h>
@@ -802,9 +803,11 @@ static void bnxt_get_ringparam(struct net_device *dev,
 	if (bp->flags & BNXT_FLAG_AGG_RINGS) {
 		ering->rx_max_pending = BNXT_MAX_RX_DESC_CNT_JUM_ENA;
 		ering->rx_jumbo_max_pending = BNXT_MAX_RX_JUM_DESC_CNT;
+		kernel_ering->tcp_data_split = ETHTOOL_TCP_DATA_SPLIT_ENABLED;
 	} else {
 		ering->rx_max_pending = BNXT_MAX_RX_DESC_CNT;
 		ering->rx_jumbo_max_pending = 0;
+		kernel_ering->tcp_data_split = ETHTOOL_TCP_DATA_SPLIT_DISABLED;
 	}
 	ering->tx_max_pending = BNXT_MAX_TX_DESC_CNT;
 
-- 
2.34.1



* Re: [PATCH net-next 0/2] ethtool: add header/data split indication
From: patchwork-bot+netdevbpf @ 2022-01-28 15:10 UTC
  To: Jakub Kicinski
  Cc: davem, netdev, linux-doc, chenhao288, huangguangbin2, idosch, corbet

Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 27 Jan 2022 10:42:58 -0800 you wrote:
> TCP ZC Rx requires data to be placed neatly into pages, separate
> from the networking headers. This is not supported by most devices
> so to make deployment easy this set adds a way for the driver to
> report support for this feature thru ethtool.
> 
> The larger scope of configuring splitting headers and data, or DMA
> scatter seems dauntingly broad, so this set focuses specifically
> on the question "is this device usable with TCP ZC Rx?".
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] ethtool: add header/data split indication
    https://git.kernel.org/netdev/net-next/c/9690ae604290
  - [net-next,2/2] bnxt: report header-data split state
    https://git.kernel.org/netdev/net-next/c/b370517e5233

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


