All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable
@ 2015-10-05 23:39 Tom Herbert
  2015-10-05 23:39 ` [PATCH RFC 1/3] net: Add skb_inner_transport_offset function Tom Herbert
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Tom Herbert @ 2015-10-05 23:39 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, dwmw2, ogerlitz

When drivers have support for offloading transport IP checksums, they
will indicate this in the features for the device. As described in
skbuff.h, a driver will usually advertise NETIF_F_HW_CSUM,
NETIF_F_IP_CSUM, and/or NETIF_F_IPV6_CSUM. The first of these
(NETIF_F_HW_CSUM) is the preferred method since this implies that the
device has implemented a generic checksum offload feature that should
work under arbitrary scenarios (e.g. for different protocols, with or
without encapsulation).

For narrow features support (NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM
for offload of TCP/UDP), the features flags may not be sufficient to
deduce whether a packet may be offloaded. Some devices will not be able
to offload encapsulated checksums, some cannot offload transport
checksums in packet with IPv6 extension headers, etc. In these cases
a driver will need to perform additional packet inspection to determine
if a packet's checksum can be offloaded to a device.

This patch defines a helper function that drivers can call to check if
it is able to offload the checksum for a particular packet. In an
argument to the function, the driver specifies what type of packets it
is able to offload to a device. The function is intended to check for
the most common restrictions of devices (like by IP version, transport
protocol, encapsulation, extension headers). Since the function includes
checks for IP version and transport protocol, the driver is able
to advertise NETIF_F_HW_CSUM instead of protocol specific support.
This should put us on a path to deprecate NETIF_F_IP_CSUM and
NETIF_F_IPV6_CSUM.

The helper function may be called from ndo_features_check or the xmit
routine of the driver. The csum_help argument does not need to be set
when the function is called from ndo_features_check.

Note that the helper function is intended to verify checksum
offloadability in the non-LSO case. Checksum offload as part of LSO
should be validated as part of GSO offload checks (through NETIF_F_GSO_*
flags and work done in ndo_features_check).

In this patch set:
- Added skb_csum_offload_chk function
- Call skb_csum_offload_chk from mlx4

Testing:

Ran 200 netperf TCP_RR to see impact of calling skb_csum_offload_chk
for every packet. Did not observe any noticeable regression. Perf shows
that the function was 0.20% utilization which puts it pretty far down
the list.

Tom Herbert (3):
  net: Add skb_inner_transport_offset function
  net: Add driver helper function to determine checksum offloadability
  mlx4: Call skb_csum_offload_check to check offloadability

 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |   6 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     |  20 +++-
 include/linux/netdevice.h                      |  47 +++++++++
 include/linux/skbuff.h                         |   5 +
 net/core/dev.c                                 | 129 +++++++++++++++++++++++++
 5 files changed, 199 insertions(+), 8 deletions(-)

-- 
2.4.6

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC 1/3] net: Add skb_inner_transport_offset function
  2015-10-05 23:39 [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable Tom Herbert
@ 2015-10-05 23:39 ` Tom Herbert
  2015-10-05 23:39 ` [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability Tom Herbert
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Tom Herbert @ 2015-10-05 23:39 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, dwmw2, ogerlitz

Same thing as skb_transport_offset but returns the offset of the inner
transport header (when skb->encpasulation is set).

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/linux/skbuff.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4398411..ad1a24c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1933,6 +1933,11 @@ static inline unsigned char *skb_inner_transport_header(const struct sk_buff
 	return skb->head + skb->inner_transport_header;
 }
 
+static inline int skb_inner_transport_offset(const struct sk_buff *skb)
+{
+	return skb_inner_transport_header(skb) - skb->data;
+}
+
 static inline void skb_reset_inner_transport_header(struct sk_buff *skb)
 {
 	skb->inner_transport_header = skb->data - skb->head;
-- 
2.4.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability
  2015-10-05 23:39 [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable Tom Herbert
  2015-10-05 23:39 ` [PATCH RFC 1/3] net: Add skb_inner_transport_offset function Tom Herbert
@ 2015-10-05 23:39 ` Tom Herbert
  2015-10-06  3:52   ` Alexander Duyck
  2015-10-05 23:39 ` [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability Tom Herbert
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2015-10-05 23:39 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, dwmw2, ogerlitz

Add skb_csum_offload_chk driver helper function to determine if a
device with limited checksum offload capabilities is able to offload the
checksum for a given packet.

This patch includes:
  - The skb_csum_offload_chk function. Returns true if checksum is
    offloadable, else false. Optionally, in the case that the checksum
    is not offloable, the function can call skb_checksum_help to resolve
    the checksum.
  - Definition of skb_csum_offl_spec structure that caller uses to
    indicate rules about what it can offload (e.g. IPv4/v6, TCP/UDP only,
    whether encapsulated checksums can be offloaded, whether checksum with
    IPv6 extension headers can be offloaded).
  - Definition of skb_csum_offl_params which is an output parameter of
    skb_csum_offload_check. This contains values discovered in the
    function's evaluation that may be useful to the caller for setting
    up checksum offload (e.g. offset of checksum start, whether the
    checksum is encpasulated).

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/linux/netdevice.h |  47 +++++++++++++++++
 net/core/dev.c            | 129 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 176 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b945078..ac65d9b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2454,6 +2454,40 @@ static inline void skb_gro_remcsum_cleanup(struct sk_buff *skb,
 	remcsum_unadjust((__sum16 *)ptr, grc->delta);
 }
 
+struct skb_csum_offl_spec {
+	__u16		ipv4_okay:1,
+			ipv6_okay:1,
+			encap_okay:1,
+			ip_options_okay:1,
+			ext_hdrs_okay:1,
+			tcp_okay:1,
+			udp_okay:1,
+			sctp_okay:1;
+};
+
+struct skb_csum_offl_params {
+	__u16		encapped_csum:1;
+	int protocol;
+	int ip_proto;
+	int network_hdr_offset;
+};
+
+bool __skb_csum_offload_chk(struct sk_buff *skb,
+			    const struct skb_csum_offl_spec *spec,
+			    struct skb_csum_offl_params *params,
+			    bool csum_help);
+
+static inline bool skb_csum_offload_chk(struct sk_buff *skb,
+					const struct skb_csum_offl_spec *spec,
+					struct skb_csum_offl_params *params,
+					bool csum_help)
+{
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		return false;
+
+	return __skb_csum_offload_chk(skb, spec, params, csum_help);
+}
+
 static inline int dev_hard_header(struct sk_buff *skb, struct net_device *dev,
 				  unsigned short type,
 				  const void *daddr, const void *saddr,
@@ -3632,6 +3666,19 @@ static inline bool can_checksum_protocol(netdev_features_t features,
 		 protocol == htons(ETH_P_FCOE)));
 }
 
+/* Map an ethertype into IP protocol if possible */
+static inline int eproto_to_ipproto(int eproto)
+{
+	switch (eproto) {
+	case htons(ETH_P_IP):
+		return IPPROTO_IP;
+	case htons(ETH_P_IPV6):
+		return IPPROTO_IPV6;
+	default:
+		return -1;
+	}
+}
+
 #ifdef CONFIG_BUG
 void netdev_rx_csum_fault(struct net_device *dev);
 #else
diff --git a/net/core/dev.c b/net/core/dev.c
index a229bf0..e7365ca 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -136,6 +136,7 @@
 #include <linux/errqueue.h>
 #include <linux/hrtimer.h>
 #include <linux/netfilter_ingress.h>
+#include <linux/sctp.h>
 
 #include "net-sysfs.h"
 
@@ -2440,6 +2441,134 @@ out:
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
+/* skb_csum_offload_check - Driver helper function to determine if a device
+ * with limited checksum offload capabilities is able to offload the checksum
+ * for a given packet.
+ *
+ * Arguments:
+ *   skb - sk_buff for the packet in question
+ *   spec - contains the description of what device can offload
+ *   params - contains result of parsing the packet that may be of
+ *	      use to the driver for setting up checksum offload
+ *   checksum_help - when set indicates that helper function should
+ *            call skb_checksum_help if offload checks fail
+ *
+ * Returns:
+ *   true: Packet has passed the checksum checks and should be offloadable to
+ *         the device (a driver may still need to check for additional
+ *         restrictions of its device)
+ *   false: Checksum is not offloadable. If checksum_help was set then
+ *         skb_checksum_help was called to resolve checksum for non-GSO
+ *         packets and when IP protocol is not SCTP
+ */
+bool __skb_csum_offload_chk(struct sk_buff *skb,
+			    const struct skb_csum_offl_spec *spec,
+			    struct skb_csum_offl_params *params,
+			    bool csum_help)
+{
+	struct iphdr *iph;
+	struct ipv6hdr *ipv6;
+	void *nhdr;
+	int protocol = -1;
+	u8 ip_proto;
+
+	memset(params, 0, sizeof(*params));
+
+	/* We are assuming that checksum start must coincide with the inner or
+	 * outer transport header. This is true for TCP, UDP, and SCTP-- but if
+	 * support is ever added for GRE checksum offload this would not hold.
+	 */
+	if (skb_checksum_start_offset(skb) == skb_transport_offset(skb)) {
+		/* Non-encapsulated checksum */
+		protocol = eproto_to_ipproto(skb->protocol);
+		params->protocol = protocol;
+		params->network_hdr_offset = skb_network_offset(skb);
+		nhdr = skb_network_header(skb);
+	} else if (skb->encapsulation && spec->encap_okay &&
+		   skb_checksum_start_offset(skb) ==
+		   skb_inner_transport_offset(skb)) {
+		/* Encapsulated checksum */
+		params->encapped_csum = 1;
+		switch (skb->inner_protocol_type) {
+		case ENCAP_TYPE_ETHER:
+			protocol = eproto_to_ipproto(skb->inner_protocol);
+			break;
+		case ENCAP_TYPE_IPPROTO:
+			protocol = skb->inner_protocol;
+			break;
+		}
+		params->protocol = protocol;
+		params->network_hdr_offset = skb_inner_network_offset(skb);
+		nhdr = skb_inner_network_header(skb);
+	} else {
+		goto need_help;
+	}
+
+	switch (protocol) {
+	case IPPROTO_IP:
+		if (!spec->ipv4_okay)
+			goto need_help;
+		iph = nhdr;
+		ip_proto = iph->protocol;
+		if (iph->ihl != 5 && !spec->ip_options_okay)
+			goto need_help;
+		break;
+	case IPPROTO_IPV6:
+		if (!spec->ipv6_okay)
+			goto need_help;
+		ipv6 = nhdr;
+		nhdr += sizeof(*ipv6);
+		ip_proto = ipv6->nexthdr;
+		break;
+	default:
+		goto need_help;
+	}
+
+ip_proto_again:
+	switch (ip_proto) {
+	case IPPROTO_TCP:
+		if (!spec->tcp_okay ||
+		    skb->csum_offset != offsetof(struct tcphdr, check))
+			goto need_help;
+		break;
+	case IPPROTO_UDP:
+		if (!spec->udp_okay ||
+		    skb->csum_offset != offsetof(struct udphdr, check))
+			goto need_help;
+		break;
+	case IPPROTO_SCTP:
+		if (!spec->sctp_okay ||
+		    skb->csum_offset != offsetof(struct sctphdr, checksum))
+			goto cant_help;
+		break;
+	case NEXTHDR_HOP:
+	case NEXTHDR_ROUTING:
+	case NEXTHDR_DEST: {
+		u8 *opthdr = nhdr;
+
+		if (protocol != IPPROTO_IPV6 || !spec->ext_hdrs_okay)
+			goto need_help;
+
+		ip_proto = opthdr[0];
+		nhdr += (opthdr[1] + 1) << 3;
+
+		goto ip_proto_again;
+	}
+	default:
+		goto need_help;
+	}
+
+	/* Passed the tests */
+	return true;
+
+need_help:
+	if (csum_help && !skb_shinfo(skb)->gso_size)
+		skb_checksum_help(skb);
+cant_help:
+	return false;
+}
+EXPORT_SYMBOL(__skb_csum_offload_chk);
+
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
 	__be16 type = skb->protocol;
-- 
2.4.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-05 23:39 [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable Tom Herbert
  2015-10-05 23:39 ` [PATCH RFC 1/3] net: Add skb_inner_transport_offset function Tom Herbert
  2015-10-05 23:39 ` [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability Tom Herbert
@ 2015-10-05 23:39 ` Tom Herbert
  2015-10-06  4:03   ` Alexander Duyck
  2015-10-07 15:41   ` Or Gerlitz
  2015-10-06 10:51 ` [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable David Woodhouse
  2015-10-08 15:09 ` David Woodhouse
  4 siblings, 2 replies; 15+ messages in thread
From: Tom Herbert @ 2015-10-05 23:39 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, dwmw2, ogerlitz

This provides an example of a driver calling the skb_csum_offload_check.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  6 +++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 20 +++++++++++++++-----
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 4726122..f2ed8d0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2360,7 +2360,7 @@ out:
 	}
 
 	/* set offloads */
-	priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
+	priv->dev->hw_enc_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
 				      NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
 	priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
 	priv->dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
@@ -2372,7 +2372,7 @@ static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
 	struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
 						 vxlan_del_task);
 	/* unset offloads */
-	priv->dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
+	priv->dev->hw_enc_features &= ~(NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
 				      NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL);
 	priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
 	priv->dev->features    &= ~NETIF_F_GSO_UDP_TUNNEL;
@@ -2943,7 +2943,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	/*
 	 * Set driver features
 	 */
-	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
+	dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM;
 	if (mdev->LSO_support)
 		dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 494e776..f364ffd 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -702,6 +702,14 @@ static void mlx4_bf_copy(void __iomem *dst, const void *src,
 	__iowrite64_copy(dst, src, bytecnt / 8);
 }
 
+static const struct skb_csum_offl_spec csum_offl_spec = {
+	.ipv4_okay = 1,
+	.ipv6_okay = 1,
+	.encap_okay = 1,
+	.tcp_okay = 1,
+	.udp_okay = 1,
+};
+
 netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
@@ -727,6 +735,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	bool stop_queue;
 	bool inline_ok;
 	u32 ring_cons;
+	struct skb_csum_offl_params offl_params;
 
 	if (!priv->port_up)
 		goto tx_drop;
@@ -853,8 +862,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* Prepare ctrl segement apart opcode+ownership, which depends on
 	 * whether LSO is used */
 	tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
-	if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
-		if (!skb->encapsulation)
+	if (skb_csum_offload_chk(skb, &csum_offl_spec,
+				 &offl_params, true)) {
+		if (!offl_params.encapped_csum)
 			tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
 								 MLX4_WQE_CTRL_TCP_UDP_CSUM);
 		else
@@ -912,9 +922,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		build_inline_wqe(tx_desc, skb, shinfo, real_size, &vlan_tag,
 				 tx_ind, fragptr);
 
-	if (skb->encapsulation) {
-		struct iphdr *ipv4 = (struct iphdr *)skb_inner_network_header(skb);
-		if (ipv4->protocol == IPPROTO_TCP || ipv4->protocol == IPPROTO_UDP)
+	if (!offl_params.encapped_csum) {
+		if (offl_params.ip_proto == IPPROTO_TCP ||
+		    offl_params.ip_proto == IPPROTO_UDP)
 			op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP | MLX4_WQE_CTRL_ILP);
 		else
 			op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP);
-- 
2.4.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability
  2015-10-05 23:39 ` [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability Tom Herbert
@ 2015-10-06  3:52   ` Alexander Duyck
  2015-10-06 16:31     ` Tom Herbert
  0 siblings, 1 reply; 15+ messages in thread
From: Alexander Duyck @ 2015-10-06  3:52 UTC (permalink / raw)
  To: Tom Herbert, davem, netdev; +Cc: kernel-team, dwmw2, ogerlitz

On 10/05/2015 04:39 PM, Tom Herbert wrote:
> Add skb_csum_offload_chk driver helper function to determine if a
> device with limited checksum offload capabilities is able to offload the
> checksum for a given packet.
>
> This patch includes:
>    - The skb_csum_offload_chk function. Returns true if checksum is
>      offloadable, else false. Optionally, in the case that the checksum
>      is not offloable, the function can call skb_checksum_help to resolve
>      the checksum.
>    - Definition of skb_csum_offl_spec structure that caller uses to
>      indicate rules about what it can offload (e.g. IPv4/v6, TCP/UDP only,
>      whether encapsulated checksums can be offloaded, whether checksum with
>      IPv6 extension headers can be offloaded).
>    - Definition of skb_csum_offl_params which is an output parameter of
>      skb_csum_offload_check. This contains values discovered in the
>      function's evaluation that may be useful to the caller for setting
>      up checksum offload (e.g. offset of checksum start, whether the
>      checksum is encpasulated).
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
>   include/linux/netdevice.h |  47 +++++++++++++++++
>   net/core/dev.c            | 129 ++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 176 insertions(+)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index b945078..ac65d9b 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2454,6 +2454,40 @@ static inline void skb_gro_remcsum_cleanup(struct sk_buff *skb,
>   	remcsum_unadjust((__sum16 *)ptr, grc->delta);
>   }
>   
> +struct skb_csum_offl_spec {
> +	__u16		ipv4_okay:1,
> +			ipv6_okay:1,
> +			encap_okay:1,
> +			ip_options_okay:1,
> +			ext_hdrs_okay:1,
> +			tcp_okay:1,
> +			udp_okay:1,
> +			sctp_okay:1;
> +};

I think the cases where things are 0 will be much easier to support with 
this API than the cases where they are 1.  For example I don't think we 
could use this API for something like fm10k where there are limits on 
the tunnels that can be supported, so in that case we would have to 
parse the tunnel out in the driver anyway.

> +struct skb_csum_offl_params {
> +	__u16		encapped_csum:1;
> +	int protocol;
> +	int ip_proto;
> +	int network_hdr_offset;
> +};
> +
> +bool __skb_csum_offload_chk(struct sk_buff *skb,
> +			    const struct skb_csum_offl_spec *spec,
> +			    struct skb_csum_offl_params *params,
> +			    bool csum_help);
> +
> +static inline bool skb_csum_offload_chk(struct sk_buff *skb,
> +					const struct skb_csum_offl_spec *spec,
> +					struct skb_csum_offl_params *params,
> +					bool csum_help)
> +{
> +	if (skb->ip_summed != CHECKSUM_PARTIAL)
> +		return false;
> +
> +	return __skb_csum_offload_chk(skb, spec, params, csum_help);
> +}
> +
>   static inline int dev_hard_header(struct sk_buff *skb, struct net_device *dev,
>   				  unsigned short type,
>   				  const void *daddr, const void *saddr,
> @@ -3632,6 +3666,19 @@ static inline bool can_checksum_protocol(netdev_features_t features,
>   		 protocol == htons(ETH_P_FCOE)));
>   }
>   
> +/* Map an ethertype into IP protocol if possible */
> +static inline int eproto_to_ipproto(int eproto)
> +{
> +	switch (eproto) {
> +	case htons(ETH_P_IP):
> +		return IPPROTO_IP;
> +	case htons(ETH_P_IPV6):
> +		return IPPROTO_IPV6;
> +	default:
> +		return -1;
> +	}
> +}
> +
>   #ifdef CONFIG_BUG
>   void netdev_rx_csum_fault(struct net_device *dev);
>   #else
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a229bf0..e7365ca 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -136,6 +136,7 @@
>   #include <linux/errqueue.h>
>   #include <linux/hrtimer.h>
>   #include <linux/netfilter_ingress.h>
> +#include <linux/sctp.h>
>   
>   #include "net-sysfs.h"
>   
> @@ -2440,6 +2441,134 @@ out:
>   }
>   EXPORT_SYMBOL(skb_checksum_help);
>   
> +/* skb_csum_offload_check - Driver helper function to determine if a device
> + * with limited checksum offload capabilities is able to offload the checksum
> + * for a given packet.
> + *
> + * Arguments:
> + *   skb - sk_buff for the packet in question
> + *   spec - contains the description of what device can offload
> + *   params - contains result of parsing the packet that may be of
> + *	      use to the driver for setting up checksum offload
> + *   checksum_help - when set indicates that helper function should
> + *            call skb_checksum_help if offload checks fail
> + *
> + * Returns:
> + *   true: Packet has passed the checksum checks and should be offloadable to
> + *         the device (a driver may still need to check for additional
> + *         restrictions of its device)
> + *   false: Checksum is not offloadable. If checksum_help was set then
> + *         skb_checksum_help was called to resolve checksum for non-GSO
> + *         packets and when IP protocol is not SCTP
> + */
> +bool __skb_csum_offload_chk(struct sk_buff *skb,
> +			    const struct skb_csum_offl_spec *spec,
> +			    struct skb_csum_offl_params *params,
> +			    bool csum_help)
> +{
> +	struct iphdr *iph;
> +	struct ipv6hdr *ipv6;
> +	void *nhdr;
> +	int protocol = -1;
> +	u8 ip_proto;
> +
> +	memset(params, 0, sizeof(*params));
> +
> +	/* We are assuming that checksum start must coincide with the inner or
> +	 * outer transport header. This is true for TCP, UDP, and SCTP-- but if
> +	 * support is ever added for GRE checksum offload this would not hold.
> +	 */
> +	if (skb_checksum_start_offset(skb) == skb_transport_offset(skb)) {
> +		/* Non-encapsulated checksum */
> +		protocol = eproto_to_ipproto(skb->protocol);

This bit is overlooking VLANs.  You should probably be using 
vlan_get_protocol(skb) instead of skb->protocol.

> +		params->protocol = protocol;
> +		params->network_hdr_offset = skb_network_offset(skb);
> +		nhdr = skb_network_header(skb);
> +	} else if (skb->encapsulation && spec->encap_okay &&
> +		   skb_checksum_start_offset(skb) ==
> +		   skb_inner_transport_offset(skb)) {
> +		/* Encapsulated checksum */
> +		params->encapped_csum = 1;
> +		switch (skb->inner_protocol_type) {
> +		case ENCAP_TYPE_ETHER:
> +			protocol = eproto_to_ipproto(skb->inner_protocol);

Probably something similar here, but I don't know if you would be even 
asking for offloads on a VLAN packet being passed inside of a tunnel.

> +			break;
> +		case ENCAP_TYPE_IPPROTO:
> +			protocol = skb->inner_protocol;
> +			break;
> +		}
> +		params->protocol = protocol;
> +		params->network_hdr_offset = skb_inner_network_offset(skb);
> +		nhdr = skb_inner_network_header(skb);
> +	} else {
> +		goto need_help;
> +	}
> +
> +	switch (protocol) {
> +	case IPPROTO_IP:
> +		if (!spec->ipv4_okay)
> +			goto need_help;
> +		iph = nhdr;
> +		ip_proto = iph->protocol;
> +		if (iph->ihl != 5 && !spec->ip_options_okay)
> +			goto need_help;
> +		break;
> +	case IPPROTO_IPV6:
> +		if (!spec->ipv6_okay)
> +			goto need_help;
> +		ipv6 = nhdr;
> +		nhdr += sizeof(*ipv6);
> +		ip_proto = ipv6->nexthdr;
> +		break;
> +	default:
> +		goto need_help;
> +	}
> +
> +ip_proto_again:
> +	switch (ip_proto) {
> +	case IPPROTO_TCP:
> +		if (!spec->tcp_okay ||
> +		    skb->csum_offset != offsetof(struct tcphdr, check))
> +			goto need_help;
> +		break;
> +	case IPPROTO_UDP:
> +		if (!spec->udp_okay ||
> +		    skb->csum_offset != offsetof(struct udphdr, check))
> +			goto need_help;
> +		break;
> +	case IPPROTO_SCTP:
> +		if (!spec->sctp_okay ||
> +		    skb->csum_offset != offsetof(struct sctphdr, checksum))
> +			goto cant_help;
> +		break;
> +	case NEXTHDR_HOP:
> +	case NEXTHDR_ROUTING:
> +	case NEXTHDR_DEST: {
> +		u8 *opthdr = nhdr;
> +
> +		if (protocol != IPPROTO_IPV6 || !spec->ext_hdrs_okay)
> +			goto need_help;
> +
> +		ip_proto = opthdr[0];
> +		nhdr += (opthdr[1] + 1) << 3;
> +
> +		goto ip_proto_again;
> +	}
> +	default:
> +		goto need_help;
> +	}
> +
> +	/* Passed the tests */
> +	return true;
> +
> +need_help:
> +	if (csum_help && !skb_shinfo(skb)->gso_size)
> +		skb_checksum_help(skb);
> +cant_help:
> +	return false;
> +}
> +EXPORT_SYMBOL(__skb_csum_offload_chk);
> +
>   __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
>   {
>   	__be16 type = skb->protocol;

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-05 23:39 ` [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability Tom Herbert
@ 2015-10-06  4:03   ` Alexander Duyck
  2015-10-06 16:22     ` Tom Herbert
  2015-10-07 15:41   ` Or Gerlitz
  1 sibling, 1 reply; 15+ messages in thread
From: Alexander Duyck @ 2015-10-06  4:03 UTC (permalink / raw)
  To: Tom Herbert, davem, netdev; +Cc: kernel-team, dwmw2, ogerlitz

On 10/05/2015 04:39 PM, Tom Herbert wrote:
> This provides an example of a driver calling the skb_csum_offload_check.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  6 +++---
>   drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 20 +++++++++++++++-----
>   2 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index 4726122..f2ed8d0 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -2360,7 +2360,7 @@ out:
>   	}
>   
>   	/* set offloads */
> -	priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> +	priv->dev->hw_enc_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
>   				      NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
>   	priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
>   	priv->dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
> @@ -2372,7 +2372,7 @@ static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
>   	struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
>   						 vxlan_del_task);
>   	/* unset offloads */
> -	priv->dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> +	priv->dev->hw_enc_features &= ~(NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
>   				      NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL);
>   	priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
>   	priv->dev->features    &= ~NETIF_F_GSO_UDP_TUNNEL;
> @@ -2943,7 +2943,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
>   	/*
>   	 * Set driver features
>   	 */
> -	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
> +	dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM;
>   	if (mdev->LSO_support)
>   		dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
>   
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index 494e776..f364ffd 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -702,6 +702,14 @@ static void mlx4_bf_copy(void __iomem *dst, const void *src,
>   	__iowrite64_copy(dst, src, bytecnt / 8);
>   }
>   
> +static const struct skb_csum_offl_spec csum_offl_spec = {
> +	.ipv4_okay = 1,
> +	.ipv6_okay = 1,
> +	.encap_okay = 1,
> +	.tcp_okay = 1,
> +	.udp_okay = 1,
> +};
> +

The question I would have is if inner IPv6 checksum is supported by this 
driver.  The code before didn't seem to indicate it was, and after the 
csum_offl_spec would seem to indicate it is.  One of my concerns about a 
change like this is that it is likely prone to introduce regressions as 
features are going to be toggling due to interpretations of flags and 
assumptions about what is good for the outer headers is good for the 
inner ones.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable
  2015-10-05 23:39 [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable Tom Herbert
                   ` (2 preceding siblings ...)
  2015-10-05 23:39 ` [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability Tom Herbert
@ 2015-10-06 10:51 ` David Woodhouse
  2015-10-08 15:09 ` David Woodhouse
  4 siblings, 0 replies; 15+ messages in thread
From: David Woodhouse @ 2015-10-06 10:51 UTC (permalink / raw)
  To: Tom Herbert, davem, netdev; +Cc: kernel-team, ogerlitz

[-- Attachment #1: Type: text/plain, Size: 3106 bytes --]

On Mon, 2015-10-05 at 16:39 -0700, Tom Herbert wrote:
> When drivers have support for offloading transport IP checksums, they
> will indicate this in the features for the device. As described in
> skbuff.h, a driver will usually advertise NETIF_F_HW_CSUM,
> NETIF_F_IP_CSUM, and/or NETIF_F_IPV6_CSUM. The first of these
> (NETIF_F_HW_CSUM) is the preferred method since this implies that the
> device has implemented a generic checksum offload feature that should
> work under arbitrary scenarios (e.g. for different protocols, with or
> without encapsulation).
> 
> For narrow features support (NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM
> for offload of TCP/UDP), the features flags may not be sufficient to
> deduce whether a packet may be offloaded. Some devices will not be able
> to offload encapsulated checksums, some cannot offload transport
> checksums in packet with IPv6 extension headers, etc. In these cases
> a driver will need to perform additional packet inspection to determine
> if a packet's checksum can be offloaded to a device.
> 
> This patch defines a helper function that drivers can call to check if
> it is able to offload the checksum for a particular packet. In an
> argument to the function, the driver specifies what type of packets it
> is able to offload to a device. The function is intended to check for
> the most common restrictions of devices (like by IP version, transport
> protocol, encapsulation, extension headers). Since the function includes
> checks for IP version and transport protocol, the driver is able
> to advertise NETIF_F_HW_CSUM instead of protocol specific support.
> This should put us on a path to deprecate NETIF_F_IP_CSUM and
> NETIF_F_IPV6_CSUM.
> 
> The helper function may be called from ndo_features_check or the xmit
> routine of the driver. The csum_help argument does not need to be set
> when the function is called from ndo_features_check.

Looks good in general; thanks.

I do strongly believe we want to encourage your helper to be called
from .ndo_features_check(). Because if you can't do the checksum, then
you can't do TSO. And if you can't do TSO, you *really* want your
hard_start_xmit() function to be handed one skb at a time and not have
to call skb_gso_segment() for itself when it might not have enough room
in its descriptor ring for *all* the resulting segments.

Also, is your skb_csum_offload_chk() going to do the right thing for
SCTP packets? Looks like it doesn't do crc32...

And how accurate is the check we have, on the various IP output
routines, for rt->dst.dev->features & NETIF_F_xx_CSUM? Could we
consider just ditching that check and always using CHECKSUM_PARTIAL, in
the knowledge that we check it before it hits a non-capable driver
anyway? Or do we benefit from doing the software checksum early, due to
improved cache locality when we've probably just copied the data from
userspace in many cases?

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5691 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-06  4:03   ` Alexander Duyck
@ 2015-10-06 16:22     ` Tom Herbert
  2015-10-06 16:54       ` Alexander Duyck
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2015-10-06 16:22 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team,
	David Woodhouse, Or Gerlitz

On Mon, Oct 5, 2015 at 9:03 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On 10/05/2015 04:39 PM, Tom Herbert wrote:
>>
>> This provides an example of a driver calling the skb_csum_offload_check.
>>
>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>> ---
>>   drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  6 +++---
>>   drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 20 +++++++++++++++-----
>>   2 files changed, 18 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>> b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>> index 4726122..f2ed8d0 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>> @@ -2360,7 +2360,7 @@ out:
>>         }
>>         /* set offloads */
>> -       priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>> +       priv->dev->hw_enc_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
>>                                       NETIF_F_TSO |
>> NETIF_F_GSO_UDP_TUNNEL;
>>         priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
>>         priv->dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
>> @@ -2372,7 +2372,7 @@ static void mlx4_en_del_vxlan_offloads(struct
>> work_struct *work)
>>         struct mlx4_en_priv *priv = container_of(work, struct
>> mlx4_en_priv,
>>                                                  vxlan_del_task);
>>         /* unset offloads */
>> -       priv->dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>> +       priv->dev->hw_enc_features &= ~(NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
>>                                       NETIF_F_TSO |
>> NETIF_F_GSO_UDP_TUNNEL);
>>         priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
>>         priv->dev->features    &= ~NETIF_F_GSO_UDP_TUNNEL;
>> @@ -2943,7 +2943,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev,
>> int port,
>>         /*
>>          * Set driver features
>>          */
>> -       dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM |
>> NETIF_F_IPV6_CSUM;
>> +       dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM;
>>         if (mdev->LSO_support)
>>                 dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
>>   diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> index 494e776..f364ffd 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> @@ -702,6 +702,14 @@ static void mlx4_bf_copy(void __iomem *dst, const
>> void *src,
>>         __iowrite64_copy(dst, src, bytecnt / 8);
>>   }
>>   +static const struct skb_csum_offl_spec csum_offl_spec = {
>> +       .ipv4_okay = 1,
>> +       .ipv6_okay = 1,
>> +       .encap_okay = 1,
>> +       .tcp_okay = 1,
>> +       .udp_okay = 1,
>> +};
>> +
>
>
> The question I would have is if inner IPv6 checksum is supported by this
> driver.  The code before didn't seem to indicate it was, and after the
> csum_offl_spec would seem to indicate it is.  One of my concerns about a
> change like this is that it is likely prone to introduce regressions as
> features are going to be toggling due to interpretations of flags and
> assumptions about what is good for the outer headers is good for the inner
> ones.

Do you mean to say that there could be a device that supports an inner
and outer checksum for IPv4, but only an outer checksum for IPv6 and
not inner checksum?

Tom

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability
  2015-10-06  3:52   ` Alexander Duyck
@ 2015-10-06 16:31     ` Tom Herbert
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Herbert @ 2015-10-06 16:31 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team,
	David Woodhouse, Or Gerlitz

On Mon, Oct 5, 2015 at 8:52 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On 10/05/2015 04:39 PM, Tom Herbert wrote:
>>
>> Add skb_csum_offload_chk driver helper function to determine if a
>> device with limited checksum offload capabilities is able to offload the
>> checksum for a given packet.
>>
>> This patch includes:
>>    - The skb_csum_offload_chk function. Returns true if checksum is
>>      offloadable, else false. Optionally, in the case that the checksum
>>      is not offloable, the function can call skb_checksum_help to resolve
>>      the checksum.
>>    - Definition of skb_csum_offl_spec structure that caller uses to
>>      indicate rules about what it can offload (e.g. IPv4/v6, TCP/UDP only,
>>      whether encapsulated checksums can be offloaded, whether checksum
>> with
>>      IPv6 extension headers can be offloaded).
>>    - Definition of skb_csum_offl_params which is an output parameter of
>>      skb_csum_offload_check. This contains values discovered in the
>>      function's evaluation that may be useful to the caller for setting
>>      up checksum offload (e.g. offset of checksum start, whether the
>>      checksum is encpasulated).
>>
>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>> ---
>>   include/linux/netdevice.h |  47 +++++++++++++++++
>>   net/core/dev.c            | 129
>> ++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 176 insertions(+)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index b945078..ac65d9b 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2454,6 +2454,40 @@ static inline void skb_gro_remcsum_cleanup(struct
>> sk_buff *skb,
>>         remcsum_unadjust((__sum16 *)ptr, grc->delta);
>>   }
>>   +struct skb_csum_offl_spec {
>> +       __u16           ipv4_okay:1,
>> +                       ipv6_okay:1,
>> +                       encap_okay:1,
>> +                       ip_options_okay:1,
>> +                       ext_hdrs_okay:1,
>> +                       tcp_okay:1,
>> +                       udp_okay:1,
>> +                       sctp_okay:1;
>> +};
>
>
> I think the cases where things are 0 will be much easier to support with
> this API than the cases where they are 1.  For example I don't think we
> could use this API for something like fm10k where there are limits on the
> tunnels that can be supported, so in that case we would have to parse the
> tunnel out in the driver anyway.
>
Right, this helper is not intended to be comprehensive. I am hoping
that this will apply to most drivers that just support "simple"
checksum offload. Drivers with more complex constraints will need to
verify those themselves. Drivers that support devices that implement
NETIF_HW_CSUM shouldn't need to do any of this validation at all. In
any case, the ultimate responsibility for offloading a checksum
correctly (whenever xmit is called with CHECKSUM_PARTIAL) rests in the
driver.

Tom

>
>> +struct skb_csum_offl_params {
>> +       __u16           encapped_csum:1;
>> +       int protocol;
>> +       int ip_proto;
>> +       int network_hdr_offset;
>> +};
>> +
>> +bool __skb_csum_offload_chk(struct sk_buff *skb,
>> +                           const struct skb_csum_offl_spec *spec,
>> +                           struct skb_csum_offl_params *params,
>> +                           bool csum_help);
>> +
>> +static inline bool skb_csum_offload_chk(struct sk_buff *skb,
>> +                                       const struct skb_csum_offl_spec
>> *spec,
>> +                                       struct skb_csum_offl_params
>> *params,
>> +                                       bool csum_help)
>> +{
>> +       if (skb->ip_summed != CHECKSUM_PARTIAL)
>> +               return false;
>> +
>> +       return __skb_csum_offload_chk(skb, spec, params, csum_help);
>> +}
>> +
>>   static inline int dev_hard_header(struct sk_buff *skb, struct net_device
>> *dev,
>>                                   unsigned short type,
>>                                   const void *daddr, const void *saddr,
>> @@ -3632,6 +3666,19 @@ static inline bool
>> can_checksum_protocol(netdev_features_t features,
>>                  protocol == htons(ETH_P_FCOE)));
>>   }
>>   +/* Map an ethertype into IP protocol if possible */
>> +static inline int eproto_to_ipproto(int eproto)
>> +{
>> +       switch (eproto) {
>> +       case htons(ETH_P_IP):
>> +               return IPPROTO_IP;
>> +       case htons(ETH_P_IPV6):
>> +               return IPPROTO_IPV6;
>> +       default:
>> +               return -1;
>> +       }
>> +}
>> +
>>   #ifdef CONFIG_BUG
>>   void netdev_rx_csum_fault(struct net_device *dev);
>>   #else
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index a229bf0..e7365ca 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -136,6 +136,7 @@
>>   #include <linux/errqueue.h>
>>   #include <linux/hrtimer.h>
>>   #include <linux/netfilter_ingress.h>
>> +#include <linux/sctp.h>
>>     #include "net-sysfs.h"
>>   @@ -2440,6 +2441,134 @@ out:
>>   }
>>   EXPORT_SYMBOL(skb_checksum_help);
>>   +/* skb_csum_offload_check - Driver helper function to determine if a
>> device
>> + * with limited checksum offload capabilities is able to offload the
>> checksum
>> + * for a given packet.
>> + *
>> + * Arguments:
>> + *   skb - sk_buff for the packet in question
>> + *   spec - contains the description of what device can offload
>> + *   params - contains result of parsing the packet that may be of
>> + *           use to the driver for setting up checksum offload
>> + *   checksum_help - when set indicates that helper function should
>> + *            call skb_checksum_help if offload checks fail
>> + *
>> + * Returns:
>> + *   true: Packet has passed the checksum checks and should be
>> offloadable to
>> + *         the device (a driver may still need to check for additional
>> + *         restrictions of its device)
>> + *   false: Checksum is not offloadable. If checksum_help was set then
>> + *         skb_checksum_help was called to resolve checksum for non-GSO
>> + *         packets and when IP protocol is not SCTP
>> + */
>> +bool __skb_csum_offload_chk(struct sk_buff *skb,
>> +                           const struct skb_csum_offl_spec *spec,
>> +                           struct skb_csum_offl_params *params,
>> +                           bool csum_help)
>> +{
>> +       struct iphdr *iph;
>> +       struct ipv6hdr *ipv6;
>> +       void *nhdr;
>> +       int protocol = -1;
>> +       u8 ip_proto;
>> +
>> +       memset(params, 0, sizeof(*params));
>> +
>> +       /* We are assuming that checksum start must coincide with the
>> inner or
>> +        * outer transport header. This is true for TCP, UDP, and SCTP--
>> but if
>> +        * support is ever added for GRE checksum offload this would not
>> hold.
>> +        */
>> +       if (skb_checksum_start_offset(skb) == skb_transport_offset(skb)) {
>> +               /* Non-encapsulated checksum */
>> +               protocol = eproto_to_ipproto(skb->protocol);
>
>
> This bit is overlooking VLANs.  You should probably be using
> vlan_get_protocol(skb) instead of skb->protocol.
>
>> +               params->protocol = protocol;
>> +               params->network_hdr_offset = skb_network_offset(skb);
>> +               nhdr = skb_network_header(skb);
>> +       } else if (skb->encapsulation && spec->encap_okay &&
>> +                  skb_checksum_start_offset(skb) ==
>> +                  skb_inner_transport_offset(skb)) {
>> +               /* Encapsulated checksum */
>> +               params->encapped_csum = 1;
>> +               switch (skb->inner_protocol_type) {
>> +               case ENCAP_TYPE_ETHER:
>> +                       protocol = eproto_to_ipproto(skb->inner_protocol);
>
>
> Probably something similar here, but I don't know if you would be even
> asking for offloads on a VLAN packet being passed inside of a tunnel.
>
>
>> +                       break;
>> +               case ENCAP_TYPE_IPPROTO:
>> +                       protocol = skb->inner_protocol;
>> +                       break;
>> +               }
>> +               params->protocol = protocol;
>> +               params->network_hdr_offset =
>> skb_inner_network_offset(skb);
>> +               nhdr = skb_inner_network_header(skb);
>> +       } else {
>> +               goto need_help;
>> +       }
>> +
>> +       switch (protocol) {
>> +       case IPPROTO_IP:
>> +               if (!spec->ipv4_okay)
>> +                       goto need_help;
>> +               iph = nhdr;
>> +               ip_proto = iph->protocol;
>> +               if (iph->ihl != 5 && !spec->ip_options_okay)
>> +                       goto need_help;
>> +               break;
>> +       case IPPROTO_IPV6:
>> +               if (!spec->ipv6_okay)
>> +                       goto need_help;
>> +               ipv6 = nhdr;
>> +               nhdr += sizeof(*ipv6);
>> +               ip_proto = ipv6->nexthdr;
>> +               break;
>> +       default:
>> +               goto need_help;
>> +       }
>> +
>> +ip_proto_again:
>> +       switch (ip_proto) {
>> +       case IPPROTO_TCP:
>> +               if (!spec->tcp_okay ||
>> +                   skb->csum_offset != offsetof(struct tcphdr, check))
>> +                       goto need_help;
>> +               break;
>> +       case IPPROTO_UDP:
>> +               if (!spec->udp_okay ||
>> +                   skb->csum_offset != offsetof(struct udphdr, check))
>> +                       goto need_help;
>> +               break;
>> +       case IPPROTO_SCTP:
>> +               if (!spec->sctp_okay ||
>> +                   skb->csum_offset != offsetof(struct sctphdr,
>> checksum))
>> +                       goto cant_help;
>> +               break;
>> +       case NEXTHDR_HOP:
>> +       case NEXTHDR_ROUTING:
>> +       case NEXTHDR_DEST: {
>> +               u8 *opthdr = nhdr;
>> +
>> +               if (protocol != IPPROTO_IPV6 || !spec->ext_hdrs_okay)
>> +                       goto need_help;
>> +
>> +               ip_proto = opthdr[0];
>> +               nhdr += (opthdr[1] + 1) << 3;
>> +
>> +               goto ip_proto_again;
>> +       }
>> +       default:
>> +               goto need_help;
>> +       }
>> +
>> +       /* Passed the tests */
>> +       return true;
>> +
>> +need_help:
>> +       if (csum_help && !skb_shinfo(skb)->gso_size)
>> +               skb_checksum_help(skb);
>> +cant_help:
>> +       return false;
>> +}
>> +EXPORT_SYMBOL(__skb_csum_offload_chk);
>> +
>>   __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
>>   {
>>         __be16 type = skb->protocol;
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-06 16:22     ` Tom Herbert
@ 2015-10-06 16:54       ` Alexander Duyck
  0 siblings, 0 replies; 15+ messages in thread
From: Alexander Duyck @ 2015-10-06 16:54 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team,
	David Woodhouse, Or Gerlitz

On 10/06/2015 09:22 AM, Tom Herbert wrote:
> On Mon, Oct 5, 2015 at 9:03 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On 10/05/2015 04:39 PM, Tom Herbert wrote:
>>> This provides an example of a driver calling the skb_csum_offload_check.
>>>
>>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>>> ---
>>>    drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  6 +++---
>>>    drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 20 +++++++++++++++-----
>>>    2 files changed, 18 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>>> b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>>> index 4726122..f2ed8d0 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
>>> @@ -2360,7 +2360,7 @@ out:
>>>          }
>>>          /* set offloads */
>>> -       priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>>> +       priv->dev->hw_enc_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
>>>                                        NETIF_F_TSO |
>>> NETIF_F_GSO_UDP_TUNNEL;
>>>          priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
>>>          priv->dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
>>> @@ -2372,7 +2372,7 @@ static void mlx4_en_del_vxlan_offloads(struct
>>> work_struct *work)
>>>          struct mlx4_en_priv *priv = container_of(work, struct
>>> mlx4_en_priv,
>>>                                                   vxlan_del_task);
>>>          /* unset offloads */
>>> -       priv->dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>>> +       priv->dev->hw_enc_features &= ~(NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
>>>                                        NETIF_F_TSO |
>>> NETIF_F_GSO_UDP_TUNNEL);
>>>          priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
>>>          priv->dev->features    &= ~NETIF_F_GSO_UDP_TUNNEL;
>>> @@ -2943,7 +2943,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev,
>>> int port,
>>>          /*
>>>           * Set driver features
>>>           */
>>> -       dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM |
>>> NETIF_F_IPV6_CSUM;
>>> +       dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM;
>>>          if (mdev->LSO_support)
>>>                  dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
>>>    diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>>> b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>>> index 494e776..f364ffd 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>>> @@ -702,6 +702,14 @@ static void mlx4_bf_copy(void __iomem *dst, const
>>> void *src,
>>>          __iowrite64_copy(dst, src, bytecnt / 8);
>>>    }
>>>    +static const struct skb_csum_offl_spec csum_offl_spec = {
>>> +       .ipv4_okay = 1,
>>> +       .ipv6_okay = 1,
>>> +       .encap_okay = 1,
>>> +       .tcp_okay = 1,
>>> +       .udp_okay = 1,
>>> +};
>>> +
>>
>> The question I would have is if inner IPv6 checksum is supported by this
>> driver.  The code before didn't seem to indicate it was, and after the
>> csum_offl_spec would seem to indicate it is.  One of my concerns about a
>> change like this is that it is likely prone to introduce regressions as
>> features are going to be toggling due to interpretations of flags and
>> assumptions about what is good for the outer headers is good for the inner
>> ones.
> Do you mean to say that there could be a device that supports an inner
> and outer checksum for IPv4, but only an outer checksum for IPv6 and
> not inner checksum?
>
> Tom

Yes, that is what I mean.  The fact is hardware designs are often short 
sighted like that.  Somebody may have decided to save a few gates by 
only supporting IPv4 because somebody somewhere didn't make it a hard 
requirement to support IPv6, or perhaps the implementation wasn't quite 
right and instead of spinning a new silicon they decided to de-feature 
IPv6 inner checksum offload.

I don't know if that is the case for the mlx4, maybe it is just a driver 
oversight, but the fact that it didn't list IPv6 as being a supported 
encapsulation before kind of implies that it doesn't support TCP/UDP 
checksums on top of encapsulated IPv6.

- Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-05 23:39 ` [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability Tom Herbert
  2015-10-06  4:03   ` Alexander Duyck
@ 2015-10-07 15:41   ` Or Gerlitz
  2015-10-07 18:07     ` Tom Herbert
  1 sibling, 1 reply; 15+ messages in thread
From: Or Gerlitz @ 2015-10-07 15:41 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, kernel-team, dwmw2

On 10/6/2015 2:39 AM, Tom Herbert wrote:
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index 494e776..f364ffd 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -702,6 +702,14 @@ static void mlx4_bf_copy(void __iomem *dst, const void *src,
>   	__iowrite64_copy(dst, src, bytecnt / 8);
>   }
>   
> +static const struct skb_csum_offl_spec csum_offl_spec = {
> +	.ipv4_okay = 1,
> +	.ipv6_okay = 1,
> +	.encap_okay = 1,
> +	.tcp_okay = 1,
> +	.udp_okay = 1,
> +};
> +
>   netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>   {
>   	struct skb_shared_info *shinfo = skb_shinfo(skb);
> @@ -727,6 +735,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>   	bool stop_queue;
>   	bool inline_ok;
>   	u32 ring_cons;
> +	struct skb_csum_offl_params offl_params;
>   
>   	if (!priv->port_up)
>   		goto tx_drop;
> @@ -853,8 +862,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>   	/* Prepare ctrl segement apart opcode+ownership, which depends on
>   	 * whether LSO is used */
>   	tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
> -	if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
> -		if (!skb->encapsulation)
> +	if (skb_csum_offload_chk(skb, &csum_offl_spec,
> +				 &offl_params, true)) {
> +		if (!offl_params.encapped_csum)
>   			tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
>   								 MLX4_WQE_CTRL_TCP_UDP_CSUM);
>   		else
> @@ -912,9 +922,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>   		build_inline_wqe(tx_desc, skb, shinfo, real_size, &vlan_tag,
>   				 tx_ind, fragptr);
>   
> -	if (skb->encapsulation) {
> -		struct iphdr *ipv4 = (struct iphdr *)skb_inner_network_header(skb);
> -		if (ipv4->protocol == IPPROTO_TCP || ipv4->protocol == IPPROTO_UDP)
> +	if (!offl_params.encapped_csum) {
> +		if (offl_params.ip_proto == IPPROTO_TCP ||
> +		    offl_params.ip_proto == IPPROTO_UDP)
>   			op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP | MLX4_WQE_CTRL_ILP);
>   		else
>   			op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP);
> -- 2.4.6
few quick comments

not all mlx4 devices support offloading checksum for tunneled traffic, 
only ConnectX3-pro -- this translates for the mlx4 EN driver
logic that determines that on startup, e.g only when 
mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN
we set hw_enc_features and friends -- in that respect, I wasn't sure if 
it's okay to blindly set  .encap_oka = 1 here.

Another constraint, is that when the device does support (say) TCP TX 
checksum offload for the inner packet they don't support UDP
checksum offload for the outer packet.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-07 15:41   ` Or Gerlitz
@ 2015-10-07 18:07     ` Tom Herbert
  2015-10-08 21:39       ` Or Gerlitz
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2015-10-07 18:07 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team,
	David Woodhouse

On Wed, Oct 7, 2015 at 8:41 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> On 10/6/2015 2:39 AM, Tom Herbert wrote:
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> index 494e776..f364ffd 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> @@ -702,6 +702,14 @@ static void mlx4_bf_copy(void __iomem *dst, const
>> void *src,
>>         __iowrite64_copy(dst, src, bytecnt / 8);
>>   }
>>   +static const struct skb_csum_offl_spec csum_offl_spec = {
>> +       .ipv4_okay = 1,
>> +       .ipv6_okay = 1,
>> +       .encap_okay = 1,
>> +       .tcp_okay = 1,
>> +       .udp_okay = 1,
>> +};
>> +
>>   netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>>   {
>>         struct skb_shared_info *shinfo = skb_shinfo(skb);
>> @@ -727,6 +735,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct
>> net_device *dev)
>>         bool stop_queue;
>>         bool inline_ok;
>>         u32 ring_cons;
>> +       struct skb_csum_offl_params offl_params;
>>         if (!priv->port_up)
>>                 goto tx_drop;
>> @@ -853,8 +862,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct
>> net_device *dev)
>>         /* Prepare ctrl segement apart opcode+ownership, which depends on
>>          * whether LSO is used */
>>         tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
>> -       if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
>> -               if (!skb->encapsulation)
>> +       if (skb_csum_offload_chk(skb, &csum_offl_spec,
>> +                                &offl_params, true)) {
>> +               if (!offl_params.encapped_csum)
>>                         tx_desc->ctrl.srcrb_flags |=
>> cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
>>
>> MLX4_WQE_CTRL_TCP_UDP_CSUM);
>>                 else
>> @@ -912,9 +922,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct
>> net_device *dev)
>>                 build_inline_wqe(tx_desc, skb, shinfo, real_size,
>> &vlan_tag,
>>                                  tx_ind, fragptr);
>>   -     if (skb->encapsulation) {
>> -               struct iphdr *ipv4 = (struct iphdr
>> *)skb_inner_network_header(skb);
>> -               if (ipv4->protocol == IPPROTO_TCP || ipv4->protocol ==
>> IPPROTO_UDP)
>> +       if (!offl_params.encapped_csum) {
>> +               if (offl_params.ip_proto == IPPROTO_TCP ||
>> +                   offl_params.ip_proto == IPPROTO_UDP)
>>                         op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP |
>> MLX4_WQE_CTRL_ILP);
>>                 else
>>                         op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP);
>> -- 2.4.6
>
> few quick comments
>
> not all mlx4 devices support offloading checksum for tunneled traffic, only
> ConnectX3-pro -- this translates for the mlx4 EN driver
> logic that determines that on startup, e.g only when
> mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN
> we set hw_enc_features and friends -- in that respect, I wasn't sure if it's
> okay to blindly set  .encap_oka = 1 here.
>
Or, I would only give the mlnx support as an example. I think driver
owners would need to implement the specification for the their
devices.

> Another constraint, is that when the device does support (say) TCP TX
> checksum offload for the inner packet they don't support UDP
> checksum offload for the outer packet.

We can add such things to the specification. One value I see in having
a common structure to describe the checksum capabilities is that
becomes a way to clearly document what is (and is not) supported by
devices.

btw, I don't quite understand your example. If a device does not
support UDP checksum there is a flag for that in the specification. If
the stack sends a TCP packet encapsulated in a UDP packet with UDP
checksum enabled, it will try to offload the UDP checksum and not the
TCP one. There is currently no interface to offload two checksums in
the same packet (in non-GRO at least) and with things like RCO we
probably will never need that anyway.

Tom

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable
  2015-10-05 23:39 [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable Tom Herbert
                   ` (3 preceding siblings ...)
  2015-10-06 10:51 ` [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable David Woodhouse
@ 2015-10-08 15:09 ` David Woodhouse
  2015-10-08 15:48   ` Tom Herbert
  4 siblings, 1 reply; 15+ messages in thread
From: David Woodhouse @ 2015-10-08 15:09 UTC (permalink / raw)
  To: Tom Herbert, davem, netdev; +Cc: kernel-team, ogerlitz

[-- Attachment #1: Type: text/plain, Size: 429 bytes --]

On Mon, 2015-10-05 at 16:39 -0700, Tom Herbert wrote:
> This patch defines a helper function that drivers can call to check 
> if it is able to offload the checksum for a particular packet.

It occurs to me that by itself, this doesn't actually fix the problem.
We'd then need to go over all drivers which currently use
NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM, and convert them. Would that be
the intention?

-- 
dwmw2


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5691 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable
  2015-10-08 15:09 ` David Woodhouse
@ 2015-10-08 15:48   ` Tom Herbert
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Herbert @ 2015-10-08 15:48 UTC (permalink / raw)
  To: David Woodhouse
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team,
	Or Gerlitz

On Thu, Oct 8, 2015 at 8:09 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Mon, 2015-10-05 at 16:39 -0700, Tom Herbert wrote:
>> This patch defines a helper function that drivers can call to check
>> if it is able to offload the checksum for a particular packet.
>
> It occurs to me that by itself, this doesn't actually fix the problem.
> We'd then need to go over all drivers which currently use
> NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM, and convert them. Would that be
> the intention?
>
In time, yes. But in the near future, I would like to be able to state
that NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated and
that new drivers can only implement NETIF_F_HW_CSUM interface.
Hopefully, this will encourage HW vendors to implement the generic,
non-protocol specific checksum algorithms in new devices.

Tom

> --
> dwmw2
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability
  2015-10-07 18:07     ` Tom Herbert
@ 2015-10-08 21:39       ` Or Gerlitz
  0 siblings, 0 replies; 15+ messages in thread
From: Or Gerlitz @ 2015-10-08 21:39 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Or Gerlitz, David S. Miller, Linux Kernel Network Developers,
	Kernel Team, David Woodhouse

On Wed, Oct 7, 2015 at 9:07 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Wed, Oct 7, 2015 at 8:41 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>> On 10/6/2015 2:39 AM, Tom Herbert wrote:

>>>   +static const struct skb_csum_offl_spec csum_offl_spec = {
>>> +       .ipv4_okay = 1,
>>> +       .ipv6_okay = 1,
>>> +       .encap_okay = 1,
>>> +       .tcp_okay = 1,
>>> +       .udp_okay = 1,
>>> +};
>>> +
[...]


> Or, I would only give the mlnx support as an example. I think driver
> owners would need to implement the specification for the their devices.

sure, sorry to bother on that.

>> Another constraint, is that when the device does support (say) TCP TX
>> checksum offload for the inner packet they don't support UDP
>> checksum offload for the outer packet.

> We can add such things to the specification. One value I see in having
> a common structure to describe the checksum capabilities is that
> becomes a way to clearly document what is (and is not) supported by
> devices.

> btw, I don't quite understand your example. If a device does not
> support UDP checksum there is a flag for that in the specification.

But we do support UDP checksum generation for not-tunneled packets, so
the specification should somehow capture this combination.

> If the stack sends a TCP packet encapsulated in a UDP packet with UDP
> checksum enabled, it will try to offload the UDP checksum and not the
> TCP one.

Not following... who is "it" in this sentence, the stack or the device?

We don't advertise NETIF_F_GSO_UDP_TUNNEL_CSUM so for UDP tunning
GSO-ed packets  we should be fine. For non GSO... you say this works
only b/c the default for the vxlan driver is to use zero udp checksum?

> There is currently no interface to offload two checksums in
> the same packet (in non-GRO at least)

GRO? aren't we talking on xmit?

> and with things like RCO we probably will never need that anyway.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-10-08 21:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-05 23:39 [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable Tom Herbert
2015-10-05 23:39 ` [PATCH RFC 1/3] net: Add skb_inner_transport_offset function Tom Herbert
2015-10-05 23:39 ` [PATCH RFC 2/3] net: Add driver helper function to determine checksum offloadability Tom Herbert
2015-10-06  3:52   ` Alexander Duyck
2015-10-06 16:31     ` Tom Herbert
2015-10-05 23:39 ` [PATCH RFC 3/3] mlx4: Call skb_csum_offload_check to check offloadability Tom Herbert
2015-10-06  4:03   ` Alexander Duyck
2015-10-06 16:22     ` Tom Herbert
2015-10-06 16:54       ` Alexander Duyck
2015-10-07 15:41   ` Or Gerlitz
2015-10-07 18:07     ` Tom Herbert
2015-10-08 21:39       ` Or Gerlitz
2015-10-06 10:51 ` [PATCH RFC 0/3] net: Add driver helper function to determine if checksum is offloadable David Woodhouse
2015-10-08 15:09 ` David Woodhouse
2015-10-08 15:48   ` Tom Herbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.