linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation
@ 2012-12-08  0:14 Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 1/5] net: " Joseph Gasparakis
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Joseph Gasparakis @ 2012-12-08  0:14 UTC (permalink / raw)
  To: davem, shemminger, chrisw, gospo
  Cc: Joseph Gasparakis, netdev, linux-kernel, dmitry, saeed.bishara,
	bhutchings

The series contains updates to add in the NIC Rx and Tx checksumming support
for encapsulated packets.

The sk_buff needs to carry information about the inner packet, and adding
three fields for the inner mac, network and transport headers was the preferred
approach.

Not adding these fields would mean that the drivers would need to parse the
sk_buff data in the hot path, with a negative impact on performance.

Adding to sk_buff a pointer to the sk_buff of the inner packet made sense, but
would be a complicated change, as assumptions would need to be made with regard
to helper functions such as skb_clone() and skb_copy(). The code for the
existing encapsulation protocols (such as VXLAN and IP GRE) would also have to
be reworked, so the decision was to take the simple approach of adding these
three fields.
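
A minimal userspace sketch of the chosen approach, assuming a simplified
stand-in for sk_buff (struct mini_skb and the mini_ helpers are hypothetical
names used only for illustration, not kernel API). The inner header positions
are stored as offsets from the buffer head, so a driver can reach the inner
headers without parsing the packet:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff (illustration only). */
struct mini_skb {
	unsigned char *head;            /* start of the buffer */
	unsigned char *data;            /* start of the current packet data */
	size_t inner_network_header;    /* offset from head */
	size_t inner_transport_header;  /* offset from head */
};

/* Record the inner network header at data + offset. */
static inline void mini_set_inner_network_header(struct mini_skb *skb,
						 size_t offset)
{
	skb->inner_network_header = (size_t)(skb->data - skb->head) + offset;
}

/* Record the inner transport header at data + offset. */
static inline void mini_set_inner_transport_header(struct mini_skb *skb,
						   size_t offset)
{
	skb->inner_transport_header = (size_t)(skb->data - skb->head) + offset;
}

/* Pointer to the inner network header, with no packet parsing involved. */
static inline unsigned char *mini_inner_network_header(const struct mini_skb *skb)
{
	return skb->head + skb->inner_network_header;
}

/* Length of the inner network header, derived from the two offsets. */
static inline size_t mini_inner_network_header_len(const struct mini_skb *skb)
{
	return skb->inner_transport_header - skb->inner_network_header;
}
```

With the offsets recorded once by the encapsulating protocol, every later
lookup is a single addition rather than a walk over the outer headers.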

v2 Makes sure that checksumming for IP GRE does not take place if the offload
   flag is set in the skb's netdev features

v3 Fixes issues picked up by the community in v2 and is intended to provide
   the ability to demo vxlan Tx offloading with Intel's ixgbe. As part of this,
   it provides an RFC patch for ixgbe to take advantage of the offloading
   mechanism.

   Now it is possible to create a vxlan interface like this:
    #ip link add vxlan0 type vxlan id 40 ttl 10 group 239.1.1.1 dev eth0

   Then turn on/off the encapsulation offload mechanism by doing:
    #ethtool -K eth0 tx-checksum-ip-generic on

   In v3 the ipgre work was paused (so those patches are not included) and I
   will come back to it when vxlan is accepted by the community.

v4 Added more detailed commit logs and code comments, as requested in v3.
   The Rx offload encapsulation patch is now also included in the series.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
  2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
@ 2012-12-08  0:14 ` Joseph Gasparakis
  2012-12-10 10:04   ` saeed bishara
  2012-12-08  0:14 ` [PATCH v4 2/5] net: Handle encapsulated offloads before fragmentation or handing to lower dev Joseph Gasparakis
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Joseph Gasparakis @ 2012-12-08  0:14 UTC (permalink / raw)
  To: davem, shemminger, chrisw, gospo
  Cc: Joseph Gasparakis, netdev, linux-kernel, dmitry, saeed.bishara,
	bhutchings, Peter P Waskiewicz Jr, Alexander Duyck

This patch adds kernel support for offloading Tx and Rx checksumming of
encapsulated packets (such as VXLAN and IP GRE) to the NIC.

For Tx encapsulation offload, the NIC driver will need to set the right bits
in netdev->hw_enc_features. The protocol driver will have to set the
skb->encapsulation bit and populate the inner headers, so that the NIC driver
can use those inner headers to calculate the csum in hardware.

For Rx encapsulation offload, the NIC driver will again need to set the
skb->encapsulation flag and set skb->ip_summed to CHECKSUM_UNNECESSARY.
In that case the protocol driver should push the decapsulated packet up
the stack, again with CHECKSUM_UNNECESSARY. In either case, the protocol
driver should set the skb->encapsulation flag back to zero. Finally, the
protocol driver should have the NETIF_F_RXCSUM flag set in its features.

Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/ip.h        |  5 +++
 include/linux/ipv6.h      |  5 +++
 include/linux/netdevice.h |  6 +++
 include/linux/skbuff.h    | 95 ++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/tcp.h       | 10 +++++
 include/linux/udp.h       |  5 +++
 net/core/skbuff.c         |  9 +++++
 7 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/include/linux/ip.h b/include/linux/ip.h
index 58b82a2..492bc65 100644
--- a/include/linux/ip.h
+++ b/include/linux/ip.h
@@ -25,6 +25,11 @@ static inline struct iphdr *ip_hdr(const struct sk_buff *skb)
 	return (struct iphdr *)skb_network_header(skb);
 }
 
+static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
+{
+	return (struct iphdr *)skb_inner_network_header(skb);
+}
+
 static inline struct iphdr *ipip_hdr(const struct sk_buff *skb)
 {
 	return (struct iphdr *)skb_transport_header(skb);
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 12729e9..faed1e3 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -67,6 +67,11 @@ static inline struct ipv6hdr *ipv6_hdr(const struct sk_buff *skb)
 	return (struct ipv6hdr *)skb_network_header(skb);
 }
 
+static inline struct ipv6hdr *inner_ipv6_hdr(const struct sk_buff *skb)
+{
+	return (struct ipv6hdr *)skb_inner_network_header(skb);
+}
+
 static inline struct ipv6hdr *ipipv6_hdr(const struct sk_buff *skb)
 {
 	return (struct ipv6hdr *)skb_transport_header(skb);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 18c5dc9..c6a14d4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1063,6 +1063,12 @@ struct net_device {
 	netdev_features_t	wanted_features;
 	/* mask of features inheritable by VLAN devices */
 	netdev_features_t	vlan_features;
+	/* mask of features inherited by encapsulating devices
+	 * This field indicates what encapsulation offloads
+	 * the hardware is capable of doing, and drivers will
+	 * need to set them appropriately.
+	 */
+	netdev_features_t	hw_enc_features;
 
 	/* Interface index. Unique device identifier	*/
 	int			ifindex;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f2af494..320e976 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -376,6 +376,8 @@ typedef unsigned char *sk_buff_data_t;
  *	@mark: Generic packet mark
  *	@dropcount: total number of sk_receive_queue overflows
  *	@vlan_tci: vlan tag control information
+ *	@inner_transport_header: Inner transport layer header (encapsulation)
+ *	@inner_network_header: Inner network layer header (encapsulation)
  *	@transport_header: Transport layer header
  *	@network_header: Network layer header
  *	@mac_header: Link layer header
@@ -471,7 +473,13 @@ struct sk_buff {
 	__u8			wifi_acked:1;
 	__u8			no_fcs:1;
 	__u8			head_frag:1;
-	/* 8/10 bit hole (depending on ndisc_nodetype presence) */
+	/* Encapsulation protocol and NIC drivers should use
+	 * this flag to indicate to each other whether the skb
+	 * contains an encapsulated packet, and may use the inner
+	 * packet headers if needed
+	 */
+	__u8			encapsulation:1;
+	/* 7/9 bit hole (depending on ndisc_nodetype presence) */
 	kmemcheck_bitfield_end(flags2);
 
 #ifdef CONFIG_NET_DMA
@@ -486,6 +494,8 @@ struct sk_buff {
 		__u32		avail_size;
 	};
 
+	sk_buff_data_t		inner_transport_header;
+	sk_buff_data_t		inner_network_header;
 	sk_buff_data_t		transport_header;
 	sk_buff_data_t		network_header;
 	sk_buff_data_t		mac_header;
@@ -1435,12 +1445,53 @@ static inline void skb_reserve(struct sk_buff *skb, int len)
 	skb->tail += len;
 }
 
+static inline void skb_reset_inner_headers(struct sk_buff *skb)
+{
+	skb->inner_network_header = skb->network_header;
+	skb->inner_transport_header = skb->transport_header;
+}
+
 static inline void skb_reset_mac_len(struct sk_buff *skb)
 {
 	skb->mac_len = skb->network_header - skb->mac_header;
 }
 
 #ifdef NET_SKBUFF_DATA_USES_OFFSET
+static inline unsigned char *skb_inner_transport_header(const struct sk_buff
+							*skb)
+{
+	return skb->head + skb->inner_transport_header;
+}
+
+static inline void skb_reset_inner_transport_header(struct sk_buff *skb)
+{
+	skb->inner_transport_header = skb->data - skb->head;
+}
+
+static inline void skb_set_inner_transport_header(struct sk_buff *skb,
+						   const int offset)
+{
+	skb_reset_inner_transport_header(skb);
+	skb->inner_transport_header += offset;
+}
+
+static inline unsigned char *skb_inner_network_header(const struct sk_buff *skb)
+{
+	return skb->head + skb->inner_network_header;
+}
+
+static inline void skb_reset_inner_network_header(struct sk_buff *skb)
+{
+	skb->inner_network_header = skb->data - skb->head;
+}
+
+static inline void skb_set_inner_network_header(struct sk_buff *skb,
+						const int offset)
+{
+	skb_reset_inner_network_header(skb);
+	skb->inner_network_header += offset;
+}
+
 static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
 {
 	return skb->head + skb->transport_header;
@@ -1496,6 +1547,38 @@ static inline void skb_set_mac_header(struct sk_buff *skb, const int offset)
 }
 
 #else /* NET_SKBUFF_DATA_USES_OFFSET */
+static inline unsigned char *skb_inner_transport_header(const struct sk_buff
+							*skb)
+{
+	return skb->inner_transport_header;
+}
+
+static inline void skb_reset_inner_transport_header(struct sk_buff *skb)
+{
+	skb->inner_transport_header = skb->data;
+}
+
+static inline void skb_set_inner_transport_header(struct sk_buff *skb,
+						   const int offset)
+{
+	skb->inner_transport_header = skb->data + offset;
+}
+
+static inline unsigned char *skb_inner_network_header(const struct sk_buff *skb)
+{
+	return skb->inner_network_header;
+}
+
+static inline void skb_reset_inner_network_header(struct sk_buff *skb)
+{
+	skb->inner_network_header = skb->data;
+}
+
+static inline void skb_set_inner_network_header(struct sk_buff *skb,
+						const int offset)
+{
+	skb->inner_network_header = skb->data + offset;
+}
 
 static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
 {
@@ -1574,11 +1657,21 @@ static inline u32 skb_network_header_len(const struct sk_buff *skb)
 	return skb->transport_header - skb->network_header;
 }
 
+static inline u32 skb_inner_network_header_len(const struct sk_buff *skb)
+{
+	return skb->inner_transport_header - skb->inner_network_header;
+}
+
 static inline int skb_network_offset(const struct sk_buff *skb)
 {
 	return skb_network_header(skb) - skb->data;
 }
 
+static inline int skb_inner_network_offset(const struct sk_buff *skb)
+{
+	return skb_inner_network_header(skb) - skb->data;
+}
+
 static inline int pskb_network_may_pull(struct sk_buff *skb, unsigned int len)
 {
 	return pskb_may_pull(skb, skb_network_offset(skb) + len);
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 60b7aac..4e1d228 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -35,6 +35,16 @@ static inline unsigned int tcp_hdrlen(const struct sk_buff *skb)
 	return tcp_hdr(skb)->doff * 4;
 }
 
+static inline struct tcphdr *inner_tcp_hdr(const struct sk_buff *skb)
+{
+	return (struct tcphdr *)skb_inner_transport_header(skb);
+}
+
+static inline unsigned int inner_tcp_hdrlen(const struct sk_buff *skb)
+{
+	return inner_tcp_hdr(skb)->doff * 4;
+}
+
 static inline unsigned int tcp_optlen(const struct sk_buff *skb)
 {
 	return (tcp_hdr(skb)->doff - 5) * 4;
diff --git a/include/linux/udp.h b/include/linux/udp.h
index 0b67d77..9d81de1 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -27,6 +27,11 @@ static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
 	return (struct udphdr *)skb_transport_header(skb);
 }
 
+static inline struct udphdr *inner_udp_hdr(const struct sk_buff *skb)
+{
+	return (struct udphdr *)skb_inner_transport_header(skb);
+}
+
 #define UDP_HTABLE_SIZE_MIN		(CONFIG_BASE_SMALL ? 128 : 256)
 
 static inline int udp_hashfn(struct net *net, unsigned num, unsigned mask)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 880722e2..ccbabf5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -682,11 +682,14 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->transport_header	= old->transport_header;
 	new->network_header	= old->network_header;
 	new->mac_header		= old->mac_header;
+	new->inner_transport_header = old->inner_transport_header;
+	new->inner_network_header = old->inner_network_header;
 	skb_dst_copy(new, old);
 	new->rxhash		= old->rxhash;
 	new->ooo_okay		= old->ooo_okay;
 	new->l4_rxhash		= old->l4_rxhash;
 	new->no_fcs		= old->no_fcs;
+	new->encapsulation	= old->encapsulation;
 #ifdef CONFIG_XFRM
 	new->sp			= secpath_get(old->sp);
 #endif
@@ -892,6 +895,8 @@ static void copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->network_header   += offset;
 	if (skb_mac_header_was_set(new))
 		new->mac_header	      += offset;
+	new->inner_transport_header += offset;
+	new->inner_network_header   += offset;
 #endif
 	skb_shinfo(new)->gso_size = skb_shinfo(old)->gso_size;
 	skb_shinfo(new)->gso_segs = skb_shinfo(old)->gso_segs;
@@ -1089,6 +1094,8 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 	skb->network_header   += off;
 	if (skb_mac_header_was_set(skb))
 		skb->mac_header += off;
+	skb->inner_transport_header += off;
+	skb->inner_network_header += off;
 	/* Only adjust this if it actually is csum_start rather than csum */
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
 		skb->csum_start += nhead;
@@ -1188,6 +1195,8 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 	n->network_header   += off;
 	if (skb_mac_header_was_set(skb))
 		n->mac_header += off;
+	n->inner_transport_header += off;
+	n->inner_network_header	   += off;
 #endif
 
 	return n;
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/5] net: Handle encapsulated offloads before fragmentation or handing to lower dev
  2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 1/5] net: " Joseph Gasparakis
@ 2012-12-08  0:14 ` Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 3/5] vxlan: capture inner headers during encapsulation Joseph Gasparakis
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Joseph Gasparakis @ 2012-12-08  0:14 UTC (permalink / raw)
  To: davem, shemminger, chrisw, gospo
  Cc: Alexander Duyck, netdev, linux-kernel, dmitry, saeed.bishara, bhutchings

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change allows a VXLAN device to enable Tx checksum offloading even on
lower devices that do not support encapsulated checksum offloads. The
advantage is that the lower device can change due to routing table changes
without impacting the features of the VXLAN device itself.
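
The effect of this change can be sketched in userspace as follows. The flag
values, struct mini_dev and needs_sw_csum() are hypothetical stand-ins chosen
for illustration; only the masking step and the software fallback condition
mirror the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the real NETIF_F_* values differ. */
#define F_SG        (1u << 0)
#define F_IP_CSUM   (1u << 1)
#define F_IPV6_CSUM (1u << 2)
#define F_HW_CSUM   (1u << 3)
#define F_ALL_CSUM  (F_IP_CSUM | F_IPV6_CSUM | F_HW_CSUM)

/* Hypothetical, simplified stand-in for struct net_device. */
struct mini_dev {
	uint32_t features;         /* offloads for plain packets */
	uint32_t hw_enc_features;  /* offloads for encapsulated packets */
};

/* Returns true when the stack must compute the checksum in software
 * before handing the packet to the device.
 */
static bool needs_sw_csum(const struct mini_dev *dev, bool encapsulation,
			  bool csum_partial)
{
	uint32_t features = dev->features;

	/* Encapsulated packets are tested against the encapsulation mask. */
	if (encapsulation)
		features &= dev->hw_enc_features;

	return csum_partial && !(features & F_ALL_CSUM);
}
```

A lower device with an empty hw_enc_features mask therefore still works: the
checksum is simply completed in software for encapsulated packets.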

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/core/dev.c       | 15 +++++++++++++--
 net/ipv4/ip_output.c |  4 ++++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 307142a..a4c4a1b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2324,6 +2324,13 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 			skb->vlan_tci = 0;
 		}
 
+		/* If this is an encapsulation offload request, make sure
+		 * we test the hardware encapsulation features instead of
+		 * the standard features of the netdev
+		 */
+		if (skb->encapsulation)
+			features &= dev->hw_enc_features;
+
 		if (netif_needs_gso(skb, features)) {
 			if (unlikely(dev_gso_segment(skb, features)))
 				goto out_kfree_skb;
@@ -2339,8 +2346,12 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 			 * checksumming here.
 			 */
 			if (skb->ip_summed == CHECKSUM_PARTIAL) {
-				skb_set_transport_header(skb,
-					skb_checksum_start_offset(skb));
+				if (skb->encapsulation)
+					skb_set_inner_transport_header(skb,
+						skb_checksum_start_offset(skb));
+				else
+					skb_set_transport_header(skb,
+						skb_checksum_start_offset(skb));
 				if (!(features & NETIF_F_ALL_CSUM) &&
 				     skb_checksum_help(skb))
 					goto out_kfree_skb;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 6537a40..3e98ed2 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -595,6 +595,10 @@ slow_path_clean:
 	}
 
 slow_path:
+	/* for offloaded checksums cleanup checksum before fragmentation */
+	if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
+		goto fail;
+
 	left = skb->len - hlen;		/* Space per frame */
 	ptr = hlen;		/* Where to start from */
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 3/5] vxlan: capture inner headers during encapsulation
  2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 1/5] net: " Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 2/5] net: Handle encapsulated offloads before fragmentation or handing to lower dev Joseph Gasparakis
@ 2012-12-08  0:14 ` Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 4/5] ixgbe: Adding tx encapsulation capability Joseph Gasparakis
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Joseph Gasparakis @ 2012-12-08  0:14 UTC (permalink / raw)
  To: davem, shemminger, chrisw, gospo
  Cc: Joseph Gasparakis, netdev, linux-kernel, dmitry, saeed.bishara,
	bhutchings, Peter P Waskiewicz Jr, Alexander Duyck

Allow VXLAN to make use of Tx checksum offloading and Tx scatter-gather.
The advantage of these two changes is that they also allow VXLAN to
make use of GSO.
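
The ordering that matters here is that the inner headers are captured before
the outer headers are prepended. A userspace sketch, assuming a simplified
stand-in for sk_buff (struct mini_skb and the mini_ functions are hypothetical
names, and the caller is assumed to have reserved enough headroom):

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff (illustration only). */
struct mini_skb {
	unsigned char *head;
	unsigned char *data;
	size_t network_header;          /* all header fields are offsets from head */
	size_t transport_header;
	size_t inner_network_header;
	size_t inner_transport_header;
	unsigned int encapsulation : 1;
};

/* Snapshot the current headers as the inner headers. */
static void mini_reset_inner_headers(struct mini_skb *skb)
{
	skb->inner_network_header = skb->network_header;
	skb->inner_transport_header = skb->transport_header;
}

/* Prepend an outer L4 header and an outer IP header, as a tunnel would. */
static void mini_encapsulate(struct mini_skb *skb, size_t outer_ip_len,
			     size_t outer_l4_len)
{
	/* Capture the inner headers only once, before they are overwritten. */
	if (!skb->encapsulation) {
		mini_reset_inner_headers(skb);
		skb->encapsulation = 1;
	}

	skb->data -= outer_l4_len;      /* comparable to skb_push() */
	skb->transport_header = (size_t)(skb->data - skb->head);
	skb->data -= outer_ip_len;
	skb->network_header = (size_t)(skb->data - skb->head);
}
```

After mini_encapsulate() returns, the outer fields describe the tunnel headers
while the inner fields still point at the original packet.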

Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/vxlan.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ce77b8b..88b31f2 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -876,6 +876,11 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto drop;
 	}
 
+	if (!skb->encapsulation) {
+		skb_reset_inner_headers(skb);
+		skb->encapsulation = 1;
+	}
+
 	/* Need space for new headers (invalidates iph ptr) */
 	if (skb_cow_head(skb, VXLAN_HEADROOM))
 		goto drop;
@@ -947,7 +952,8 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
 	vxlan_set_owner(dev, skb);
 
 	/* See iptunnel_xmit() */
-	skb->ip_summed = CHECKSUM_NONE;
+	if (skb->ip_summed != CHECKSUM_PARTIAL)
+		skb->ip_summed = CHECKSUM_NONE;
 	ip_select_ident(iph, &rt->dst, NULL);
 
 	err = ip_local_out(skb);
@@ -1168,6 +1174,8 @@ static void vxlan_setup(struct net_device *dev)
 	dev->tx_queue_len = 0;
 	dev->features	|= NETIF_F_LLTX;
 	dev->features	|= NETIF_F_NETNS_LOCAL;
+	dev->features	|= NETIF_F_SG | NETIF_F_HW_CSUM;
+	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM;
 	dev->priv_flags	&= ~IFF_XMIT_DST_RELEASE;
 
 	spin_lock_init(&vxlan->hash_lock);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 4/5] ixgbe: Adding tx encapsulation capability
  2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
                   ` (2 preceding siblings ...)
  2012-12-08  0:14 ` [PATCH v4 3/5] vxlan: capture inner headers during encapsulation Joseph Gasparakis
@ 2012-12-08  0:14 ` Joseph Gasparakis
  2012-12-08  0:14 ` [PATCH v4 5/5] vxlan: Add capability of Rx checksum offload for inner packet Joseph Gasparakis
  2012-12-09  5:21 ` [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation David Miller
  5 siblings, 0 replies; 12+ messages in thread
From: Joseph Gasparakis @ 2012-12-08  0:14 UTC (permalink / raw)
  To: davem, shemminger, chrisw, gospo
  Cc: Joseph Gasparakis, netdev, linux-kernel, dmitry, saeed.bishara,
	bhutchings, Alexander Duyck

This patch allows ixgbe to recognize encapsulated packets and do the tx
checksum offload in hardware. This patch is only for demonstration
purposes and should not be applied.
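
The version check in this patch reads the IP version from the top four bits
of the first header byte. A userspace sketch of that dispatch on raw bytes,
assuming the standard IPv4 and IPv6 header layouts (l4_protocol() is a
hypothetical helper, not part of ixgbe):

```c
#include <stdint.h>

/* Returns the L4 protocol number of the IP header at hdr, or -1 if the
 * version nibble is neither 4 nor 6. IPv6 extension headers are not walked.
 */
static int l4_protocol(const uint8_t *hdr)
{
	switch (hdr[0] >> 4) {          /* version is the top 4 bits of byte 0 */
	case 4:
		return hdr[9];          /* IPv4 Protocol field */
	case 6:
		return hdr[6];          /* IPv6 Next Header field */
	default:
		return -1;
	}
}
```

The same function works for an outer or an inner header, since it depends
only on the pointer it is given.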

Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 46 +++++++++++++++++++++------
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index fb165b6..62a7d6e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -5972,17 +5972,42 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 			if (!(first->tx_flags & IXGBE_TX_FLAGS_TXSW))
 				return;
 		}
+		vlan_macip_lens |= skb_network_offset(skb)
+				   << IXGBE_ADVTXD_MACLEN_SHIFT;
 	} else {
 		u8 l4_hdr = 0;
-		switch (first->protocol) {
-		case __constant_htons(ETH_P_IP):
-			vlan_macip_lens |= skb_network_header_len(skb);
+		union {
+			struct iphdr *ipv4;
+			struct ipv6hdr *ipv6;
+			u8 *raw;
+		} network_hdr;
+		union {
+			struct tcphdr *tcphdr;
+			u8 *raw;
+		} transport_hdr;
+
+		if (skb->encapsulation) {
+			network_hdr.raw = skb_inner_network_header(skb);
+			transport_hdr.raw = skb_inner_transport_header(skb);
+			vlan_macip_lens |= skb_inner_network_offset(skb) <<
+					   IXGBE_ADVTXD_MACLEN_SHIFT;
+		} else {
+			network_hdr.raw = skb_network_header(skb);
+			transport_hdr.raw = skb_transport_header(skb);
+			vlan_macip_lens |= skb_network_offset(skb) <<
+					   IXGBE_ADVTXD_MACLEN_SHIFT;
+		}
+
+		/* use first 4 bits to determine IP version */
+		switch (network_hdr.ipv4->version) {
+		case 4:
+			vlan_macip_lens |= transport_hdr.raw - network_hdr.raw;
 			type_tucmd |= IXGBE_ADVTXD_TUCMD_IPV4;
-			l4_hdr = ip_hdr(skb)->protocol;
+			l4_hdr = network_hdr.ipv4->protocol;
 			break;
-		case __constant_htons(ETH_P_IPV6):
-			vlan_macip_lens |= skb_network_header_len(skb);
-			l4_hdr = ipv6_hdr(skb)->nexthdr;
+		case 6:
+			vlan_macip_lens |= transport_hdr.raw - network_hdr.raw;
+			l4_hdr = network_hdr.ipv6->nexthdr;
 			break;
 		default:
 			if (unlikely(net_ratelimit())) {
@@ -5996,7 +6021,7 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 		switch (l4_hdr) {
 		case IPPROTO_TCP:
 			type_tucmd |= IXGBE_ADVTXD_TUCMD_L4T_TCP;
-			mss_l4len_idx = tcp_hdrlen(skb) <<
+			mss_l4len_idx = (transport_hdr.tcphdr->doff * 4) <<
 					IXGBE_ADVTXD_L4LEN_SHIFT;
 			break;
 		case IPPROTO_SCTP:
@@ -6022,7 +6047,6 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 	}
 
 	/* vlan_macip_lens: MACLEN, VLAN tag */
-	vlan_macip_lens |= skb_network_offset(skb) << IXGBE_ADVTXD_MACLEN_SHIFT;
 	vlan_macip_lens |= first->tx_flags & IXGBE_TX_FLAGS_VLAN_MASK;
 
 	ixgbe_tx_ctxtdesc(tx_ring, vlan_macip_lens, 0,
@@ -7383,6 +7407,10 @@ static int ixgbe_probe(struct pci_dev *pdev,
 
 	netdev->hw_features = netdev->features;
 
+	netdev->hw_enc_features = NETIF_F_IP_CSUM |
+				  NETIF_F_IPV6_CSUM |
+				  NETIF_F_SG;
+
 	switch (adapter->hw.mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 5/5] vxlan: Add capability of Rx checksum offload for inner packet
  2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
                   ` (3 preceding siblings ...)
  2012-12-08  0:14 ` [PATCH v4 4/5] ixgbe: Adding tx encapsulation capability Joseph Gasparakis
@ 2012-12-08  0:14 ` Joseph Gasparakis
  2012-12-09  5:21 ` [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation David Miller
  5 siblings, 0 replies; 12+ messages in thread
From: Joseph Gasparakis @ 2012-12-08  0:14 UTC (permalink / raw)
  To: davem, shemminger, chrisw, gospo
  Cc: Joseph Gasparakis, netdev, linux-kernel, dmitry, saeed.bishara,
	bhutchings

This patch adds the capability in vxlan to identify received inner packets
whose checksum was already verified and to signal this to the upper layers
of the stack. The NIC driver needs to set the skb->encapsulation bit
and also set skb->ip_summed to CHECKSUM_UNNECESSARY.
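
The receive-side decision can be sketched as a small pure function. The enum
and rx_inner_csum() are hypothetical names for illustration; the condition
follows the check added to vxlan_udp_encap_recv() in this patch:

```c
#include <stdbool.h>

/* Simplified stand-ins for the CHECKSUM_* states used on receive. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY };

/* Decide which checksum state the decapsulated packet carries upward.
 * The hardware result is trusted only if all three conditions hold.
 */
static enum csum_state rx_inner_csum(enum csum_state from_nic,
				     bool encapsulation, bool rxcsum_enabled)
{
	if (from_nic == CSUM_UNNECESSARY && encapsulation && rxcsum_enabled)
		return CSUM_UNNECESSARY;

	/* Otherwise force the upper layers to verify the checksum. */
	return CSUM_NONE;
}
```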

Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
---
 drivers/net/vxlan.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 88b31f2..3b3fdf6 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -607,7 +607,17 @@ static int vxlan_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
 
 	__skb_tunnel_rx(skb, vxlan->dev);
 	skb_reset_network_header(skb);
-	skb->ip_summed = CHECKSUM_NONE;
+
+	/* If the NIC driver gave us an encapsulated packet with
+	 * CHECKSUM_UNNECESSARY and Rx checksum feature is enabled,
+	 * leave the CHECKSUM_UNNECESSARY, the device checksummed it
+	 * for us. Otherwise force the upper layers to verify it.
+	 */
+	if (skb->ip_summed != CHECKSUM_UNNECESSARY || !skb->encapsulation ||
+	    !(vxlan->dev->features & NETIF_F_RXCSUM))
+		skb->ip_summed = CHECKSUM_NONE;
+
+	skb->encapsulation = 0;
 
 	err = IP_ECN_decapsulate(oip, skb);
 	if (unlikely(err)) {
@@ -1175,7 +1185,9 @@ static void vxlan_setup(struct net_device *dev)
 	dev->features	|= NETIF_F_LLTX;
 	dev->features	|= NETIF_F_NETNS_LOCAL;
 	dev->features	|= NETIF_F_SG | NETIF_F_HW_CSUM;
-	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM;
+	dev->features   |= NETIF_F_RXCSUM;
+
+	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
 	dev->priv_flags	&= ~IFF_XMIT_DST_RELEASE;
 
 	spin_lock_init(&vxlan->hash_lock);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation
  2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
                   ` (4 preceding siblings ...)
  2012-12-08  0:14 ` [PATCH v4 5/5] vxlan: Add capability of Rx checksum offload for inner packet Joseph Gasparakis
@ 2012-12-09  5:21 ` David Miller
  5 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2012-12-09  5:21 UTC (permalink / raw)
  To: joseph.gasparakis
  Cc: shemminger, chrisw, gospo, netdev, linux-kernel, dmitry,
	saeed.bishara, bhutchings

From: Joseph Gasparakis <joseph.gasparakis@intel.com>
Date: Fri,  7 Dec 2012 16:14:13 -0800

> The series contains updates to add in the NIC Rx and Tx checksumming
> support for encapsulated packets.

Ok, all applied (except the ixgbe patch, of course), thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
  2012-12-08  0:14 ` [PATCH v4 1/5] net: " Joseph Gasparakis
@ 2012-12-10 10:04   ` saeed bishara
  2012-12-10 16:22     ` Alexander Duyck
  2012-12-10 19:58     ` Dmitry Kravkov
  0 siblings, 2 replies; 12+ messages in thread
From: saeed bishara @ 2012-12-10 10:04 UTC (permalink / raw)
  To: Joseph Gasparakis
  Cc: davem, shemminger, chrisw, gospo, netdev, linux-kernel, dmitry,
	bhutchings, Peter P Waskiewicz Jr, Alexander Duyck

> +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
> +{
> +       return (struct iphdr *)skb_inner_network_header(skb);
> +}

Hi,
I'm a little bit bothered by those inner_ functions. What
about the following approach:
1. the skb will have a new state; that state can be outer (normal
mode) or inner.
2. when you change the state to inner, all the helper functions such
as ip_hdr will return the inner header.

That's of course the API side. The implementation may still use the
fields you added to the skb.

What do you think?
saeed

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
  2012-12-10 10:04   ` saeed bishara
@ 2012-12-10 16:22     ` Alexander Duyck
  2012-12-10 19:58     ` Dmitry Kravkov
  1 sibling, 0 replies; 12+ messages in thread
From: Alexander Duyck @ 2012-12-10 16:22 UTC (permalink / raw)
  To: saeed bishara
  Cc: Joseph Gasparakis, davem, shemminger, chrisw, gospo, netdev,
	linux-kernel, dmitry, bhutchings, Peter P Waskiewicz Jr,
	Alexander Duyck

On 12/10/2012 02:04 AM, saeed bishara wrote:
>> +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
>> +{
>> +       return (struct iphdr *)skb_inner_network_header(skb);
>> +}
> Hi,
> I'm a little bit bothered by those inner_ functions. What
> about the following approach:
> 1. the skb will have a new state; that state can be outer (normal
> mode) or inner.
> 2. when you change the state to inner, all the helper functions such
> as ip_hdr will return the inner header.
>
> That's of course the API side. The implementation may still use the
> fields you added to the skb.
>
> What do you think?
> saeed

What you describe isn't too far off from what we are doing.  However we
need to store both the inner and the outer headers.  All these inner_
functions are meant to do is assist drivers to access the inner headers
in the case that skb->encapsulation is set.  We wanted to avoid
abstracting it too much since it is possible in the future that both
inner and outer network headers may be needed if for instance you were
to place a tunnelled frame inside of a VLAN with hardware tag insertion.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
  2012-12-10 10:04   ` saeed bishara
  2012-12-10 16:22     ` Alexander Duyck
@ 2012-12-10 19:58     ` Dmitry Kravkov
  2012-12-11  8:11       ` saeed bishara
  1 sibling, 1 reply; 12+ messages in thread
From: Dmitry Kravkov @ 2012-12-10 19:58 UTC (permalink / raw)
  To: saeed bishara, Joseph Gasparakis
  Cc: davem, shemminger, chrisw, gospo, netdev, linux-kernel,
	bhutchings, Peter P Waskiewicz Jr, Alexander Duyck

> -----Original Message-----
> From: saeed bishara [mailto:saeed.bishara@gmail.com]
> Sent: Monday, December 10, 2012 12:04 PM
> To: Joseph Gasparakis
> Cc: davem@davemloft.net; shemminger@vyatta.com; chrisw@sous-sol.org;
> gospo@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> Dmitry Kravkov; bhutchings@solarflare.com; Peter P Waskiewicz Jr; Alexander
> Duyck
> Subject: Re: [PATCH v4 1/5] net: Add support for hardware-offloaded
> encapsulation
> 
> > +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
> > +{
> > +       return (struct iphdr *)skb_inner_network_header(skb);
> > +}
> 
> Hi,
> I'm a little bit bothered by those inner_ functions. What
> about the following approach:
> 1. the skb will have a new state; that state can be outer (normal
> mode) or inner.
> 2. when you change the state to inner, all the helper functions such
> as ip_hdr will return the inner header.
> 
> That's of course the API side. The implementation may still use the
> fields you added to the skb.
> 
> What do you think?
> saeed

Some drivers will probably need both inner_ and outer_ in the same flow; switching between the two states will consume CPU cycles.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
  2012-12-10 19:58     ` Dmitry Kravkov
@ 2012-12-11  8:11       ` saeed bishara
  2012-12-11 16:46         ` Alexander Duyck
  0 siblings, 1 reply; 12+ messages in thread
From: saeed bishara @ 2012-12-11  8:11 UTC (permalink / raw)
  To: Dmitry Kravkov
  Cc: Joseph Gasparakis, davem, shemminger, chrisw, gospo, netdev,
	linux-kernel, bhutchings, Peter P Waskiewicz Jr, Alexander Duyck

On Mon, Dec 10, 2012 at 9:58 PM, Dmitry Kravkov <dmitry@broadcom.com> wrote:
>> -----Original Message-----
>> From: saeed bishara [mailto:saeed.bishara@gmail.com]
>> Sent: Monday, December 10, 2012 12:04 PM
>> To: Joseph Gasparakis
>> Cc: davem@davemloft.net; shemminger@vyatta.com; chrisw@sous-sol.org;
>> gospo@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
>> Dmitry Kravkov; bhutchings@solarflare.com; Peter P Waskiewicz Jr; Alexander
>> Duyck
>> Subject: Re: [PATCH v4 1/5] net: Add support for hardware-offloaded
>> encapsulation
>>
>> > +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
>> > +{
>> > +       return (struct iphdr *)skb_inner_network_header(skb);
>> > +}
>>
>> Hi,
>> I'm a little bit bothered because of those inner_ functions, what
>> about the following approach:
>> 1. the skb will have a new state, that state can be outer (normal
>> mode) and inner.
>> 2. when you change the state to inner, all the helper functions such
>> as ip_hdr will return the inner header.
>>
>> that's of course the API side. the implementation may still use the
>> fields you added to the skb.
>>
>> what do you think?
>> saeed
>
> Some drivers will probably need both inner_ and outer_ in the same flow;
> switching between the two states will consume CPU cycles.

From a performance perspective, I'm not sure the switching is worse; it
may even be better, as it reduces code size. Please have a look at patch
2/5: with switching you can avoid the following change, which means less
code and fewer if-else branches.
-                               skb_set_transport_header(skb,
-                                       skb_checksum_start_offset(skb));
+                               if (skb->encapsulation)
+                                       skb_set_inner_transport_header(skb,
+                                               skb_checksum_start_offset(skb));
+                               else
+                                       skb_set_transport_header(skb,
+                                               skb_checksum_start_offset(skb));
                                if (!(features & NETIF_F_ALL_CSUM) &&

I also think that, from a (stack) maintenance perspective, less code is better.
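
For illustration only, the two API shapes being compared in this thread
can be sketched in userspace C. Everything named toy_* below is
hypothetical: it is not part of the patch series or of the kernel, and
struct toy_skb models only the two network-header offsets. The sketch
assumes that a "state" would be a per-skb flag consulted by a single
accessor, which is one possible reading of the proposal and not a
confirmed design.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff (hypothetical, offsets only). */
struct toy_skb {
	unsigned char *head;
	size_t network_header;       /* offset of the outer network header */
	size_t inner_network_header; /* offset of the inner network header */
	int inner_mode;              /* hypothetical state flag: 0 = outer, 1 = inner */
};

/* Shape A: separate accessors, in the spirit of the inner_ helpers. */
static inline unsigned char *toy_network_header(const struct toy_skb *skb)
{
	return skb->head + skb->network_header;
}

static inline unsigned char *toy_inner_network_header(const struct toy_skb *skb)
{
	return skb->head + skb->inner_network_header;
}

/* Shape B: a single accessor whose result depends on the skb state. */
static inline unsigned char *toy_mode_network_header(const struct toy_skb *skb)
{
	return skb->head + (skb->inner_mode ? skb->inner_network_header
					    : skb->network_header);
}

/* A caller that needs both headers at once, using shape A: no state change. */
static inline ptrdiff_t toy_gap_explicit(const struct toy_skb *skb)
{
	return toy_inner_network_header(skb) - toy_network_header(skb);
}

/* The same caller using shape B: it has to flip and restore the state. */
static inline ptrdiff_t toy_gap_switched(struct toy_skb *skb)
{
	int saved = skb->inner_mode;
	unsigned char *outer, *inner;

	skb->inner_mode = 0;
	outer = toy_mode_network_header(skb);
	skb->inner_mode = 1;
	inner = toy_mode_network_header(skb);
	skb->inner_mode = saved;

	return inner - outer;
}

/* Small self-check: both shapes must agree on the header distance. */
static inline void toy_selfcheck(void)
{
	unsigned char buf[128];
	struct toy_skb skb = { buf, 14, 64, 0 };

	assert(toy_gap_explicit(&skb) == toy_gap_switched(&skb));
	assert(skb.inner_mode == 0); /* state restored after switching */
}
```

Both shapes return the same pointers. The difference is only in how a
caller that needs the inner and the outer header in the same flow has to
be written, which is the trade-off being debated here.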

* Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
  2012-12-11  8:11       ` saeed bishara
@ 2012-12-11 16:46         ` Alexander Duyck
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Duyck @ 2012-12-11 16:46 UTC (permalink / raw)
  To: saeed bishara
  Cc: Dmitry Kravkov, Joseph Gasparakis, davem, shemminger, chrisw,
	gospo, netdev, linux-kernel, bhutchings, Peter P Waskiewicz Jr

On 12/11/2012 12:11 AM, saeed bishara wrote:
> On Mon, Dec 10, 2012 at 9:58 PM, Dmitry Kravkov <dmitry@broadcom.com> wrote:
>>> -----Original Message-----
>>> From: saeed bishara [mailto:saeed.bishara@gmail.com]
>>> Sent: Monday, December 10, 2012 12:04 PM
>>> To: Joseph Gasparakis
>>> Cc: davem@davemloft.net; shemminger@vyatta.com; chrisw@sous-sol.org;
>>> gospo@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
>>> Dmitry Kravkov; bhutchings@solarflare.com; Peter P Waskiewicz Jr; Alexander
>>> Duyck
>>> Subject: Re: [PATCH v4 1/5] net: Add support for hardware-offloaded
>>> encapsulation
>>>
>>>> +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
>>>> +{
>>>> +       return (struct iphdr *)skb_inner_network_header(skb);
>>>> +}
>>>
>>> Hi,
>>> I'm a little bit bothered because of those inner_ functions, what
>>> about the following approach:
>>> 1. the skb will have a new state, that state can be outer (normal
>>> mode) and inner.
>>> 2. when you change the state to inner, all the helper functions such
>>> as ip_hdr will return the inner header.
>>>
>>> that's of course the API side. the implementation may still use the
>>> fields you added to the skb.
>>>
>>> what do you think?
>>> saeed
>>
>> Some drivers will probably need both inner_ and outer_ in the same flow;
>> switching between the two states will consume CPU cycles.
>
> From a performance perspective, I'm not sure the switching is worse; it
> may even be better, as it reduces code size. Please have a look at patch
> 2/5: with switching you can avoid the following change, which means less
> code and fewer if-else branches.
> -                               skb_set_transport_header(skb,
> -                                       skb_checksum_start_offset(skb));
> +                               if (skb->encapsulation)
> +                                       skb_set_inner_transport_header(skb,
> +                                               skb_checksum_start_offset(skb));
> +                               else
> +                                       skb_set_transport_header(skb,
> +                                               skb_checksum_start_offset(skb));
>                                 if (!(features & NETIF_F_ALL_CSUM) &&
> 
> I also think that, from a (stack) maintenance perspective, less code is better.

I don't think your argument is making much sense.  With the approach we
took the switching only needs to take place in the offloaded path.  If
we were to put the switching in place generically we would end up with
the code scattered all throughout the stack.  In addition we will need
both the inner and outer headers to be captured in the case of an
encapsulated offload because the stack will need access to the outer
headers for routing.

My advice is if you have an idea then please just code it up, test it,
and submit a patch so that we can see what you are talking about.  My
concern is that you are suggesting we come up with a generic network and
transport offset that I don't believe has been completely thought through.

Thanks,

Alex


Thread overview: 12+ messages
2012-12-08  0:14 [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation Joseph Gasparakis
2012-12-08  0:14 ` [PATCH v4 1/5] net: " Joseph Gasparakis
2012-12-10 10:04   ` saeed bishara
2012-12-10 16:22     ` Alexander Duyck
2012-12-10 19:58     ` Dmitry Kravkov
2012-12-11  8:11       ` saeed bishara
2012-12-11 16:46         ` Alexander Duyck
2012-12-08  0:14 ` [PATCH v4 2/5] net: Handle encapsulated offloads before fragmentation or handing to lower dev Joseph Gasparakis
2012-12-08  0:14 ` [PATCH v4 3/5] vxlan: capture inner headers during encapsulation Joseph Gasparakis
2012-12-08  0:14 ` [PATCH v4 4/5] ixgbe: Adding tx encapsulation capability Joseph Gasparakis
2012-12-08  0:14 ` [PATCH v4 5/5] vxlan: Add capability of Rx checksum offload for inner packet Joseph Gasparakis
2012-12-09  5:21 ` [PATCH v4 net-next 0/5] tunneling: Add support for hardware-offloaded encapsulation David Miller