netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] Small Modifications to GSO to allow segmentation of MPLS (repost)
@ 2013-04-23  2:19 Simon Horman
       [not found] ` <1366683556-4956-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-04-23  2:19 UTC (permalink / raw)
  To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Alexander Duyck, Eric Dumazet, Maciej Żenczykowski,
	Peter P Waskiewicz Jr, Joseph Gasparakis

[ Reposting with net-next in subject prefix,
  sorry for forgetting that the first time around ]

This series consists of two small (at least in terms of numbers of lines
of code changes) that allow support of GSO of non-MPLS GSO skbs that
are changed into MPLS GSO skbs via Open vSwtich and its push MPLS action.

A description of each of the two changes is provided in the changelog
of the two patches that comprise this series.

GSO of MPLS skbs may be achieved by (Open vSwtich) setting the following:

1. skb's mac_header should point to the beginning of the L2 header
   (no surprise!)
2. skb's mac_len should be the length of the L2 header plus
   the length of the MPLS stack. In other words, the MPLS stack
   is considered to be part of the l2 header.
3. skb's natwork_header should be set to just after the end of
   the MPLS stack. Or in other words the begining of the L3 header.
   This is consistent with 2.
4. skb's protocol should be left as the skb's original (non-MPLS) protocol.
   This is used by GSO to determine which callback to use to segment
   the (MPLS encapsulated) skb data.
5. The protocol in the skb's mac header should be set to the new
   (MPLS) protocol. This is what will appear on the wire.
6. skb's encapsulation_features should be set.
   This is introduced and explained in more detail in the first patch of the
   series.

I have posted a patch, "[PATCH v2.26] datapath: Add basic MPLS support to
kernel" which adds MPLS support to the kernel datapath of Open vSwtich.
That patch sets the above requirements and was used to exercise the MPLS
GSO code. The datapath patch is against the Open vSwtich tree but it is
intended that it be added to the Open vSwtich code present in the mainline
Linux kernel at some point.

Simon Horman (2):
  net: More fine-grained support for encapsulated GSO features
  net: Loosen constraints for recalculating checksum in skb_segment()

 include/linux/skbuff.h | 9 ++++++++-
 net/core/dev.c         | 2 +-
 net/core/skbuff.c      | 4 +++-
 net/ipv4/gre.c         | 1 +
 net/ipv4/ipip.c        | 1 +
 5 files changed, 14 insertions(+), 3 deletions(-)

-- 
1.8.2.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
       [not found] ` <1366683556-4956-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2013-04-23  2:19   ` Simon Horman
       [not found]     ` <1366683556-4956-2-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  2013-04-23  2:19   ` [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment() Simon Horman
  1 sibling, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-04-23  2:19 UTC (permalink / raw)
  To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Alexander Duyck, Eric Dumazet, Maciej Żenczykowski,
	Peter P Waskiewicz Jr, Joseph Gasparakis

"net: Add support for hardware-offloaded encapsulation" introduced
the encapsulation field of struct skb, which when set provides hints
that GSO should handle an skb that encapsulates a packet.

This patch adds an encapsulation_features field which provides
a hint to dev dev_hard_start_xmit() that harware-offload encapsulation
features should be used. Previously this hint was provided by the
encapsulation field.

The other uses of the encapsulation field are left unchanged.

The two in-tree locations that set the encapsulation have been updated to
also set encapsulation_field.

The motivation for this is to provide support segmentation of GSO MPLS skbs.
This may be necessary when a non-MPLS GSO skb is turned into an MPLS GSO
skb by Open vSwtich and its MPLS push action.

In this case it  harware-offload encapsulation features should be used,
actually to be more exact software segmentation should be selected, but
other hints provided by the encapsulation field are not applicable.

Cc: Joseph Gasparakis <joseph.gasparakis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Alexander Duyck <alexander.h.duyck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
 include/linux/skbuff.h | 9 ++++++++-
 net/core/dev.c         | 2 +-
 net/core/skbuff.c      | 1 +
 net/ipv4/gre.c         | 1 +
 net/ipv4/ipip.c        | 1 +
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2e0ced1..d9ec1de 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -494,7 +494,14 @@ struct sk_buff {
 	 * headers if needed
 	 */
 	__u8			encapsulation:1;
-	/* 7/9 bit hole (depending on ndisc_nodetype presence) */
+	/* Encapsulation protocol and NIC drivers should use
+	 * this flag to indicate to each other if the skb contains
+	 * encapsulated packet and GSO should use encapsulation features
+	 * instead of standard features for the netdev. This is typically
+	 * a subset of cases where skb->encapsulation is set.
+	 */
+	__u8			encapsulation_features:1;
+	/* 6/8 bit hole (depending on ndisc_nodetype presence) */
 	kmemcheck_bitfield_end(flags2);
 
 #ifdef CONFIG_NET_DMA
diff --git a/net/core/dev.c b/net/core/dev.c
index 9e26b8d..53236c5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2498,7 +2498,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		 * hardware encapsulation features instead of standard
 		 * features for the netdev
 		 */
-		if (skb->encapsulation)
+		if (skb->encapsulation_features)
 			features &= dev->hw_enc_features;
 
 		if (netif_needs_gso(skb, features)) {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 898cf5c..f23d136 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -708,6 +708,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->l4_rxhash		= old->l4_rxhash;
 	new->no_fcs		= old->no_fcs;
 	new->encapsulation	= old->encapsulation;
+	new->encapsulation_features = old->encapsulation_features;
 #ifdef CONFIG_XFRM
 	new->sp			= secpath_get(old->sp);
 #endif
diff --git a/net/ipv4/gre.c b/net/ipv4/gre.c
index 0ae998b..8420f29 100644
--- a/net/ipv4/gre.c
+++ b/net/ipv4/gre.c
@@ -157,6 +157,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 	}
 
 	skb->encapsulation = 0;
+	skb->encapsulation_features = 0;
 
 	if (unlikely(!pskb_may_pull(skb, ghl)))
 		goto out;
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 77bfcce..a6db3c0 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -220,6 +220,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (likely(!skb->encapsulation)) {
 		skb_reset_inner_headers(skb);
 		skb->encapsulation = 1;
+		skb->encapsulation_features = 1;
 	}
 
 	ip_tunnel_xmit(skb, dev, tiph);
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment()
       [not found] ` <1366683556-4956-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  2013-04-23  2:19   ` [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features Simon Horman
@ 2013-04-23  2:19   ` Simon Horman
  2013-04-23 23:43     ` Jesse Gross
  1 sibling, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-04-23  2:19 UTC (permalink / raw)
  To: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Alexander Duyck, Eric Dumazet, Maciej Żenczykowski,
	Peter P Waskiewicz Jr, Joseph Gasparakis

In the case where a non-MPLS GSO skb becomes an MPLS GSO skb, via
Open vSwitch's push MPLS action it is desirable to provide segmentation
in software. In this case the original protocol of the skb may have allowed
its checksumming to be offloaded but this may no longer be supported now
the skb is MPLS. Actually it seems to me that this is the likely case.

In order to allow the checksum to be updated in this case loosen
the rules for recalculating the checksum on in skb_segment().

N.B.: I must confess that I am a little unsure of the details of
the implementation of skb_checksum(). But I have observed that this
is necessary as skb_checksum() hits the following:

		if (!hsize && i >= nfrags) {
			...
			fskb = fskb->next;
			...
		}
		...
		if (fskb != skb_shinfo(skb)->frag_list)
			...

Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
 net/core/skbuff.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f23d136..81c9856 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2854,7 +2854,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
 						 doffset + tnl_hlen);
 
 		if (fskb != skb_shinfo(skb)->frag_list)
-			continue;
+			goto csum;
 
 		if (!sg) {
 			nskb->ip_summed = CHECKSUM_NONE;
@@ -2918,6 +2918,7 @@ skip_fraglist:
 		nskb->len += nskb->data_len;
 		nskb->truesize += nskb->data_len;
 
+csum:
 		if (!csum) {
 			nskb->csum = skb_checksum(nskb, doffset,
 						  nskb->len - doffset, 0);
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
       [not found]     ` <1366683556-4956-2-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2013-04-23 21:00       ` Joseph Gasparakis
  2013-04-25  7:36         ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Joseph Gasparakis @ 2013-04-23 21:00 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, Alexander Duyck,
	Peter P Waskiewicz Jr, Eric Dumazet, Maciej Żenczykowski,
	netdev-u79uwXL29TY76Z2rM5mHXA, Joseph Gasparakis



On Mon, 22 Apr 2013, Simon Horman wrote:

> "net: Add support for hardware-offloaded encapsulation" introduced
> the encapsulation field of struct skb, which when set provides hints
> that GSO should handle an skb that encapsulates a packet.
> 
> This patch adds an encapsulation_features field which provides
> a hint to dev dev_hard_start_xmit() that harware-offload encapsulation
> features should be used. Previously this hint was provided by the
> encapsulation field.
> 
> The other uses of the encapsulation field are left unchanged.
> 
> The two in-tree locations that set the encapsulation have been updated to
> also set encapsulation_field.
> 
> The motivation for this is to provide support segmentation of GSO MPLS skbs.
> This may be necessary when a non-MPLS GSO skb is turned into an MPLS GSO
> skb by Open vSwtich and its MPLS push action.
> 
> In this case it  harware-offload encapsulation features should be used,
> actually to be more exact software segmentation should be selected, but
> other hints provided by the encapsulation field are not applicable.
> 
> Cc: Joseph Gasparakis <joseph.gasparakis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: Alexander Duyck <alexander.h.duyck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
> ---
>  include/linux/skbuff.h | 9 ++++++++-
>  net/core/dev.c         | 2 +-
>  net/core/skbuff.c      | 1 +
>  net/ipv4/gre.c         | 1 +
>  net/ipv4/ipip.c        | 1 +
>  5 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 2e0ced1..d9ec1de 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -494,7 +494,14 @@ struct sk_buff {
>  	 * headers if needed
>  	 */
>  	__u8			encapsulation:1;
> -	/* 7/9 bit hole (depending on ndisc_nodetype presence) */
> +	/* Encapsulation protocol and NIC drivers should use
> +	 * this flag to indicate to each other if the skb contains
> +	 * encapsulated packet and GSO should use encapsulation features
> +	 * instead of standard features for the netdev. This is typically
> +	 * a subset of cases where skb->encapsulation is set.
> +	 */
> +	__u8			encapsulation_features:1;
> +	/* 6/8 bit hole (depending on ndisc_nodetype presence) */
>  	kmemcheck_bitfield_end(flags2);
>  
>  #ifdef CONFIG_NET_DMA
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9e26b8d..53236c5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2498,7 +2498,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  		 * hardware encapsulation features instead of standard
>  		 * features for the netdev
>  		 */
> -		if (skb->encapsulation)
> +		if (skb->encapsulation_features)
>  			features &= dev->hw_enc_features;
>  
>  		if (netif_needs_gso(skb, features)) {
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 898cf5c..f23d136 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -708,6 +708,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
>  	new->l4_rxhash		= old->l4_rxhash;
>  	new->no_fcs		= old->no_fcs;
>  	new->encapsulation	= old->encapsulation;
> +	new->encapsulation_features = old->encapsulation_features;
>  #ifdef CONFIG_XFRM
>  	new->sp			= secpath_get(old->sp);
>  #endif
> diff --git a/net/ipv4/gre.c b/net/ipv4/gre.c
> index 0ae998b..8420f29 100644
> --- a/net/ipv4/gre.c
> +++ b/net/ipv4/gre.c
> @@ -157,6 +157,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
>  	}
>  
>  	skb->encapsulation = 0;
> +	skb->encapsulation_features = 0;
>  
>  	if (unlikely(!pskb_may_pull(skb, ghl)))
>  		goto out;
> diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
> index 77bfcce..a6db3c0 100644
> --- a/net/ipv4/ipip.c
> +++ b/net/ipv4/ipip.c
> @@ -220,6 +220,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
>  	if (likely(!skb->encapsulation)) {
>  		skb_reset_inner_headers(skb);
>  		skb->encapsulation = 1;
> +		skb->encapsulation_features = 1;
>  	}
>  
>  	ip_tunnel_xmit(skb, dev, tiph);
> 

Any particular reason to introduce skb->encapsulation_features instead of 
using the existing skb->encapsulation? Also I don't see it used in your 
second patch either.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment()
  2013-04-23  2:19   ` [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment() Simon Horman
@ 2013-04-23 23:43     ` Jesse Gross
  2013-04-25  7:26       ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-04-23 23:43 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, netdev, Jarno Rajahalme, Joseph Gasparakis,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Mon, Apr 22, 2013 at 7:19 PM, Simon Horman <horms@verge.net.au> wrote:
> In the case where a non-MPLS GSO skb becomes an MPLS GSO skb, via
> Open vSwitch's push MPLS action it is desirable to provide segmentation
> in software. In this case the original protocol of the skb may have allowed
> its checksumming to be offloaded but this may no longer be supported now
> the skb is MPLS. Actually it seems to me that this is the likely case.
>
> In order to allow the checksum to be updated in this case loosen
> the rules for recalculating the checksum on in skb_segment().
>
> N.B.: I must confess that I am a little unsure of the details of
> the implementation of skb_checksum(). But I have observed that this
> is necessary as skb_checksum() hits the following:
>
>                 if (!hsize && i >= nfrags) {
>                         ...
>                         fskb = fskb->next;
>                         ...
>                 }
>                 ...
>                 if (fskb != skb_shinfo(skb)->frag_list)
>                         ...
>
> Signed-off-by: Simon Horman <horms@verge.net.au>

I'm surprised at the need for changes here since neither vlans nor
tunneling protocols needed something similar. Do you know what is
special about MPLS that is triggering this?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment()
  2013-04-23 23:43     ` Jesse Gross
@ 2013-04-25  7:26       ` Simon Horman
  2013-04-29 20:01         ` Jesse Gross
  0 siblings, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-04-25  7:26 UTC (permalink / raw)
  To: Jesse Gross
  Cc: dev, netdev, Jarno Rajahalme, Joseph Gasparakis,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Tue, Apr 23, 2013 at 04:43:58PM -0700, Jesse Gross wrote:
> On Mon, Apr 22, 2013 at 7:19 PM, Simon Horman <horms@verge.net.au> wrote:
> > In the case where a non-MPLS GSO skb becomes an MPLS GSO skb, via
> > Open vSwitch's push MPLS action it is desirable to provide segmentation
> > in software. In this case the original protocol of the skb may have allowed
> > its checksumming to be offloaded but this may no longer be supported now
> > the skb is MPLS. Actually it seems to me that this is the likely case.
> >
> > In order to allow the checksum to be updated in this case loosen
> > the rules for recalculating the checksum on in skb_segment().
> >
> > N.B.: I must confess that I am a little unsure of the details of
> > the implementation of skb_checksum(). But I have observed that this
> > is necessary as skb_checksum() hits the following:
> >
> >                 if (!hsize && i >= nfrags) {
> >                         ...
> >                         fskb = fskb->next;
> >                         ...
> >                 }
> >                 ...
> >                 if (fskb != skb_shinfo(skb)->frag_list)
> >                         ...
> >
> > Signed-off-by: Simon Horman <horms@verge.net.au>
> 
> I'm surprised at the need for changes here since neither vlans nor
> tunneling protocols needed something similar. Do you know what is
> special about MPLS that is triggering this?

After some testing I believe that the answer for GRE is, much to my
surprise, that it doesn't work without this patch. At least not for the
test case I have been using.

To test GRE I set up a machine to receive IP packets and forward them over
a GRE (not TEB) tunnel created using the ip command. In that case I see
incorrect TCP checksums and then retransmissions when GSO segmentation
occurs.  A tcpdump is below.

With this patch applied the checksums are correct.

This seems to be the same behaviour I observed when using Open vSwtich to
output a GSO skb after it had been changed from IPv4/TCP to MPLS with an
IPv4/TCP payload using an MPLS PUSH action. Without this patch the TCP
checksums are incorrect, with it they are calculated correctly.


tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:10:54.502631 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 19650, offset 0, flags [DF], proto GRE (47), length 84)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 64
	(tos 0x0, ttl 63, id 19650, offset 0, flags [DF], proto TCP (6), length 60)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [S], cksum 0x7cb8 (correct), seq 1709735614, win 14560, options [mss 1456,sackOK,TS val 68891260 ecr 0,nop,wscale 7], length 0
16:10:54.502882 00:23:7d:e8:dc:8c > 00:23:7d:e8:5d:a0, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto GRE (47), length 84)
    10.3.3.153 > 10.3.3.149: GREv0, Flags [none], proto IPv4 (0x0800), length 64
	(tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    10.0.98.153.10000 > 10.3.3.135.54778: Flags [S.], cksum 0x2e77 (correct), seq 3486876063, ack 1709735615, win 14240, options [mss 1436,sackOK,TS val 1310203 ecr 68891260,nop,wscale 7], length 0
16:10:54.503038 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 19651, offset 0, flags [DF], proto GRE (47), length 76)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 56
	(tos 0x0, ttl 63, id 19651, offset 0, flags [DF], proto TCP (6), length 52)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [.], cksum 0x9459 (correct), seq 1, ack 1, win 114, options [nop,nop,TS val 68891260 ecr 1310203], length 0
16:10:54.503181 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 63, id 19652, offset 0, flags [DF], proto GRE (47), length 1500)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 1480
	(tos 0x0, ttl 63, id 19652, offset 0, flags [DF], proto TCP (6), length 1476)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [.], cksum 0x7fd9 (incorrect -> 0x8ec9), seq 1:1425, ack 1, win 114, options [nop,nop,TS val 68891260 ecr 1310203], length 1424
16:10:54.503277 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 714: (tos 0x0, ttl 63, id 19653, offset 0, flags [DF], proto GRE (47), length 700)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 680
	(tos 0x0, ttl 63, id 19653, offset 0, flags [DF], proto TCP (6), length 676)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [P.], cksum 0x7cb9 (incorrect -> 0x8c51), seq 1425:2049, ack 1, win 114, options [nop,nop,TS val 68891260 ecr 1310203], length 624
16:10:54.503286 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 19654, offset 0, flags [DF], proto GRE (47), length 76)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 56
	(tos 0x0, ttl 63, id 19654, offset 0, flags [DF], proto TCP (6), length 52)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [F.], cksum 0x8c58 (correct), seq 2049, ack 1, win 114, options [nop,nop,TS val 68891260 ecr 1310203], length 0
16:10:54.503414 00:23:7d:e8:dc:8c > 00:23:7d:e8:5d:a0, ethertype IPv4 (0x0800), length 102: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto GRE (47), length 88)
    10.3.3.153 > 10.3.3.149: GREv0, Flags [none], proto IPv4 (0x0800), length 68
	(tos 0x0, ttl 64, id 31835, offset 0, flags [DF], proto TCP (6), length 64)
    10.0.98.153.10000 > 10.3.3.135.54778: Flags [.], cksum 0x84f3 (correct), seq 1, ack 1, win 112, options [nop,nop,TS val 1310203 ecr 68891260,nop,nop,sack 1 {2049:2050}], length 0
16:10:54.704485 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 63, id 19655, offset 0, flags [DF], proto GRE (47), length 1500)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 1480
	(tos 0x0, ttl 63, id 19655, offset 0, flags [DF], proto TCP (6), length 1476)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [.], cksum 0x8e96 (correct), seq 1:1425, ack 1, win 114, options [nop,nop,TS val 68891311 ecr 1310203], length 1424
16:10:54.704666 00:23:7d:e8:dc:8c > 00:23:7d:e8:5d:a0, ethertype IPv4 (0x0800), length 102: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto GRE (47), length 88)
    10.3.3.153 > 10.3.3.149: GREv0, Flags [none], proto IPv4 (0x0800), length 68
	(tos 0x0, ttl 64, id 31836, offset 0, flags [DF], proto TCP (6), length 64)
    10.0.98.153.10000 > 10.3.3.135.54778: Flags [.], cksum 0x7ee8 (correct), seq 1, ack 1425, win 134, options [nop,nop,TS val 1310253 ecr 68891311,nop,nop,sack 1 {2049:2050}], length 0
16:10:54.704817 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 714: (tos 0x0, ttl 63, id 19656, offset 0, flags [DF], proto GRE (47), length 700)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 680
	(tos 0x0, ttl 63, id 19656, offset 0, flags [DF], proto TCP (6), length 676)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [P.], cksum 0x8bec (correct), seq 1425:2049, ack 1, win 114, options [nop,nop,TS val 68891311 ecr 1310253], length 624
16:10:54.704961 00:23:7d:e8:dc:8c > 00:23:7d:e8:5d:a0, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto GRE (47), length 76)
    10.3.3.153 > 10.3.3.149: GREv0, Flags [none], proto IPv4 (0x0800), length 56
	(tos 0x0, ttl 64, id 31837, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.98.153.10000 > 10.3.3.135.54778: Flags [.], cksum 0x8bc9 (correct), seq 1, ack 2050, win 156, options [nop,nop,TS val 1310253 ecr 68891311], length 0
16:10:54.704990 00:23:7d:e8:dc:8c > 00:23:7d:e8:5d:a0, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto GRE (47), length 76)
    10.3.3.153 > 10.3.3.149: GREv0, Flags [none], proto IPv4 (0x0800), length 56
	(tos 0x0, ttl 64, id 31838, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.98.153.10000 > 10.3.3.135.54778: Flags [F.], cksum 0x8bc8 (correct), seq 1, ack 2050, win 156, options [nop,nop,TS val 1310253 ecr 68891311], length 0
16:10:54.705164 00:23:7d:e8:5d:a0 > 00:23:7d:e8:dc:8c, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 63, id 20581, offset 0, flags [DF], proto GRE (47), length 76)
    10.3.3.149 > 10.3.3.153: GREv0, Flags [none], proto IPv4 (0x0800), length 56
	(tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.3.3.135.54778 > 10.0.98.153.10000: Flags [.], cksum 0x8bf2 (correct), seq 2050, ack 2, win 114, options [nop,nop,TS val 68891311 ecr 1310253], length 0

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-04-23 21:00       ` Joseph Gasparakis
@ 2013-04-25  7:36         ` Simon Horman
  2013-04-25  8:03           ` Simon Horman
       [not found]           ` <20130425073644.GC7936-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  0 siblings, 2 replies; 21+ messages in thread
From: Simon Horman @ 2013-04-25  7:36 UTC (permalink / raw)
  To: Joseph Gasparakis
  Cc: dev, netdev, Jesse Gross, jarno.rajahalme, Peter P Waskiewicz Jr,
	Alexander Duyck, Eric Dumazet, Maciej Żenczykowski

On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> 
> 
> On Mon, 22 Apr 2013, Simon Horman wrote:
> 
> > "net: Add support for hardware-offloaded encapsulation" introduced
> > the encapsulation field of struct skb, which when set provides hints
> > that GSO should handle an skb that encapsulates a packet.
> > 
> > This patch adds an encapsulation_features field which provides
> > a hint to dev dev_hard_start_xmit() that harware-offload encapsulation
> > features should be used. Previously this hint was provided by the
> > encapsulation field.
> > 
> > The other uses of the encapsulation field are left unchanged.
> > 
> > The two in-tree locations that set the encapsulation have been updated to
> > also set encapsulation_field.
> > 
> > The motivation for this is to provide support segmentation of GSO MPLS skbs.
> > This may be necessary when a non-MPLS GSO skb is turned into an MPLS GSO
> > skb by Open vSwtich and its MPLS push action.
> > 
> > In this case it  harware-offload encapsulation features should be used,
> > actually to be more exact software segmentation should be selected, but
> > other hints provided by the encapsulation field are not applicable.
> > 
> > Cc: Joseph Gasparakis <joseph.gasparakis@intel.com>
> > Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> > Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> > Signed-off-by: Simon Horman <horms@verge.net.au>
> > ---
> >  include/linux/skbuff.h | 9 ++++++++-
> >  net/core/dev.c         | 2 +-
> >  net/core/skbuff.c      | 1 +
> >  net/ipv4/gre.c         | 1 +
> >  net/ipv4/ipip.c        | 1 +
> >  5 files changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 2e0ced1..d9ec1de 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -494,7 +494,14 @@ struct sk_buff {
> >  	 * headers if needed
> >  	 */
> >  	__u8			encapsulation:1;
> > -	/* 7/9 bit hole (depending on ndisc_nodetype presence) */
> > +	/* Encapsulation protocol and NIC drivers should use
> > +	 * this flag to indicate to each other if the skb contains
> > +	 * encapsulated packet and GSO should use encapsulation features
> > +	 * instead of standard features for the netdev. This is typically
> > +	 * a subset of cases where skb->encapsulation is set.
> > +	 */
> > +	__u8			encapsulation_features:1;
> > +	/* 6/8 bit hole (depending on ndisc_nodetype presence) */
> >  	kmemcheck_bitfield_end(flags2);
> >  
> >  #ifdef CONFIG_NET_DMA
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 9e26b8d..53236c5 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2498,7 +2498,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
> >  		 * hardware encapsulation features instead of standard
> >  		 * features for the netdev
> >  		 */
> > -		if (skb->encapsulation)
> > +		if (skb->encapsulation_features)
> >  			features &= dev->hw_enc_features;
> >  
> >  		if (netif_needs_gso(skb, features)) {
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 898cf5c..f23d136 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -708,6 +708,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
> >  	new->l4_rxhash		= old->l4_rxhash;
> >  	new->no_fcs		= old->no_fcs;
> >  	new->encapsulation	= old->encapsulation;
> > +	new->encapsulation_features = old->encapsulation_features;
> >  #ifdef CONFIG_XFRM
> >  	new->sp			= secpath_get(old->sp);
> >  #endif
> > diff --git a/net/ipv4/gre.c b/net/ipv4/gre.c
> > index 0ae998b..8420f29 100644
> > --- a/net/ipv4/gre.c
> > +++ b/net/ipv4/gre.c
> > @@ -157,6 +157,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
> >  	}
> >  
> >  	skb->encapsulation = 0;
> > +	skb->encapsulation_features = 0;
> >  
> >  	if (unlikely(!pskb_may_pull(skb, ghl)))
> >  		goto out;
> > diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
> > index 77bfcce..a6db3c0 100644
> > --- a/net/ipv4/ipip.c
> > +++ b/net/ipv4/ipip.c
> > @@ -220,6 +220,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	if (likely(!skb->encapsulation)) {
> >  		skb_reset_inner_headers(skb);
> >  		skb->encapsulation = 1;
> > +		skb->encapsulation_features = 1;
> >  	}
> >  
> >  	ip_tunnel_xmit(skb, dev, tiph);
> > 
> 
> Any particular reason to introduce skb->encapsulation_features instead of 
> using the existing skb->encapsulation? Also I don't see it used in your 
> second patch either.

My reasoning is that skb->encapsulation seems to alter the behaviour of
many different locations and I'm not sure that any of them, other than the
one in dev_hard_start_xmit() make sense for MPLS.

For example the following in inet_gso_segment()

	tunnel = !!skb->encapsulation;
	...
	do {
		...
		if (!tunnel && proto == IPPROTO_UDP) {
			iph->id = htons(id);
			iph->frag_off = htons(offset >> 3);
			if (skb->next != NULL)
				iph->frag_off |= htons(IP_MF);
			offset += (skb->len - skb->mac_len - iph->ihl * 4);
		} else  {
			iph->id = htons(id++);
		}
		...

On reflection I need to examine the relevant code-paths more closely, but I
believe the !tunnel portion of the above code is intended to help effect
GSO segmentation UDP tunnelling protocols and is not relevant to MPLS.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-04-25  7:36         ` Simon Horman
@ 2013-04-25  8:03           ` Simon Horman
       [not found]           ` <20130425073644.GC7936-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  1 sibling, 0 replies; 21+ messages in thread
From: Simon Horman @ 2013-04-25  8:03 UTC (permalink / raw)
  To: Joseph Gasparakis
  Cc: dev, netdev, Jesse Gross, jarno.rajahalme, Peter P Waskiewicz Jr,
	Alexander Duyck, Eric Dumazet, Maciej Żenczykowski

On Thu, Apr 25, 2013 at 04:36:44PM +0900, Simon Horman wrote:
> On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> > 
> > 
> > On Mon, 22 Apr 2013, Simon Horman wrote:
> > 
> > > "net: Add support for hardware-offloaded encapsulation" introduced
> > > the encapsulation field of struct skb, which when set provides hints
> > > that GSO should handle an skb that encapsulates a packet.
> > > 
> > > This patch adds an encapsulation_features field which provides
> > > a hint to dev dev_hard_start_xmit() that harware-offload encapsulation
> > > features should be used. Previously this hint was provided by the
> > > encapsulation field.
> > > 
> > > The other uses of the encapsulation field are left unchanged.
> > > 
> > > The two in-tree locations that set the encapsulation have been updated to
> > > also set encapsulation_field.
> > > 
> > > The motivation for this is to provide support segmentation of GSO MPLS skbs.
> > > This may be necessary when a non-MPLS GSO skb is turned into an MPLS GSO
> > > skb by Open vSwtich and its MPLS push action.
> > > 
> > > In this case it  harware-offload encapsulation features should be used,
> > > actually to be more exact software segmentation should be selected, but
> > > other hints provided by the encapsulation field are not applicable.
> > > 
> > > Cc: Joseph Gasparakis <joseph.gasparakis@intel.com>
> > > Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> > > Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> > > Signed-off-by: Simon Horman <horms@verge.net.au>
> > > ---
> > >  include/linux/skbuff.h | 9 ++++++++-
> > >  net/core/dev.c         | 2 +-
> > >  net/core/skbuff.c      | 1 +
> > >  net/ipv4/gre.c         | 1 +
> > >  net/ipv4/ipip.c        | 1 +
> > >  5 files changed, 12 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > index 2e0ced1..d9ec1de 100644
> > > --- a/include/linux/skbuff.h
> > > +++ b/include/linux/skbuff.h
> > > @@ -494,7 +494,14 @@ struct sk_buff {
> > >  	 * headers if needed
> > >  	 */
> > >  	__u8			encapsulation:1;
> > > -	/* 7/9 bit hole (depending on ndisc_nodetype presence) */
> > > +	/* Encapsulation protocol and NIC drivers should use
> > > +	 * this flag to indicate to each other if the skb contains
> > > +	 * encapsulated packet and GSO should use encapsulation features
> > > +	 * instead of standard features for the netdev. This is typically
> > > +	 * a subset of cases where skb->encapsulation is set.
> > > +	 */
> > > +	__u8			encapsulation_features:1;
> > > +	/* 6/8 bit hole (depending on ndisc_nodetype presence) */
> > >  	kmemcheck_bitfield_end(flags2);
> > >  
> > >  #ifdef CONFIG_NET_DMA
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 9e26b8d..53236c5 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -2498,7 +2498,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
> > >  		 * hardware encapsulation features instead of standard
> > >  		 * features for the netdev
> > >  		 */
> > > -		if (skb->encapsulation)
> > > +		if (skb->encapsulation_features)
> > >  			features &= dev->hw_enc_features;
> > >  
> > >  		if (netif_needs_gso(skb, features)) {
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index 898cf5c..f23d136 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -708,6 +708,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
> > >  	new->l4_rxhash		= old->l4_rxhash;
> > >  	new->no_fcs		= old->no_fcs;
> > >  	new->encapsulation	= old->encapsulation;
> > > +	new->encapsulation_features = old->encapsulation_features;
> > >  #ifdef CONFIG_XFRM
> > >  	new->sp			= secpath_get(old->sp);
> > >  #endif
> > > diff --git a/net/ipv4/gre.c b/net/ipv4/gre.c
> > > index 0ae998b..8420f29 100644
> > > --- a/net/ipv4/gre.c
> > > +++ b/net/ipv4/gre.c
> > > @@ -157,6 +157,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
> > >  	}
> > >  
> > >  	skb->encapsulation = 0;
> > > +	skb->encapsulation_features = 0;
> > >  
> > >  	if (unlikely(!pskb_may_pull(skb, ghl)))
> > >  		goto out;
> > > diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
> > > index 77bfcce..a6db3c0 100644
> > > --- a/net/ipv4/ipip.c
> > > +++ b/net/ipv4/ipip.c
> > > @@ -220,6 +220,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
> > >  	if (likely(!skb->encapsulation)) {
> > >  		skb_reset_inner_headers(skb);
> > >  		skb->encapsulation = 1;
> > > +		skb->encapsulation_features = 1;
> > >  	}
> > >  
> > >  	ip_tunnel_xmit(skb, dev, tiph);
> > > 
> > 
> > Any particular reason to introduce skb->encapsulation_features instead of 
> > using the existing skb->encapsulation? Also I don't see it used in your 
> > second patch either.
> 
> My reasoning is that skb->encapsulation seems to alter the behaviour of
> many different locations and I'm not sure that any of them, other than the
> one in dev_hard_start_xmit() make sense for MPLS.
> 
> For example the following in inet_gso_segment()
> 
> 	tunnel = !!skb->encapsulation;
> 	...
> 	do {
> 		...
> 		if (!tunnel && proto == IPPROTO_UDP) {
> 			iph->id = htons(id);
> 			iph->frag_off = htons(offset >> 3);
> 			if (skb->next != NULL)
> 				iph->frag_off |= htons(IP_MF);
> 			offset += (skb->len - skb->mac_len - iph->ihl * 4);
> 		} else  {
> 			iph->id = htons(id++);
> 		}
> 		...
> 
> On reflection I need to examine the relevant code-paths more closely, but I
> believe the !tunnel portion of the above code is intended to help effect
> GSO segmentation UDP tunnelling protocols and is not relevant to MPLS.

I forgot to mention in my previous email that this change is used in the
patch "[PATCH v2.26] datapath: Add basic MPLS support to kernel".

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
       [not found]           ` <20130425073644.GC7936-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2013-04-26 23:03             ` Jesse Gross
  2013-04-30  3:21               ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-04-26 23:03 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski, netdev, Peter P Waskiewicz Jr,
	Joseph Gasparakis

On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
>> Any particular reason to introduce skb->encapsulation_features instead of
>> using the existing skb->encapsulation? Also I don't see it used in your
>> second patch either.
>
> My reasoning is that skb->encapsulation seems to alter the behaviour of
> many different locations and I'm not sure that any of them, other than the
> one in dev_hard_start_xmit() make sense for MPLS.

The problem is the meaning of skb->encapsulation isn't really defined
clearly and I'm certain that the current implementation is not going
to work in the future. Depending on your perspective, vlans, MPLS,
tunnels, etc. can all be considered forms of encapsulation but clearly
there are many NICs that have different capabilities across those. I
believe the intention here was really to describe L3 tunnels as
encapsulation, in which case MPLS really shouldn't be using this
mechanism at all.

Now there is some overlap, especially today since most currently
shipping silicon wasn't designed to support tunnels and so is using
some form of offset based offloads. In that case, all forms of
encapsulation are pretty similar. However, in the future that won't be
the case as support for specific protocols is implemented for higher
performance and richer support. When that happens, not only will MPLS
and tunnels have different capabilities but various forms tunnels
might as well.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment()
  2013-04-25  7:26       ` Simon Horman
@ 2013-04-29 20:01         ` Jesse Gross
  2013-04-30  3:17           ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-04-29 20:01 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, netdev, Jarno Rajahalme, Joseph Gasparakis,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Thu, Apr 25, 2013 at 12:26 AM, Simon Horman <horms@verge.net.au> wrote:
> On Tue, Apr 23, 2013 at 04:43:58PM -0700, Jesse Gross wrote:
>> On Mon, Apr 22, 2013 at 7:19 PM, Simon Horman <horms@verge.net.au> wrote:
>> > In the case where a non-MPLS GSO skb becomes an MPLS GSO skb, via
>> > Open vSwitch's push MPLS action it is desirable to provide segmentation
>> > in software. In this case the original protocol of the skb may have allowed
>> > its checksumming to be offloaded but this may no longer be supported now
>> > the skb is MPLS. Actually it seems to me that this is the likely case.
>> >
>> > In order to allow the checksum to be updated in this case loosen
>> > the rules for recalculating the checksum on in skb_segment().
>> >
>> > N.B.: I must confess that I am a little unsure of the details of
>> > the implementation of skb_checksum(). But I have observed that this
>> > is necessary as skb_checksum() hits the following:
>> >
>> >                 if (!hsize && i >= nfrags) {
>> >                         ...
>> >                         fskb = fskb->next;
>> >                         ...
>> >                 }
>> >                 ...
>> >                 if (fskb != skb_shinfo(skb)->frag_list)
>> >                         ...
>> >
>> > Signed-off-by: Simon Horman <horms@verge.net.au>
>>
>> I'm surprised at the need for changes here since neither vlans nor
>> tunneling protocols needed something similar. Do you know what is
>> special about MPLS that is triggering this?
>
> After some testing I believe that the answer for GRE is, much to my
> surprise, that it doesn't work without this patch. At least not for the
> test case I have been using.
>
> To test GRE I set up a machine to receive IP packets and forward them over
> a GRE (not TEB) tunnel created using the ip command. In that case I see
> incorrect TCP checksums and then retransmissions when GSO segmentation
> occurs.  A tcpdump is below.

Hmm, skb_segment() should be independent of protocol but I can see how
this is a case that wouldn't get exercised frequently and therefore
could have bugs. It seems like the case would be receiving a packet
that gets merged using GRO and then imposing some kind of header (but
not a single level of vlan since that is stored out of band). I think
probably it's necessary to do some more careful analysis to make sure
that this change is safe in all cases (or at least expand the commit
message since it's not immediately obvious at the moment).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment()
  2013-04-29 20:01         ` Jesse Gross
@ 2013-04-30  3:17           ` Simon Horman
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Horman @ 2013-04-30  3:17 UTC (permalink / raw)
  To: Jesse Gross
  Cc: dev, netdev, Jarno Rajahalme, Joseph Gasparakis,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Mon, Apr 29, 2013 at 01:01:44PM -0700, Jesse Gross wrote:
> On Thu, Apr 25, 2013 at 12:26 AM, Simon Horman <horms@verge.net.au> wrote:
> > On Tue, Apr 23, 2013 at 04:43:58PM -0700, Jesse Gross wrote:
> >> On Mon, Apr 22, 2013 at 7:19 PM, Simon Horman <horms@verge.net.au> wrote:
> >> > In the case where a non-MPLS GSO skb becomes an MPLS GSO skb, via
> >> > Open vSwitch's push MPLS action it is desirable to provide segmentation
> >> > in software. In this case the original protocol of the skb may have allowed
> >> > its checksumming to be offloaded but this may no longer be supported now
> >> > the skb is MPLS. Actually it seems to me that this is the likely case.
> >> >
> >> > In order to allow the checksum to be updated in this case loosen
> >> > the rules for recalculating the checksum on in skb_segment().
> >> >
> >> > N.B.: I must confess that I am a little unsure of the details of
> >> > the implementation of skb_checksum(). But I have observed that this
> >> > is necessary as skb_checksum() hits the following:

--,- s/skb_checksum/skb_segment/

> >> >
> >> >                 if (!hsize && i >= nfrags) {
> >> >                         ...
> >> >                         fskb = fskb->next;
> >> >                         ...
> >> >                 }
> >> >                 ...
> >> >                 if (fskb != skb_shinfo(skb)->frag_list)
> >> >                         ...
> >> >
> >> > Signed-off-by: Simon Horman <horms@verge.net.au>
> >>
> >> I'm surprised at the need for changes here since neither vlans nor
> >> tunneling protocols needed something similar. Do you know what is
> >> special about MPLS that is triggering this?
> >
> > After some testing I believe that the answer for GRE is, much to my
> > surprise, that it doesn't work without this patch. At least not for the
> > test case I have been using.
> >
> > To test GRE I set up a machine to receive IP packets and forward them over
> > a GRE (not TEB) tunnel created using the ip command. In that case I see
> > incorrect TCP checksums and then retransmissions when GSO segmentation
> > occurs.  A tcpdump is below.
> 
> Hmm, skb_segment() should be independent of protocol but I can see how
> this is a case that wouldn't get exercised frequently and therefore
> could have bugs. It seems like the case would be receiving a packet
> that gets merged using GRO and then imposing some kind of header (but
> not a single level of vlan since that is stored out of band).

Yes, that is what I think too.

> I think probably it's necessary to do some more careful analysis to make
> sure that this change is safe in all cases (or at least expand the commit
> message since it's not immediately obvious at the moment).

I will look over things a little more but I think that it is
correct so long as the calculation made by can_checksum_protocol(),
and the value of features passed to skb_segment() and in turn
can_checksum_protocol() is correct.

The reason is that in cases where can_checksum_protocol returns true
then this change will have no effect on the checksum calculation.
It is only in cases where it returns false that there may be an effect.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-04-26 23:03             ` Jesse Gross
@ 2013-04-30  3:21               ` Simon Horman
  2013-04-30 16:19                 ` Jesse Gross
  0 siblings, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-04-30  3:21 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> >> Any particular reason to introduce skb->encapsulation_features instead of
> >> using the existing skb->encapsulation? Also I don't see it used in your
> >> second patch either.
> >
> > My reasoning is that skb->encapsulation seems to alter the behaviour of
> > many different locations and I'm not sure that any of them, other than the
> > one in dev_hard_start_xmit() make sense for MPLS.
> 
> The problem is the meaning of skb->encapsulation isn't really defined
> clearly and I'm certain that the current implementation is not going
> to work in the future. Depending on your perspective, vlans, MPLS,
> tunnels, etc. can all be considered forms of encapsulation but clearly
> there are many NICs that have different capabilities across those. I
> believe the intention here was really to describe L3 tunnels as
> encapsulation, in which case MPLS really shouldn't be using this
> mechanism at all.
> 
> Now there is some overlap, especially today since most currently
> shipping silicon wasn't designed to support tunnels and so is using
> some form of offset based offloads. In that case, all forms of
> encapsulation are pretty similar. However, in the future that won't be
> the case as support for specific protocols is implemented for higher
> performance and richer support. When that happens, not only will MPLS
> and tunnels have different capabilities but various forms tunnels
> might as well.

Wouldn't be possible to describe those differences using,
dev->hw_enc_features? I assumed that was its purpose.

The intention of my patch was allow MPLS to utilise
dev->hw_enc_features without any of the other implications of
skb->encapsulation.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-04-30  3:21               ` Simon Horman
@ 2013-04-30 16:19                 ` Jesse Gross
  2013-05-01  7:50                   ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-04-30 16:19 UTC (permalink / raw)
  To: Simon Horman
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
> On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
>> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
>> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
>> >> Any particular reason to introduce skb->encapsulation_features instead of
>> >> using the existing skb->encapsulation? Also I don't see it used in your
>> >> second patch either.
>> >
>> > My reasoning is that skb->encapsulation seems to alter the behaviour of
>> > many different locations and I'm not sure that any of them, other than the
>> > one in dev_hard_start_xmit() make sense for MPLS.
>>
>> The problem is the meaning of skb->encapsulation isn't really defined
>> clearly and I'm certain that the current implementation is not going
>> to work in the future. Depending on your perspective, vlans, MPLS,
>> tunnels, etc. can all be considered forms of encapsulation but clearly
>> there are many NICs that have different capabilities across those. I
>> believe the intention here was really to describe L3 tunnels as
>> encapsulation, in which case MPLS really shouldn't be using this
>> mechanism at all.
>>
>> Now there is some overlap, especially today since most currently
>> shipping silicon wasn't designed to support tunnels and so is using
>> some form of offset based offloads. In that case, all forms of
>> encapsulation are pretty similar. However, in the future that won't be
>> the case as support for specific protocols is implemented for higher
>> performance and richer support. When that happens, not only will MPLS
>> and tunnels have different capabilities but various forms tunnels
>> might as well.
>
> Wouldn't be possible to describe those differences using,
> dev->hw_enc_features? I assumed that was its purpose.

If there truly are differences between the offload capabilities of
MPLS and L3 tunnels then no, it's not possible, because it's a single
field. It's certainly not a valid assumption that a NIC that can do
TSO over GRE can also do it over MPLS.

However, it's unlikely that there are truly significant differences
between various encapsulation formats on a per-feature basis. I think
what we need to do is separate out the ability to understand the
headers from the capabilities so you have two fields: header (none,
VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
etc.) rather than the product of each. Otherwise, we end up with a ton
of different combinations.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-04-30 16:19                 ` Jesse Gross
@ 2013-05-01  7:50                   ` Simon Horman
  2013-05-01 18:16                     ` Jesse Gross
  0 siblings, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-05-01  7:50 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> >> >> Any particular reason to introduce skb->encapsulation_features instead of
> >> >> using the existing skb->encapsulation? Also I don't see it used in your
> >> >> second patch either.
> >> >
> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
> >> > many different locations and I'm not sure that any of them, other than the
> >> > one in dev_hard_start_xmit() make sense for MPLS.
> >>
> >> The problem is the meaning of skb->encapsulation isn't really defined
> >> clearly and I'm certain that the current implementation is not going
> >> to work in the future. Depending on your perspective, vlans, MPLS,
> >> tunnels, etc. can all be considered forms of encapsulation but clearly
> >> there are many NICs that have different capabilities across those. I
> >> believe the intention here was really to describe L3 tunnels as
> >> encapsulation, in which case MPLS really shouldn't be using this
> >> mechanism at all.
> >>
> >> Now there is some overlap, especially today since most currently
> >> shipping silicon wasn't designed to support tunnels and so is using
> >> some form of offset based offloads. In that case, all forms of
> >> encapsulation are pretty similar. However, in the future that won't be
> >> the case as support for specific protocols is implemented for higher
> >> performance and richer support. When that happens, not only will MPLS
> >> and tunnels have different capabilities but various forms tunnels
> >> might as well.
> >
> > Wouldn't be possible to describe those differences using,
> > dev->hw_enc_features? I assumed that was its purpose.
> 
> If there truly are differences between the offload capabilities of
> MPLS and L3 tunnels then no, it's not possible, because it's a single
> field. It's certainly not a valid assumption that a NIC that can do
> TSO over GRE can also do it over MPLS.
> 
> However, it's unlikely that there are truly significant differences
> between various encapsulation formats on a per-feature basis. I think
> what we need to do is separate out the ability to understand the
> headers from the capabilities so you have two fields: header (none,
> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
> etc.) rather than the product of each. Otherwise, we end up with a ton
> of different combinations.

I'm not quite sure that I follow.

Is your idea to replace skb->encapsulation (a single bit) with
a field that corresponds to the outer-most (encapsulation) header in use
and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?

If so, I believe that would solve the problem I was trying to address
with this patch.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-05-01  7:50                   ` Simon Horman
@ 2013-05-01 18:16                     ` Jesse Gross
  2013-05-01 22:57                       ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-05-01 18:16 UTC (permalink / raw)
  To: Simon Horman
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms@verge.net.au> wrote:
> On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
>> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
>> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
>> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
>> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
>> >> >> Any particular reason to introduce skb->encapsulation_features instead of
>> >> >> using the existing skb->encapsulation? Also I don't see it used in your
>> >> >> second patch either.
>> >> >
>> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
>> >> > many different locations and I'm not sure that any of them, other than the
>> >> > one in dev_hard_start_xmit() make sense for MPLS.
>> >>
>> >> The problem is the meaning of skb->encapsulation isn't really defined
>> >> clearly and I'm certain that the current implementation is not going
>> >> to work in the future. Depending on your perspective, vlans, MPLS,
>> >> tunnels, etc. can all be considered forms of encapsulation but clearly
>> >> there are many NICs that have different capabilities across those. I
>> >> believe the intention here was really to describe L3 tunnels as
>> >> encapsulation, in which case MPLS really shouldn't be using this
>> >> mechanism at all.
>> >>
>> >> Now there is some overlap, especially today since most currently
>> >> shipping silicon wasn't designed to support tunnels and so is using
>> >> some form of offset based offloads. In that case, all forms of
>> >> encapsulation are pretty similar. However, in the future that won't be
>> >> the case as support for specific protocols is implemented for higher
>> >> performance and richer support. When that happens, not only will MPLS
>> >> and tunnels have different capabilities but various forms tunnels
>> >> might as well.
>> >
>> > Wouldn't be possible to describe those differences using,
>> > dev->hw_enc_features? I assumed that was its purpose.
>>
>> If there truly are differences between the offload capabilities of
>> MPLS and L3 tunnels then no, it's not possible, because it's a single
>> field. It's certainly not a valid assumption that a NIC that can do
>> TSO over GRE can also do it over MPLS.
>>
>> However, it's unlikely that there are truly significant differences
>> between various encapsulation formats on a per-feature basis. I think
>> what we need to do is separate out the ability to understand the
>> headers from the capabilities so you have two fields: header (none,
>> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
>> etc.) rather than the product of each. Otherwise, we end up with a ton
>> of different combinations.
>
> I'm not quite sure that I follow.
>
> Is your idea to replace skb->encapsulation (a single bit) with
> a field that corresponds to the outer-most (encapsulation) header in use
> and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?

No, I'm talking about netdev features. You can already tell the
encapsulation type of a packet by looking at the EtherType.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-05-01 18:16                     ` Jesse Gross
@ 2013-05-01 22:57                       ` Simon Horman
       [not found]                         ` <20130501225706.GC6517-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Simon Horman @ 2013-05-01 22:57 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Wed, May 01, 2013 at 11:16:40AM -0700, Jesse Gross wrote:
> On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms@verge.net.au> wrote:
> > On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
> >> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
> >> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
> >> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
> >> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> >> >> >> Any particular reason to introduce skb->encapsulation_features instead of
> >> >> >> using the existing skb->encapsulation? Also I don't see it used in your
> >> >> >> second patch either.
> >> >> >
> >> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
> >> >> > many different locations and I'm not sure that any of them, other than the
> >> >> > one in dev_hard_start_xmit() make sense for MPLS.
> >> >>
> >> >> The problem is the meaning of skb->encapsulation isn't really defined
> >> >> clearly and I'm certain that the current implementation is not going
> >> >> to work in the future. Depending on your perspective, vlans, MPLS,
> >> >> tunnels, etc. can all be considered forms of encapsulation but clearly
> >> >> there are many NICs that have different capabilities across those. I
> >> >> believe the intention here was really to describe L3 tunnels as
> >> >> encapsulation, in which case MPLS really shouldn't be using this
> >> >> mechanism at all.
> >> >>
> >> >> Now there is some overlap, especially today since most currently
> >> >> shipping silicon wasn't designed to support tunnels and so is using
> >> >> some form of offset based offloads. In that case, all forms of
> >> >> encapsulation are pretty similar. However, in the future that won't be
> >> >> the case as support for specific protocols is implemented for higher
> >> >> performance and richer support. When that happens, not only will MPLS
> >> >> and tunnels have different capabilities but various forms tunnels
> >> >> might as well.
> >> >
> >> > Wouldn't be possible to describe those differences using,
> >> > dev->hw_enc_features? I assumed that was its purpose.
> >>
> >> If there truly are differences between the offload capabilities of
> >> MPLS and L3 tunnels then no, it's not possible, because it's a single
> >> field. It's certainly not a valid assumption that a NIC that can do
> >> TSO over GRE can also do it over MPLS.
> >>
> >> However, it's unlikely that there are truly significant differences
> >> between various encapsulation formats on a per-feature basis. I think
> >> what we need to do is separate out the ability to understand the
> >> headers from the capabilities so you have two fields: header (none,
> >> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
> >> etc.) rather than the product of each. Otherwise, we end up with a ton
> >> of different combinations.
> >
> > I'm not quite sure that I follow.
> >
> > Is your idea to replace skb->encapsulation (a single bit) with
> > a field that corresponds to the outer-most (encapsulation) header in use
> > and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?
> 
> No, I'm talking about netdev features. You can already tell the
> encapsulation type of a packet by looking at the EtherType.

Now I am completely confused about what are the two fields that you
refer to in your previous email.


In regards to looking ath the ethernet type:

One of the tricky parts of MPLS is that the packet itself does not contain
the ethernet type or any other way of knowing the type of the inner-packet.
Information that is needed for GSO.

My proposal to get around this is to leave skb->protocol as the
original, in the case we are interested in non-MPLS, ethernet type.

The MPLS ethertype is in in the packet itself, however checking
that seems expensive.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
       [not found]                         ` <20130501225706.GC6517-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
@ 2013-05-02  4:53                           ` Jesse Gross
  2013-05-02  5:31                             ` Simon Horman
  0 siblings, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-05-02  4:53 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski, netdev, Peter P Waskiewicz Jr,
	Joseph Gasparakis

On Wed, May 1, 2013 at 3:57 PM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> On Wed, May 01, 2013 at 11:16:40AM -0700, Jesse Gross wrote:
>> On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
>> > On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
>> >> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
>> >> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
>> >> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
>> >> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
>> >> >> >> Any particular reason to introduce skb->encapsulation_features instead of
>> >> >> >> using the existing skb->encapsulation? Also I don't see it used in your
>> >> >> >> second patch either.
>> >> >> >
>> >> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
>> >> >> > many different locations and I'm not sure that any of them, other than the
>> >> >> > one in dev_hard_start_xmit() make sense for MPLS.
>> >> >>
>> >> >> The problem is the meaning of skb->encapsulation isn't really defined
>> >> >> clearly and I'm certain that the current implementation is not going
>> >> >> to work in the future. Depending on your perspective, vlans, MPLS,
>> >> >> tunnels, etc. can all be considered forms of encapsulation but clearly
>> >> >> there are many NICs that have different capabilities across those. I
>> >> >> believe the intention here was really to describe L3 tunnels as
>> >> >> encapsulation, in which case MPLS really shouldn't be using this
>> >> >> mechanism at all.
>> >> >>
>> >> >> Now there is some overlap, especially today since most currently
>> >> >> shipping silicon wasn't designed to support tunnels and so is using
>> >> >> some form of offset based offloads. In that case, all forms of
>> >> >> encapsulation are pretty similar. However, in the future that won't be
>> >> >> the case as support for specific protocols is implemented for higher
>> >> >> performance and richer support. When that happens, not only will MPLS
>> >> >> and tunnels have different capabilities but various forms tunnels
>> >> >> might as well.
>> >> >
>> >> > Wouldn't be possible to describe those differences using,
>> >> > dev->hw_enc_features? I assumed that was its purpose.
>> >>
>> >> If there truly are differences between the offload capabilities of
>> >> MPLS and L3 tunnels then no, it's not possible, because it's a single
>> >> field. It's certainly not a valid assumption that a NIC that can do
>> >> TSO over GRE can also do it over MPLS.
>> >>
>> >> However, it's unlikely that there are truly significant differences
>> >> between various encapsulation formats on a per-feature basis. I think
>> >> what we need to do is separate out the ability to understand the
>> >> headers from the capabilities so you have two fields: header (none,
>> >> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
>> >> etc.) rather than the product of each. Otherwise, we end up with a ton
>> >> of different combinations.
>> >
>> > I'm not quite sure that I follow.
>> >
>> > Is your idea to replace skb->encapsulation (a single bit) with
>> > a field that corresponds to the outer-most (encapsulation) header in use
>> > and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?
>>
>> No, I'm talking about netdev features. You can already tell the
>> encapsulation type of a packet by looking at the EtherType.
>
> Now I am completely confused about what are the two fields that you
> refer to in your previous email.

I have always been referring to the netdev features for various
protocol types. This is because considering MPLS as a form of
encapsulation for the purpose of offloads buckets too many protocols
into the same set and NICs will have varying features for those.
Trying to avoid this by having a bit for offloadable encapsulations is
just going to be very confusing and not very future proof.

> In regards to looking ath the ethernet type:
>
> One of the tricky parts of MPLS is that the packet itself does not contain
> the ethernet type or any other way of knowing the type of the inner-packet.
> Information that is needed for GSO.

I'm aware of that. However, you were referring to the type of
encapsulation. It is easy to determine that a packet is MPLS.

> My proposal to get around this is to leave skb->protocol as the
> original, in the case we are interested in non-MPLS, ethernet type.

At the very least, this is not consistent with how it is currently
handled (for example, with VLANs) and seems difficult to do properly.
However, I have not seen any further analysis since the last time that
we discussed this.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-05-02  4:53                           ` Jesse Gross
@ 2013-05-02  5:31                             ` Simon Horman
  2013-05-02 14:39                               ` Simon Horman
  2013-05-02 16:53                               ` Jesse Gross
  0 siblings, 2 replies; 21+ messages in thread
From: Simon Horman @ 2013-05-02  5:31 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Wed, May 01, 2013 at 09:53:42PM -0700, Jesse Gross wrote:
> On Wed, May 1, 2013 at 3:57 PM, Simon Horman <horms@verge.net.au> wrote:
> > On Wed, May 01, 2013 at 11:16:40AM -0700, Jesse Gross wrote:
> >> On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms@verge.net.au> wrote:
> >> > On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
> >> >> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
> >> >> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
> >> >> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
> >> >> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> >> >> >> >> Any particular reason to introduce skb->encapsulation_features instead of
> >> >> >> >> using the existing skb->encapsulation? Also I don't see it used in your
> >> >> >> >> second patch either.
> >> >> >> >
> >> >> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
> >> >> >> > many different locations and I'm not sure that any of them, other than the
> >> >> >> > one in dev_hard_start_xmit() make sense for MPLS.
> >> >> >>
> >> >> >> The problem is the meaning of skb->encapsulation isn't really defined
> >> >> >> clearly and I'm certain that the current implementation is not going
> >> >> >> to work in the future. Depending on your perspective, vlans, MPLS,
> >> >> >> tunnels, etc. can all be considered forms of encapsulation but clearly
> >> >> >> there are many NICs that have different capabilities across those. I
> >> >> >> believe the intention here was really to describe L3 tunnels as
> >> >> >> encapsulation, in which case MPLS really shouldn't be using this
> >> >> >> mechanism at all.
> >> >> >>
> >> >> >> Now there is some overlap, especially today since most currently
> >> >> >> shipping silicon wasn't designed to support tunnels and so is using
> >> >> >> some form of offset based offloads. In that case, all forms of
> >> >> >> encapsulation are pretty similar. However, in the future that won't be
> >> >> >> the case as support for specific protocols is implemented for higher
> >> >> >> performance and richer support. When that happens, not only will MPLS
> >> >> >> and tunnels have different capabilities but various forms tunnels
> >> >> >> might as well.
> >> >> >
> >> >> > Wouldn't be possible to describe those differences using,
> >> >> > dev->hw_enc_features? I assumed that was its purpose.
> >> >>
> >> >> If there truly are differences between the offload capabilities of
> >> >> MPLS and L3 tunnels then no, it's not possible, because it's a single
> >> >> field. It's certainly not a valid assumption that a NIC that can do
> >> >> TSO over GRE can also do it over MPLS.
> >> >>
> >> >> However, it's unlikely that there are truly significant differences
> >> >> between various encapsulation formats on a per-feature basis. I think
> >> >> what we need to do is separate out the ability to understand the
> >> >> headers from the capabilities so you have two fields: header (none,
> >> >> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
> >> >> etc.) rather than the product of each. Otherwise, we end up with a ton
> >> >> of different combinations.
> >> >
> >> > I'm not quite sure that I follow.
> >> >
> >> > Is your idea to replace skb->encapsulation (a single bit) with
> >> > a field that corresponds to the outer-most (encapsulation) header in use
> >> > and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?
> >>
> >> No, I'm talking about netdev features. You can already tell the
> >> encapsulation type of a packet by looking at the EtherType.
> >
> > Now I am completely confused about what are the two fields that you
> > refer to in your previous email.
> 
> I have always been referring to the netdev features for various
> protocol types. This is because considering MPLS as a form of
> encapsulation for the purpose of offloads buckets too many protocols
> into the same set and NICs will have varying features for those.
> Trying to avoid this by having a bit for offloadable encapsulations is
> just going to be very confusing and not very future proof.
> 
> > In regards to looking ath the ethernet type:
> >
> > One of the tricky parts of MPLS is that the packet itself does not contain
> > the ethernet type or any other way of knowing the type of the inner-packet.
> > Information that is needed for GSO.
> 
> I'm aware of that. However, you were referring to the type of
> encapsulation. It is easy to determine that a packet is MPLS.
> 
> > My proposal to get around this is to leave skb->protocol as the
> > original, in the case we are interested in non-MPLS, ethernet type.
> 
> At the very least, this is not consistent with how it is currently
> handled (for example, with VLANs) and seems difficult to do properly.
> However, I have not seen any further analysis since the last time that
> we discussed this.

Unfortunately my efforts to solicit feedback from others regarding
that have not been successful.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-05-02  5:31                             ` Simon Horman
@ 2013-05-02 14:39                               ` Simon Horman
  2013-05-02 16:53                               ` Jesse Gross
  1 sibling, 0 replies; 21+ messages in thread
From: Simon Horman @ 2013-05-02 14:39 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Thu, May 02, 2013 at 02:31:44PM +0900, Simon Horman wrote:
> On Wed, May 01, 2013 at 09:53:42PM -0700, Jesse Gross wrote:
> > On Wed, May 1, 2013 at 3:57 PM, Simon Horman <horms@verge.net.au> wrote:
> > > On Wed, May 01, 2013 at 11:16:40AM -0700, Jesse Gross wrote:
> > >> On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms@verge.net.au> wrote:
> > >> > On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
> > >> >> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
> > >> >> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
> > >> >> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
> > >> >> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> > >> >> >> >> Any particular reason to introduce skb->encapsulation_features instead of
> > >> >> >> >> using the existing skb->encapsulation? Also I don't see it used in your
> > >> >> >> >> second patch either.
> > >> >> >> >
> > >> >> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
> > >> >> >> > many different locations and I'm not sure that any of them, other than the
> > >> >> >> > one in dev_hard_start_xmit() make sense for MPLS.
> > >> >> >>
> > >> >> >> The problem is the meaning of skb->encapsulation isn't really defined
> > >> >> >> clearly and I'm certain that the current implementation is not going
> > >> >> >> to work in the future. Depending on your perspective, vlans, MPLS,
> > >> >> >> tunnels, etc. can all be considered forms of encapsulation but clearly
> > >> >> >> there are many NICs that have different capabilities across those. I
> > >> >> >> believe the intention here was really to describe L3 tunnels as
> > >> >> >> encapsulation, in which case MPLS really shouldn't be using this
> > >> >> >> mechanism at all.
> > >> >> >>
> > >> >> >> Now there is some overlap, especially today since most currently
> > >> >> >> shipping silicon wasn't designed to support tunnels and so is using
> > >> >> >> some form of offset based offloads. In that case, all forms of
> > >> >> >> encapsulation are pretty similar. However, in the future that won't be
> > >> >> >> the case as support for specific protocols is implemented for higher
> > >> >> >> performance and richer support. When that happens, not only will MPLS
> > >> >> >> and tunnels have different capabilities but various forms tunnels
> > >> >> >> might as well.
> > >> >> >
> > >> >> > Wouldn't be possible to describe those differences using,
> > >> >> > dev->hw_enc_features? I assumed that was its purpose.
> > >> >>
> > >> >> If there truly are differences between the offload capabilities of
> > >> >> MPLS and L3 tunnels then no, it's not possible, because it's a single
> > >> >> field. It's certainly not a valid assumption that a NIC that can do
> > >> >> TSO over GRE can also do it over MPLS.
> > >> >>
> > >> >> However, it's unlikely that there are truly significant differences
> > >> >> between various encapsulation formats on a per-feature basis. I think
> > >> >> what we need to do is separate out the ability to understand the
> > >> >> headers from the capabilities so you have two fields: header (none,
> > >> >> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
> > >> >> etc.) rather than the product of each. Otherwise, we end up with a ton
> > >> >> of different combinations.
> > >> >
> > >> > I'm not quite sure that I follow.
> > >> >
> > >> > Is your idea to replace skb->encapsulation (a single bit) with
> > >> > a field that corresponds to the outer-most (encapsulation) header in use
> > >> > and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?
> > >>
> > >> No, I'm talking about netdev features. You can already tell the
> > >> encapsulation type of a packet by looking at the EtherType.
> > >
> > > Now I am completely confused about what are the two fields that you
> > > refer to in your previous email.
> > 
> > I have always been referring to the netdev features for various
> > protocol types. This is because considering MPLS as a form of
> > encapsulation for the purpose of offloads buckets too many protocols
> > into the same set and NICs will have varying features for those.
> > Trying to avoid this by having a bit for offloadable encapsulations is
> > just going to be very confusing and not very future proof.

I understand your point regarding a magic bit for encapsulation not
being particularly future-proof. I agree that it is reasonable to expect
that a NIC may support an offload for one of the growing list of
supported encapsulation protocols and not another.

However, as tedious as this may be, I am rather confused about what your
proposal above is.

> > > In regards to looking ath the ethernet type:
> > >
> > > One of the tricky parts of MPLS is that the packet itself does not contain
> > > the ethernet type or any other way of knowing the type of the inner-packet.
> > > Information that is needed for GSO.
> > 
> > I'm aware of that. However, you were referring to the type of
> > encapsulation. It is easy to determine that a packet is MPLS.
> > 
> > > My proposal to get around this is to leave skb->protocol as the
> > > original, in the case we are interested in non-MPLS, ethernet type.
> > 
> > At the very least, this is not consistent with how it is currently
> > handled (for example, with VLANs) and seems difficult to do properly.
> > However, I have not seen any further analysis since the last time that
> > we discussed this.
> 
> Unfortunately my efforts to solicit feedback from others regarding
> that have not been successful.

An idea I have for the treatment of skb->protocol and friends is to add
skb->inner_protocol.  It could be set to the inner protocol, if known, for
protocols such as MPLS which don't include that information in the packet.
It would allow skb->protocol to be set to the MPLS ethernet type
corresponding to the ethernet type of the packet.

>From here I see two options:

1.  Offloads could be registered for MPLS unicast and multicast.  And the
    registered MPLS GSO segmentation call-back could set and restore
    skb->protocol before and after calling skb_mac_gso_segment().

    The MPLS GSO segmentation callback could also calculate features
    to pass to skb_mac_gso_segment() by some means.

2. Teach skb_network_protocol() about skb->inner_protocol.
   And most likely teach netif_skb_features() about
   skb->protocol == ETH_P_MPLS*.

   I'm not entirely sure how to avoid overhead for non-MPLS packets using
   this approach.

I believe the above could be achieved without using skb->encapsulation
in newly added code.

Your features proposal above not withstanding, in the current scheme of
things, it would seem that it would be appropriate to add SKB_GSO_GRE -
currently not supported by any hardware - and set skb_shinfo(skb)->gso_type
= SKB_GSO_GRE in the datapath. I think this should be sufficient to trigger
a call to skb_mac_gso_segment() in dev_hard_start_xmit().

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
  2013-05-02  5:31                             ` Simon Horman
  2013-05-02 14:39                               ` Simon Horman
@ 2013-05-02 16:53                               ` Jesse Gross
       [not found]                                 ` <CAEP_g=_0cOnE1Bhm5PGPhPTBvBEjoN_+oS-zUeNBM4ZyoJe-yg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 21+ messages in thread
From: Jesse Gross @ 2013-05-02 16:53 UTC (permalink / raw)
  To: Simon Horman
  Cc: Joseph Gasparakis, dev, netdev, Jarno Rajahalme,
	Peter P Waskiewicz Jr, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski

On Wed, May 1, 2013 at 10:31 PM, Simon Horman <horms@verge.net.au> wrote:
> On Wed, May 01, 2013 at 09:53:42PM -0700, Jesse Gross wrote:
>> On Wed, May 1, 2013 at 3:57 PM, Simon Horman <horms@verge.net.au> wrote:
>> > On Wed, May 01, 2013 at 11:16:40AM -0700, Jesse Gross wrote:
>> >> On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms@verge.net.au> wrote:
>> >> > On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
>> >> >> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms@verge.net.au> wrote:
>> >> >> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
>> >> >> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms@verge.net.au> wrote:
>> >> >> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
>> >> >> >> >> Any particular reason to introduce skb->encapsulation_features instead of
>> >> >> >> >> using the existing skb->encapsulation? Also I don't see it used in your
>> >> >> >> >> second patch either.
>> >> >> >> >
>> >> >> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
>> >> >> >> > many different locations and I'm not sure that any of them, other than the
>> >> >> >> > one in dev_hard_start_xmit() make sense for MPLS.
>> >> >> >>
>> >> >> >> The problem is the meaning of skb->encapsulation isn't really defined
>> >> >> >> clearly and I'm certain that the current implementation is not going
>> >> >> >> to work in the future. Depending on your perspective, vlans, MPLS,
>> >> >> >> tunnels, etc. can all be considered forms of encapsulation but clearly
>> >> >> >> there are many NICs that have different capabilities across those. I
>> >> >> >> believe the intention here was really to describe L3 tunnels as
>> >> >> >> encapsulation, in which case MPLS really shouldn't be using this
>> >> >> >> mechanism at all.
>> >> >> >>
>> >> >> >> Now there is some overlap, especially today since most currently
>> >> >> >> shipping silicon wasn't designed to support tunnels and so is using
>> >> >> >> some form of offset based offloads. In that case, all forms of
>> >> >> >> encapsulation are pretty similar. However, in the future that won't be
>> >> >> >> the case as support for specific protocols is implemented for higher
>> >> >> >> performance and richer support. When that happens, not only will MPLS
>> >> >> >> and tunnels have different capabilities but various forms tunnels
>> >> >> >> might as well.
>> >> >> >
>> >> >> > Wouldn't be possible to describe those differences using,
>> >> >> > dev->hw_enc_features? I assumed that was its purpose.
>> >> >>
>> >> >> If there truly are differences between the offload capabilities of
>> >> >> MPLS and L3 tunnels then no, it's not possible, because it's a single
>> >> >> field. It's certainly not a valid assumption that a NIC that can do
>> >> >> TSO over GRE can also do it over MPLS.
>> >> >>
>> >> >> However, it's unlikely that there are truly significant differences
>> >> >> between various encapsulation formats on a per-feature basis. I think
>> >> >> what we need to do is separate out the ability to understand the
>> >> >> headers from the capabilities so you have two fields: header (none,
>> >> >> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
>> >> >> etc.) rather than the product of each. Otherwise, we end up with a ton
>> >> >> of different combinations.
>> >> >
>> >> > I'm not quite sure that I follow.
>> >> >
>> >> > Is your idea to replace skb->encapsulation (a single bit) with
>> >> > a field that corresponds to the outer-most (encapsulation) header in use
>> >> > and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?
>> >>
>> >> No, I'm talking about netdev features. You can already tell the
>> >> encapsulation type of a packet by looking at the EtherType.
>> >
>> > Now I am completely confused about what are the two fields that you
>> > refer to in your previous email.
>>
>> I have always been referring to the netdev features for various
>> protocol types. This is because considering MPLS as a form of
>> encapsulation for the purpose of offloads buckets too many protocols
>> into the same set and NICs will have varying features for those.
>> Trying to avoid this by having a bit for offloadable encapsulations is
>> just going to be very confusing and not very future proof.
>>
>> > In regards to looking ath the ethernet type:
>> >
>> > One of the tricky parts of MPLS is that the packet itself does not contain
>> > the ethernet type or any other way of knowing the type of the inner-packet.
>> > Information that is needed for GSO.
>>
>> I'm aware of that. However, you were referring to the type of
>> encapsulation. It is easy to determine that a packet is MPLS.
>>
>> > My proposal to get around this is to leave skb->protocol as the
>> > original, in the case we are interested in non-MPLS, ethernet type.
>>
>> At the very least, this is not consistent with how it is currently
>> handled (for example, with VLANs) and seems difficult to do properly.
>> However, I have not seen any further analysis since the last time that
>> we discussed this.
>
> Unfortunately my efforts to solicit feedback from others regarding
> that have not been successful.

Give me a break, Simon. These are your patches and therefore it is
your responsibility to do the analysis. This has come up over and over
again and I'm not happy about it.

Rather than trying to respond to me as quickly as possible, you need
to slow down, take a step back, and think about the correct direction.
Given that some of these patches have revision numbers in the 20s you
have nothing to complain about as far as receiving help.

In order to assist you in this, I'm going to drop all of your pending
patches from my queue and not respond to anything further from you
until the merge window is open again.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features
       [not found]                                 ` <CAEP_g=_0cOnE1Bhm5PGPhPTBvBEjoN_+oS-zUeNBM4ZyoJe-yg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-05-02 23:55                                   ` Simon Horman
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Horman @ 2013-05-02 23:55 UTC (permalink / raw)
  To: Jesse Gross
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, Alexander Duyck, Eric Dumazet,
	Maciej Żenczykowski, netdev, Peter P Waskiewicz Jr,
	Joseph Gasparakis

On Thu, May 02, 2013 at 09:53:25AM -0700, Jesse Gross wrote:
> On Wed, May 1, 2013 at 10:31 PM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> > On Wed, May 01, 2013 at 09:53:42PM -0700, Jesse Gross wrote:
> >> On Wed, May 1, 2013 at 3:57 PM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> >> > On Wed, May 01, 2013 at 11:16:40AM -0700, Jesse Gross wrote:
> >> >> On Wed, May 1, 2013 at 12:50 AM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> >> >> > On Tue, Apr 30, 2013 at 09:19:51AM -0700, Jesse Gross wrote:
> >> >> >> On Mon, Apr 29, 2013 at 8:21 PM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> >> >> >> > On Fri, Apr 26, 2013 at 04:03:21PM -0700, Jesse Gross wrote:
> >> >> >> >> On Thu, Apr 25, 2013 at 12:36 AM, Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org> wrote:
> >> >> >> >> > On Tue, Apr 23, 2013 at 02:00:19PM -0700, Joseph Gasparakis wrote:
> >> >> >> >> >> Any particular reason to introduce skb->encapsulation_features instead of
> >> >> >> >> >> using the existing skb->encapsulation? Also I don't see it used in your
> >> >> >> >> >> second patch either.
> >> >> >> >> >
> >> >> >> >> > My reasoning is that skb->encapsulation seems to alter the behaviour of
> >> >> >> >> > many different locations and I'm not sure that any of them, other than the
> >> >> >> >> > one in dev_hard_start_xmit() make sense for MPLS.
> >> >> >> >>
> >> >> >> >> The problem is the meaning of skb->encapsulation isn't really defined
> >> >> >> >> clearly and I'm certain that the current implementation is not going
> >> >> >> >> to work in the future. Depending on your perspective, vlans, MPLS,
> >> >> >> >> tunnels, etc. can all be considered forms of encapsulation but clearly
> >> >> >> >> there are many NICs that have different capabilities across those. I
> >> >> >> >> believe the intention here was really to describe L3 tunnels as
> >> >> >> >> encapsulation, in which case MPLS really shouldn't be using this
> >> >> >> >> mechanism at all.
> >> >> >> >>
> >> >> >> >> Now there is some overlap, especially today since most currently
> >> >> >> >> shipping silicon wasn't designed to support tunnels and so is using
> >> >> >> >> some form of offset based offloads. In that case, all forms of
> >> >> >> >> encapsulation are pretty similar. However, in the future that won't be
> >> >> >> >> the case as support for specific protocols is implemented for higher
> >> >> >> >> performance and richer support. When that happens, not only will MPLS
> >> >> >> >> and tunnels have different capabilities but various forms tunnels
> >> >> >> >> might as well.
> >> >> >> >
> >> >> >> > Wouldn't be possible to describe those differences using,
> >> >> >> > dev->hw_enc_features? I assumed that was its purpose.
> >> >> >>
> >> >> >> If there truly are differences between the offload capabilities of
> >> >> >> MPLS and L3 tunnels then no, it's not possible, because it's a single
> >> >> >> field. It's certainly not a valid assumption that a NIC that can do
> >> >> >> TSO over GRE can also do it over MPLS.
> >> >> >>
> >> >> >> However, it's unlikely that there are truly significant differences
> >> >> >> between various encapsulation formats on a per-feature basis. I think
> >> >> >> what we need to do is separate out the ability to understand the
> >> >> >> headers from the capabilities so you have two fields: header (none,
> >> >> >> VLAN, QinQ, MPLS, VXLAN, GRE, etc.) and feature (checksum, SG, TSO,
> >> >> >> etc.) rather than the product of each. Otherwise, we end up with a ton
> >> >> >> of different combinations.
> >> >> >
> >> >> > I'm not quite sure that I follow.
> >> >> >
> >> >> > Is your idea to replace skb->encapsulation (a single bit) with
> >> >> > a field that corresponds to the outer-most (encapsulation) header in use
> >> >> > and has bits for none, VLAN, QinQ, MPLS, VXLAN, GRE, etc...?
> >> >>
> >> >> No, I'm talking about netdev features. You can already tell the
> >> >> encapsulation type of a packet by looking at the EtherType.
> >> >
> >> > Now I am completely confused about what are the two fields that you
> >> > refer to in your previous email.
> >>
> >> I have always been referring to the netdev features for various
> >> protocol types. This is because considering MPLS as a form of
> >> encapsulation for the purpose of offloads buckets too many protocols
> >> into the same set and NICs will have varying features for those.
> >> Trying to avoid this by having a bit for offloadable encapsulations is
> >> just going to be very confusing and not very future proof.
> >>
> >> > In regards to looking ath the ethernet type:
> >> >
> >> > One of the tricky parts of MPLS is that the packet itself does not contain
> >> > the ethernet type or any other way of knowing the type of the inner-packet.
> >> > Information that is needed for GSO.
> >>
> >> I'm aware of that. However, you were referring to the type of
> >> encapsulation. It is easy to determine that a packet is MPLS.
> >>
> >> > My proposal to get around this is to leave skb->protocol as the
> >> > original, in the case we are interested in non-MPLS, ethernet type.
> >>
> >> At the very least, this is not consistent with how it is currently
> >> handled (for example, with VLANs) and seems difficult to do properly.
> >> However, I have not seen any further analysis since the last time that
> >> we discussed this.
> >
> > Unfortunately my efforts to solicit feedback from others regarding
> > that have not been successful.
> 
> Give me a break, Simon. These are your patches and therefore it is
> your responsibility to do the analysis. This has come up over and over
> again and I'm not happy about it.
> 
> Rather than trying to respond to me as quickly as possible, you need
> to slow down, take a step back, and think about the correct direction.
> Given that some of these patches have revision numbers in the 20s you
> have nothing to complain about as far as receiving help.
> 
> In order to assist you in this, I'm going to drop all of your pending
> patches from my queue and not respond to anything further from you
> until the merge window is open again.

I apologise, the comment I made above was unnecessary.

I accept your criticism in regards to responding to quickly.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2013-05-02 23:55 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-23  2:19 [PATCH net-next 0/2] Small Modifications to GSO to allow segmentation of MPLS (repost) Simon Horman
     [not found] ` <1366683556-4956-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2013-04-23  2:19   ` [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features Simon Horman
     [not found]     ` <1366683556-4956-2-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2013-04-23 21:00       ` Joseph Gasparakis
2013-04-25  7:36         ` Simon Horman
2013-04-25  8:03           ` Simon Horman
     [not found]           ` <20130425073644.GC7936-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2013-04-26 23:03             ` Jesse Gross
2013-04-30  3:21               ` Simon Horman
2013-04-30 16:19                 ` Jesse Gross
2013-05-01  7:50                   ` Simon Horman
2013-05-01 18:16                     ` Jesse Gross
2013-05-01 22:57                       ` Simon Horman
     [not found]                         ` <20130501225706.GC6517-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
2013-05-02  4:53                           ` Jesse Gross
2013-05-02  5:31                             ` Simon Horman
2013-05-02 14:39                               ` Simon Horman
2013-05-02 16:53                               ` Jesse Gross
     [not found]                                 ` <CAEP_g=_0cOnE1Bhm5PGPhPTBvBEjoN_+oS-zUeNBM4ZyoJe-yg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-05-02 23:55                                   ` Simon Horman
2013-04-23  2:19   ` [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment() Simon Horman
2013-04-23 23:43     ` Jesse Gross
2013-04-25  7:26       ` Simon Horman
2013-04-29 20:01         ` Jesse Gross
2013-04-30  3:17           ` Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).