All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] ifb: support more features
@ 2016-05-06 23:41 Eric Dumazet
  2016-05-07  0:21 ` Eric Dumazet
  2016-05-07  1:19 ` [PATCH v2 " Eric Dumazet
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2016-05-06 23:41 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
GRO packets are not properly processed.

Segmentation should not be forced, as ifb is already adding
quite a performance hit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ifb.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index cc56fac3c3f8..0cb5d8cbe679 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -197,7 +197,7 @@ static const struct net_device_ops ifb_netdev_ops = {
 #define IFB_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST	| \
 		      NETIF_F_TSO_ECN | NETIF_F_TSO | NETIF_F_TSO6	| \
 		      NETIF_F_HIGHDMA | NETIF_F_HW_VLAN_CTAG_TX		| \
-		      NETIF_F_HW_VLAN_STAG_TX)
+		      NETIF_F_GSO_ENCAP_ALL | NETIF_F_HW_VLAN_STAG_TX)
 
 static void ifb_dev_free(struct net_device *dev)
 {
@@ -224,6 +224,7 @@ static void ifb_setup(struct net_device *dev)
 	dev->tx_queue_len = TX_Q_LIMIT;
 
 	dev->features |= IFB_FEATURES;
+	dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
 	dev->vlan_features |= IFB_FEATURES & ~(NETIF_F_HW_VLAN_CTAG_TX |
 					       NETIF_F_HW_VLAN_STAG_TX);
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] ifb: support more features
  2016-05-06 23:41 [PATCH net-next] ifb: support more features Eric Dumazet
@ 2016-05-07  0:21 ` Eric Dumazet
  2016-05-07  1:19 ` [PATCH v2 " Eric Dumazet
  1 sibling, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2016-05-07  0:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

On Fri, 2016-05-06 at 16:41 -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
> GRO packets are not properly processed.
> 
> Segmentation should not be forced, as ifb is already adding
> quite a performance hit.

Please ignore, wrong version.

Will send a V2

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 net-next] ifb: support more features
  2016-05-06 23:41 [PATCH net-next] ifb: support more features Eric Dumazet
  2016-05-07  0:21 ` Eric Dumazet
@ 2016-05-07  1:19 ` Eric Dumazet
  2016-05-08  5:41   ` Eric Dumazet
  2016-05-09  4:01   ` David Miller
  1 sibling, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2016-05-07  1:19 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
GRO packets are not properly processed.

Segmentation should not be forced, since ifb is already adding
quite a performance hit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ifb.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index cc56fac3c3f8..66c0eeafcb5d 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -196,6 +196,7 @@ static const struct net_device_ops ifb_netdev_ops = {
 
 #define IFB_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST	| \
 		      NETIF_F_TSO_ECN | NETIF_F_TSO | NETIF_F_TSO6	| \
+		      NETIF_F_GSO_ENCAP_ALL 				| \
 		      NETIF_F_HIGHDMA | NETIF_F_HW_VLAN_CTAG_TX		| \
 		      NETIF_F_HW_VLAN_STAG_TX)
 
@@ -224,6 +225,8 @@ static void ifb_setup(struct net_device *dev)
 	dev->tx_queue_len = TX_Q_LIMIT;
 
 	dev->features |= IFB_FEATURES;
+	dev->hw_features |= dev->features;
+	dev->hw_enc_features |= dev->features;
 	dev->vlan_features |= IFB_FEATURES & ~(NETIF_F_HW_VLAN_CTAG_TX |
 					       NETIF_F_HW_VLAN_STAG_TX);
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-07  1:19 ` [PATCH v2 " Eric Dumazet
@ 2016-05-08  5:41   ` Eric Dumazet
  2016-05-08 16:08     ` Eric Dumazet
  2016-05-09  4:01   ` David Miller
  1 sibling, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2016-05-08  5:41 UTC (permalink / raw)
  To: David Miller, Alexander Duyck; +Cc: netdev

On Fri, 2016-05-06 at 18:19 -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
> GRO packets are not properly processed.
> 
> Segmentation should not be forced, since ifb is already adding
> quite a performance hit.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  drivers/net/ifb.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
> index cc56fac3c3f8..66c0eeafcb5d 100644
> --- a/drivers/net/ifb.c
> +++ b/drivers/net/ifb.c
> @@ -196,6 +196,7 @@ static const struct net_device_ops ifb_netdev_ops = {
>  
>  #define IFB_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST	| \
>  		      NETIF_F_TSO_ECN | NETIF_F_TSO | NETIF_F_TSO6	| \
> +		      NETIF_F_GSO_ENCAP_ALL 				| \
>  		      NETIF_F_HIGHDMA | NETIF_F_HW_VLAN_CTAG_TX		| \
>  		      NETIF_F_HW_VLAN_STAG_TX)
>  
> @@ -224,6 +225,8 @@ static void ifb_setup(struct net_device *dev)
>  	dev->tx_queue_len = TX_Q_LIMIT;
>  
>  	dev->features |= IFB_FEATURES;
> +	dev->hw_features |= dev->features;
> +	dev->hw_enc_features |= dev->features;
>  	dev->vlan_features |= IFB_FEATURES & ~(NETIF_F_HW_VLAN_CTAG_TX |
>  					       NETIF_F_HW_VLAN_STAG_TX);
>  
> 


BTW, encapsulated GRO traffic going through mirred+ifb is dropped
because segments get an incorrect skb->mac_len 

(If TSO/GSO is disabled on ifb, as before the above patch)

SIT traffic for example : segments get mac_len set to 34 instead of 14

What do you think of :

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index e561f9f07d6d..bec5c32b2fe9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3176,7 +3176,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 		__copy_skb_header(nskb, head_skb);
 
 		skb_headers_offset_update(nskb, skb_headroom(nskb) - headroom);
-		skb_reset_mac_len(nskb);
+		nskb->mac_len = head_skb->mac_len;
 
 		skb_copy_from_linear_data_offset(head_skb, -tnl_hlen,
 						 nskb->data - tnl_hlen,

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-08  5:41   ` Eric Dumazet
@ 2016-05-08 16:08     ` Eric Dumazet
  2016-05-08 17:35       ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2016-05-08 16:08 UTC (permalink / raw)
  To: David Miller, Vlad Yasevich; +Cc: Alexander Duyck, netdev

On Sat, 2016-05-07 at 22:41 -0700, Eric Dumazet wrote:
> On Fri, 2016-05-06 at 18:19 -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
> > GRO packets are not properly processed.
> > 
> > Segmentation should not be forced, since ifb is already adding
> > quite a performance hit.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  drivers/net/ifb.c |    3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
> > index cc56fac3c3f8..66c0eeafcb5d 100644
> > --- a/drivers/net/ifb.c
> > +++ b/drivers/net/ifb.c
> > @@ -196,6 +196,7 @@ static const struct net_device_ops ifb_netdev_ops = {
> >  
> >  #define IFB_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST	| \
> >  		      NETIF_F_TSO_ECN | NETIF_F_TSO | NETIF_F_TSO6	| \
> > +		      NETIF_F_GSO_ENCAP_ALL 				| \
> >  		      NETIF_F_HIGHDMA | NETIF_F_HW_VLAN_CTAG_TX		| \
> >  		      NETIF_F_HW_VLAN_STAG_TX)
> >  
> > @@ -224,6 +225,8 @@ static void ifb_setup(struct net_device *dev)
> >  	dev->tx_queue_len = TX_Q_LIMIT;
> >  
> >  	dev->features |= IFB_FEATURES;
> > +	dev->hw_features |= dev->features;
> > +	dev->hw_enc_features |= dev->features;
> >  	dev->vlan_features |= IFB_FEATURES & ~(NETIF_F_HW_VLAN_CTAG_TX |
> >  					       NETIF_F_HW_VLAN_STAG_TX);
> >  
> > 
> 
> 
> BTW, encapsulated GRO traffic going through mirred+ifb is dropped
> because segments get an incorrect skb->mac_len 
> 
> (If TSO/GSO is disabled on ifb, as before the above patch)
> 
> SIT traffic for example : segments get mac_len set to 34 instead of 14
> 
> What do you think of :
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index e561f9f07d6d..bec5c32b2fe9 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3176,7 +3176,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  		__copy_skb_header(nskb, head_skb);
>  
>  		skb_headers_offset_update(nskb, skb_headroom(nskb) - headroom);
> -		skb_reset_mac_len(nskb);
> +		nskb->mac_len = head_skb->mac_len;
>  
>  		skb_copy_from_linear_data_offset(head_skb, -tnl_hlen,
>  						 nskb->data - tnl_hlen,
> 


Hmm... looking at history this went wrong in linux-3.17 [1]

So we probably need to make sure the network header is properly set for
the segments. Then skb_reset_mac_len(nskb); would work as intended.

Since skb_segment() is called from the deepest point in GSO path,
it always see the inner network header.

Sounds like skb_reset_network_header() calls done in inet_gso_segment()
and ipv6_gso_segment() should only be done for the outer header, (when
SKB_GSO_CB(skb)->encap_level == 0), or even better, only done in 
skb_mac_gso_segment()

Then we need to use the proper (inner) network header in
tcp4_gso_segment() and tcp6_gso_segment(), as they currently use
ip_hdr() and ipv6_hdr()

[1] : commit fcdfe3a7fa4cb74391d42b6a26dc07c20dab1d82
Author: Vlad Yasevich <vyasevic@redhat.com>
Date:   Thu Jul 31 10:33:06 2014 -0400

    net: Correctly set segment mac_len in skb_segment().
    
    When performing segmentation, the mac_len value is copied right
    out of the original skb.  However, this value is not always set correctly
    (like when the packet is VLAN-tagged) and we'll end up copying a bad
    value.
    
    One way to demonstrate this is to configure a VM which tags
    packets internally and turn off VLAN acceleration on the forwarding
    bridge port.  The packets show up corrupt like this:
    16:18:24.985548 52:54:00:ab:be:25 > 52:54:00:26:ce:a3, ethertype 802.1Q
    (0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0,
            0x0000:  8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d.
            0x0010:  0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0..
            0x0020:  29e3 01c9 f871 0000 0101 080a 000a e833)....q.........3
            0x0030:  000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp
            0x0040:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
            0x0050:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
            0x0060:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
            ...
    
    This also leads to awful throughput as GSO packets are dropped and
    cause retransmissions.
    
    The solution is to set the mac_len using the values already available
    in then new skb.  We've already adjusted all of the header offset, so we
    might as well correctly figure out the mac_len using skb_reset_mac_len().
    After this change, packets are segmented correctly and performance
    is restored.
    
    CC: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c1a33033cbe2..58ff88edbefd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2976,9 +2976,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
                tail = nskb;
 
                __copy_skb_header(nskb, head_skb);
-               nskb->mac_len = head_skb->mac_len;
 
                skb_headers_offset_update(nskb, skb_headroom(nskb) - headroom);
+               skb_reset_mac_len(nskb);
 
                skb_copy_from_linear_data_offset(head_skb, -tnl_hlen,
                                                 nskb->data - tnl_hlen,

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-08 16:08     ` Eric Dumazet
@ 2016-05-08 17:35       ` Eric Dumazet
  2016-05-09 17:22         ` Alexander Duyck
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2016-05-08 17:35 UTC (permalink / raw)
  To: David Miller; +Cc: Vlad Yasevich, Alexander Duyck, netdev, Tom Herbert

On Sun, 2016-05-08 at 09:08 -0700, Eric Dumazet wrote:


> So we probably need to make sure the network header is properly set for
> the segments. Then skb_reset_mac_len(nskb); would work as intended.
> 
> Since skb_segment() is called from the deepest point in GSO path,
> it always see the inner network header.
> 
> Sounds like skb_reset_network_header() calls done in inet_gso_segment()
> and ipv6_gso_segment() should only be done for the outer header, (when
> SKB_GSO_CB(skb)->encap_level == 0), or even better, only done in 
> skb_mac_gso_segment()
> 
> Then we need to use the proper (inner) network header in
> tcp4_gso_segment() and tcp6_gso_segment(), as they currently use
> ip_hdr() and ipv6_hdr()
> 

Prototype patch works for me (but GRE/UDP offloads might need some
work), and would even save few cycles...

Unfortunately GSO for GRE/UDP is kind of mess.

 net/core/dev.c           |    1 +
 net/ipv4/af_inet.c       |    9 +++------
 net/ipv4/tcp_offload.c   |    2 +-
 net/ipv6/ip6_offload.c   |    9 +++------
 net/ipv6/tcpv6_offload.c |    2 +-
 5 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 5c925ac50b95..3a9035ec862b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2658,6 +2658,7 @@ struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
 		return ERR_PTR(-EINVAL);
 
 	__skb_pull(skb, vlan_depth);
+	skb_reset_network_header(skb);
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ptype, &offload_base, list) {
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 9e481992dbae..fef6335a75bc 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1220,12 +1220,12 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 		       0)))
 		goto out;
 
-	skb_reset_network_header(skb);
-	nhoff = skb_network_header(skb) - skb_mac_header(skb);
+	skb_reset_inner_network_header(skb);
+	nhoff = skb->data - skb_mac_header(skb);
 	if (unlikely(!pskb_may_pull(skb, sizeof(*iph))))
 		goto out;
 
-	iph = ip_hdr(skb);
+	iph = inner_ip_hdr(skb);
 	ihl = iph->ihl * 4;
 	if (ihl < sizeof(*iph))
 		goto out;
@@ -1274,9 +1274,6 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 		}
 		iph->tot_len = htons(skb->len - nhoff);
 		ip_send_check(iph);
-		if (encap)
-			skb_reset_inner_headers(skb);
-		skb->network_header = (u8 *)iph - skb->head;
 	} while ((skb = skb->next));
 
 out:
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 773083b7f1e9..f0650b50680e 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -36,7 +36,7 @@ static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
 		return ERR_PTR(-EINVAL);
 
 	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
-		const struct iphdr *iph = ip_hdr(skb);
+		const struct iphdr *iph = inner_ip_hdr(skb);
 		struct tcphdr *th = tcp_hdr(skb);
 
 		/* Set up checksum pseudo header, usually expect stack to
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 82e9f3076028..8d27299f86e4 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -84,8 +84,8 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 		       0)))
 		goto out;
 
-	skb_reset_network_header(skb);
-	nhoff = skb_network_header(skb) - skb_mac_header(skb);
+	skb_reset_inner_network_header(skb);
+	nhoff = skb->data - skb_mac_header(skb);
 	if (unlikely(!pskb_may_pull(skb, sizeof(*ipv6h))))
 		goto out;
 
@@ -94,7 +94,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 		features &= skb->dev->hw_enc_features;
 	SKB_GSO_CB(skb)->encap_level += sizeof(*ipv6h);
 
-	ipv6h = ipv6_hdr(skb);
+	ipv6h = inner_ipv6_hdr(skb);
 	__skb_pull(skb, sizeof(*ipv6h));
 	segs = ERR_PTR(-EPROTONOSUPPORT);
 
@@ -118,7 +118,6 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 	for (skb = segs; skb; skb = skb->next) {
 		ipv6h = (struct ipv6hdr *)(skb_mac_header(skb) + nhoff);
 		ipv6h->payload_len = htons(skb->len - nhoff - sizeof(*ipv6h));
-		skb->network_header = (u8 *)ipv6h - skb->head;
 
 		if (udpfrag) {
 			unfrag_ip6hlen = ip6_find_1stfragopt(skb, &prevhdr);
@@ -129,8 +128,6 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 			offset += (ntohs(ipv6h->payload_len) -
 				   sizeof(struct frag_hdr));
 		}
-		if (encap)
-			skb_reset_inner_headers(skb);
 	}
 
 out:
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index d883c9204c01..8e747a295bce 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -50,7 +50,7 @@ static struct sk_buff *tcp6_gso_segment(struct sk_buff *skb,
 		return ERR_PTR(-EINVAL);
 
 	if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
-		const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+		const struct ipv6hdr *ipv6h = inner_ipv6_hdr(skb);
 		struct tcphdr *th = tcp_hdr(skb);
 
 		/* Set up pseudo header, usually expect stack to have done

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-07  1:19 ` [PATCH v2 " Eric Dumazet
  2016-05-08  5:41   ` Eric Dumazet
@ 2016-05-09  4:01   ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2016-05-09  4:01 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 May 2016 18:19:59 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
> GRO packets are not properly processed.
> 
> Segmentation should not be forced, since ifb is already adding
> quite a performance hit.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-08 17:35       ` Eric Dumazet
@ 2016-05-09 17:22         ` Alexander Duyck
  2016-05-09 17:39           ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Duyck @ 2016-05-09 17:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Vlad Yasevich, Alexander Duyck, netdev, Tom Herbert

On Sun, May 8, 2016 at 10:35 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sun, 2016-05-08 at 09:08 -0700, Eric Dumazet wrote:
>
>
>> So we probably need to make sure the network header is properly set for
>> the segments. Then skb_reset_mac_len(nskb); would work as intended.
>>
>> Since skb_segment() is called from the deepest point in GSO path,
>> it always see the inner network header.
>>
>> Sounds like skb_reset_network_header() calls done in inet_gso_segment()
>> and ipv6_gso_segment() should only be done for the outer header, (when
>> SKB_GSO_CB(skb)->encap_level == 0), or even better, only done in
>> skb_mac_gso_segment()
>>
>> Then we need to use the proper (inner) network header in
>> tcp4_gso_segment() and tcp6_gso_segment(), as they currently use
>> ip_hdr() and ipv6_hdr()
>>
>
> Prototype patch works for me (but GRE/UDP offloads might need some
> work), and would even save few cycles...
>
> Unfortunately GSO for GRE/UDP is kind of mess.

I agree.  I have been trying to work on cleaning it up but it is
taking a while to get it all sorted out.

>  net/core/dev.c           |    1 +
>  net/ipv4/af_inet.c       |    9 +++------
>  net/ipv4/tcp_offload.c   |    2 +-
>  net/ipv6/ip6_offload.c   |    9 +++------
>  net/ipv6/tcpv6_offload.c |    2 +-
>  5 files changed, 9 insertions(+), 14 deletions(-)
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5c925ac50b95..3a9035ec862b 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2658,6 +2658,7 @@ struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
>                 return ERR_PTR(-EINVAL);
>
>         __skb_pull(skb, vlan_depth);
> +       skb_reset_network_header(skb);
>
>         rcu_read_lock();
>         list_for_each_entry_rcu(ptype, &offload_base, list) {

I'm pretty sure just dropping it in here isn't going to fix much since
this gets called by all the tunnel types that support transparent
Ethernet bridging.

> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 9e481992dbae..fef6335a75bc 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1220,12 +1220,12 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
>                        0)))
>                 goto out;
>
> -       skb_reset_network_header(skb);
> -       nhoff = skb_network_header(skb) - skb_mac_header(skb);
> +       skb_reset_inner_network_header(skb);
> +       nhoff = skb->data - skb_mac_header(skb);
>         if (unlikely(!pskb_may_pull(skb, sizeof(*iph))))
>                 goto out;
>
> -       iph = ip_hdr(skb);
> +       iph = inner_ip_hdr(skb);
>         ihl = iph->ihl * 4;
>         if (ihl < sizeof(*iph))
>                 goto out;

One thought that just occurred to me based on this would be to
configure inner headers on the way up, and to configure the outer
headers on the way down.  Then that way we could go through and be
guaranteed that the inner headers represent the inner most set of
header offsets, and the outer ones represent the outer-most set
regardless of the total number of headers present and there would be
no need to call into the reset_headers function since all the headers
would already be set.

I was also looking at possibly dropping the inner transport offset as
from what I can tell it and the csum_offset should always be the same
value since csum_offset will always point to the inner transport
header when any kind of offload is enabled which is the criteria for
skb->encapsulation being set anyway.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-09 17:22         ` Alexander Duyck
@ 2016-05-09 17:39           ` Eric Dumazet
  2016-05-09 17:41             ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2016-05-09 17:39 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Vlad Yasevich, Alexander Duyck, netdev, Tom Herbert

On Mon, 2016-05-09 at 10:22 -0700, Alexander Duyck wrote:

> One thought that just occurred to me based on this would be to
> configure inner headers on the way up, and to configure the outer
> headers on the way down.  Then that way we could go through and be
> guaranteed that the inner headers represent the inner most set of
> header offsets, and the outer ones represent the outer-most set
> regardless of the total number of headers present and there would be
> no need to call into the reset_headers function since all the headers
> would already be set.
> 
> I was also looking at possibly dropping the inner transport offset as
> from what I can tell it and the csum_offset should always be the same
> value since csum_offset will always point to the inner transport
> header when any kind of offload is enabled which is the criteria for
> skb->encapsulation being set anyway.

Ideally nothing should be changed in the source skb while doing
gso_segment() calls.

As we did in gro_complete() when adding nhoff argument, we probably
could pass the current offset and not touch skb->data and various header
offsets.

Presumably gso state should be stored in a separate structure.

I meant, gso_segment() is not a 'please sanity fix this skb'.

Only DODGY thing might need to ?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 net-next] ifb: support more features
  2016-05-09 17:39           ` Eric Dumazet
@ 2016-05-09 17:41             ` Eric Dumazet
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2016-05-09 17:41 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Vlad Yasevich, Alexander Duyck, netdev, Tom Herbert

On Mon, 2016-05-09 at 10:39 -0700, Eric Dumazet wrote:
> On Mon, 2016-05-09 at 10:22 -0700, Alexander Duyck wrote:
> 
> > One thought that just occurred to me based on this would be to
> > configure inner headers on the way up, and to configure the outer
> > headers on the way down.  Then that way we could go through and be
> > guaranteed that the inner headers represent the inner most set of
> > header offsets, and the outer ones represent the outer-most set
> > regardless of the total number of headers present and there would be
> > no need to call into the reset_headers function since all the headers
> > would already be set.
> > 
> > I was also looking at possibly dropping the inner transport offset as
> > from what I can tell it and the csum_offset should always be the same
> > value since csum_offset will always point to the inner transport
> > header when any kind of offload is enabled which is the criteria for
> > skb->encapsulation being set anyway.
> 
> Ideally nothing should be changed in the source skb while doing
> gso_segment() calls.
> 
> As we did in gro_complete() when adding nhoff argument, we probably
> could pass the current offset and not touch skb->data and various header
> offsets.

Ugly things like skb_gso_error_unwind() would then disappear.

What a mess it is.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-05-09 17:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-06 23:41 [PATCH net-next] ifb: support more features Eric Dumazet
2016-05-07  0:21 ` Eric Dumazet
2016-05-07  1:19 ` [PATCH v2 " Eric Dumazet
2016-05-08  5:41   ` Eric Dumazet
2016-05-08 16:08     ` Eric Dumazet
2016-05-08 17:35       ` Eric Dumazet
2016-05-09 17:22         ` Alexander Duyck
2016-05-09 17:39           ` Eric Dumazet
2016-05-09 17:41             ` Eric Dumazet
2016-05-09  4:01   ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.