From: Ben Hutchings <bhutchings@solarflare.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>, Jerry Chu <hkchu@google.com>,
	Herbert Xu <herbert@gondor.apana.org.au>
Subject: Re: [PATCH net-next] gro: should aggregate frames without DF
Date: Fri, 31 May 2013 21:09:04 +0100
Message-ID: <1370030944.2703.17.camel@bwh-desktop.uk.level5networks.com>
In-Reply-To: <1370019752.5109.108.camel@edumazet-glaptop>

On Fri, 2013-05-31 at 10:02 -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> GRO on IPv4 doesn't aggregate frames if they don't have DF bit set.
> 
> Some servers use IP_MTU_DISCOVER/IP_PMTUDISC_PROBE, so linux receivers
> are unable to aggregate this kind of traffic.
> 
> The right thing to do is to allow aggregation as long as the DF bit has
> same value on all segments.
> 
> bnx2x LRO does this correctly.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jerry Chu <hkchu@google.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> ---
>  net/ipv4/af_inet.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index b05ae96..328cc62 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1384,9 +1384,9 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
>  	if (unlikely(ip_fast_csum((u8 *)iph, 5)))
>  		goto out_unlock;
>  
> -	id = ntohl(*(__be32 *)&iph->id);
> -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
> -	id >>= 16;
> +	flush = ntohs(iph->tot_len) ^ skb_gro_len(skb);
> +
> +	id = ntohs(iph->id);
>  
>  	for (p = *head; p; p = p->next) {
>  		struct iphdr *iph2;
> @@ -1407,6 +1407,7 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
>  		NAPI_GRO_CB(p)->flush |=
>  			(iph->ttl ^ iph2->ttl) |
>  			(iph->tos ^ iph2->tos) |
> +			((iph->frag_off ^ iph2->frag_off) & htons(IP_DF)) |

But this results in ignoring the actual offset bits of frag_off!
We should allow merging only if all packets have frag_off == IP_DF or
all have frag_off == 0.  The first assignment of flush therefore still
needs to check the combined id/frag_off word, but using (id & ~IP_DF)
instead of (id ^ IP_DF).

Ben.
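
A minimal user-space sketch of the check suggested above, not the kernel code itself. It assumes the combined id/frag_off word has already been converted to host byte order (as ntohl(*(__be32 *)&iph->id) would give), and that frag_off values passed to the second helper are in host order too. The helper names frag_flush() and df_mismatch() are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

#define IP_DF 0x4000u /* "don't fragment" flag, host byte order */

/*
 * Hypothetical helper. id_word holds the IP id in its upper 16 bits
 * and frag_off in its lower 16 bits, in host byte order.
 * Masking out DF leaves the MF flag and the 13 fragment-offset bits;
 * the cast to 16 bits discards the id. The result is non-zero
 * (meaning "flush, do not merge") for any real fragment, and zero
 * when frag_off is exactly IP_DF or exactly 0.
 */
static uint16_t frag_flush(uint32_t id_word)
{
	return (uint16_t)(id_word & ~IP_DF);
}

/*
 * Hypothetical helper mirroring the per-flow comparison in the quoted
 * hunk: non-zero when the DF bit differs between two segments.
 * Both arguments are assumed to be in host byte order here.
 */
static uint16_t df_mismatch(uint16_t frag_off_a, uint16_t frag_off_b)
{
	return (uint16_t)((frag_off_a ^ frag_off_b) & IP_DF);
}
```

With both checks combined, a segment is only a merge candidate when frag_flush() is zero for it and df_mismatch() is zero against the held packet, so a flow is either all frag_off == IP_DF or all frag_off == 0.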

>  			((u16)(ntohs(iph2->id) + NAPI_GRO_CB(p)->count) ^ id);
>  
>  		NAPI_GRO_CB(p)->flush |= flush;

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

Thread overview: 7+ messages
2013-05-31 17:02 [PATCH net-next] gro: should aggregate frames without DF Eric Dumazet
2013-05-31 18:14 ` Jerry Chu
2013-05-31 20:09 ` Ben Hutchings [this message]
2013-05-31 21:12   ` Eric Dumazet
2013-05-31 21:18   ` [PATCH v2 " Eric Dumazet
2013-05-31 21:27     ` Ben Hutchings
2013-06-01  0:15     ` David Miller
