netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
@ 2020-11-03 10:41 Oliver Herms
  2020-11-03 18:42 ` Willem de Bruijn
  2020-11-09 23:16 ` Jakub Kicinski
  0 siblings, 2 replies; 7+ messages in thread
From: Oliver Herms @ 2020-11-03 10:41 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuznet, yoshfuji, kuba

Due to the legacy usage of hard_header_len for SIT tunnels while
already using infrastructure from net/ipv4/ip_tunnel.c the
calculation of the path MTU in tnl_update_pmtu is incorrect.
This leads to unnecessary creation of MTU exceptions for any
flow going over a SIT tunnel.

As SIT tunnels do not have a header themsevles other than their
transport (L3, L2) headers we're leaving hard_header_len set to zero
as tnl_update_pmtu is already taking care of the transport headers
sizes.

This will also help avoiding unnecessary IPv6 GC runs and spinlock
contention seen when using SIT tunnels and for more than
net.ipv6.route.gc_thresh flows.

Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com>
---
 net/ipv6/sit.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 5e2c34c0ac97..5e7983cb6154 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1128,7 +1128,6 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
 	if (tdev && !netif_is_l3_master(tdev)) {
 		int t_hlen = tunnel->hlen + sizeof(struct iphdr);
 
-		dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
 		dev->mtu = tdev->mtu - t_hlen;
 		if (dev->mtu < IPV6_MIN_MTU)
 			dev->mtu = IPV6_MIN_MTU;
@@ -1426,7 +1425,6 @@ static void ipip6_tunnel_setup(struct net_device *dev)
 	dev->priv_destructor	= ipip6_dev_free;
 
 	dev->type		= ARPHRD_SIT;
-	dev->hard_header_len	= LL_MAX_HEADER + t_hlen;
 	dev->mtu		= ETH_DATA_LEN - t_hlen;
 	dev->min_mtu		= IPV6_MIN_MTU;
 	dev->max_mtu		= IP6_MAX_MTU - t_hlen;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
  2020-11-03 10:41 [PATCH] IPv6: Set SIT tunnel hard_header_len to zero Oliver Herms
@ 2020-11-03 18:42 ` Willem de Bruijn
  2020-11-04 19:30   ` Oliver Herms
  2020-11-09 23:16 ` Jakub Kicinski
  1 sibling, 1 reply; 7+ messages in thread
From: Willem de Bruijn @ 2020-11-03 18:42 UTC (permalink / raw)
  To: Oliver Herms
  Cc: Network Development, David Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski

On Tue, Nov 3, 2020 at 5:41 AM Oliver Herms
<oliver.peter.herms@gmail.com> wrote:
>
> Due to the legacy usage of hard_header_len for SIT tunnels while
> already using infrastructure from net/ipv4/ip_tunnel.c the
> calculation of the path MTU in tnl_update_pmtu is incorrect.
> This leads to unnecessary creation of MTU exceptions for any
> flow going over a SIT tunnel.
>
> As SIT tunnels do not have a header themsevles other than their
> transport (L3, L2) headers we're leaving hard_header_len set to zero
> as tnl_update_pmtu is already taking care of the transport headers
> sizes.
>
> This will also help avoiding unnecessary IPv6 GC runs and spinlock
> contention seen when using SIT tunnels and for more than
> net.ipv6.route.gc_thresh flows.

Thanks. Yes, this is long overdue.

The hard_header_len issue was also recently discussed in the context
of GRE in commit fdafed459998 ("ip_gre: set dev->hard_header_len and
dev->needed_headroom properly").

Question is whether we should reserve room in needed_headroom instead.
AFAIK this existing update logic in ip6_tnl_xmit is sufficient

"
        /* Calculate max headroom for all the headers and adjust
         * needed_headroom if necessary.
         */
        max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr)
                        + dst->header_len + t->hlen;
        if (max_headroom > dev->needed_headroom)
                dev->needed_headroom = max_headroom;
"

> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")

How did you arrive at this SHA1?

> Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com>
> ---
>  net/ipv6/sit.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
> index 5e2c34c0ac97..5e7983cb6154 100644
> --- a/net/ipv6/sit.c
> +++ b/net/ipv6/sit.c
> @@ -1128,7 +1128,6 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
>         if (tdev && !netif_is_l3_master(tdev)) {
>                 int t_hlen = tunnel->hlen + sizeof(struct iphdr);
>
> -               dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
>                 dev->mtu = tdev->mtu - t_hlen;
>                 if (dev->mtu < IPV6_MIN_MTU)
>                         dev->mtu = IPV6_MIN_MTU;
> @@ -1426,7 +1425,6 @@ static void ipip6_tunnel_setup(struct net_device *dev)
>         dev->priv_destructor    = ipip6_dev_free;
>
>         dev->type               = ARPHRD_SIT;
> -       dev->hard_header_len    = LL_MAX_HEADER + t_hlen;
>         dev->mtu                = ETH_DATA_LEN - t_hlen;
>         dev->min_mtu            = IPV6_MIN_MTU;
>         dev->max_mtu            = IP6_MAX_MTU - t_hlen;
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
  2020-11-03 18:42 ` Willem de Bruijn
@ 2020-11-04 19:30   ` Oliver Herms
  2020-11-04 19:52     ` Willem de Bruijn
  0 siblings, 1 reply; 7+ messages in thread
From: Oliver Herms @ 2020-11-04 19:30 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski

On 03.11.20 19:42, Willem de Bruijn wrote:
> Thanks. Yes, this is long overdue.
> 
> The hard_header_len issue was also recently discussed in the context
> of GRE in commit fdafed459998 ("ip_gre: set dev->hard_header_len and
> dev->needed_headroom properly").
> 
> Question is whether we should reserve room in needed_headroom instead.
> AFAIK this existing update logic in ip6_tnl_xmit is sufficient
> 
> "
>         /* Calculate max headroom for all the headers and adjust
>          * needed_headroom if necessary.
>          */
>         max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr)
>                         + dst->header_len + t->hlen;
>         if (max_headroom > dev->needed_headroom)
>                 dev->needed_headroom = max_headroom;
> "I think that's enough. At least it definitely works in my test and prod environment.
Would be good to get another opinion on this though.

>> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> 
> How did you arrive at this SHA1?
I think the legacy usage of hard_header_len in ipv6/sit.c was overseen in c54419321455.
Please correct me if I'm wrong.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
  2020-11-04 19:30   ` Oliver Herms
@ 2020-11-04 19:52     ` Willem de Bruijn
  2020-11-09  9:05       ` Oliver Herms
  0 siblings, 1 reply; 7+ messages in thread
From: Willem de Bruijn @ 2020-11-04 19:52 UTC (permalink / raw)
  To: Oliver Herms
  Cc: Network Development, David Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski

On Wed, Nov 4, 2020 at 2:30 PM Oliver Herms
<oliver.peter.herms@gmail.com> wrote:
>
> On 03.11.20 19:42, Willem de Bruijn wrote:
> > Thanks. Yes, this is long overdue.
> >
> > The hard_header_len issue was also recently discussed in the context
> > of GRE in commit fdafed459998 ("ip_gre: set dev->hard_header_len and
> > dev->needed_headroom properly").
> >
> > Question is whether we should reserve room in needed_headroom instead.
> > AFAIK this existing update logic in ip6_tnl_xmit is sufficient
> >
> > "
> >         /* Calculate max headroom for all the headers and adjust
> >          * needed_headroom if necessary.
> >          */
> >         max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr)
> >                         + dst->header_len + t->hlen;
> >         if (max_headroom > dev->needed_headroom)
> >                 dev->needed_headroom = max_headroom;
> > "I think that's enough. At least it definitely works in my test and prod environment.
> Would be good to get another opinion on this though.

Sit should behave the same as other tunnels. At least in ip6_tunnel.c
and ip6_gre.c I do not see explicit initialization of needed_headroom.

> >> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> >
> > How did you arrive at this SHA1?
> I think the legacy usage of hard_header_len in ipv6/sit.c was overseen in c54419321455.
> Please correct me if I'm wrong.

I don't see anything in that patch assign or modify hard_header_len.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
  2020-11-04 19:52     ` Willem de Bruijn
@ 2020-11-09  9:05       ` Oliver Herms
  2020-11-09 15:10         ` Willem de Bruijn
  0 siblings, 1 reply; 7+ messages in thread
From: Oliver Herms @ 2020-11-09  9:05 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski


On 04.11.20 20:52, Willem de Bruijn wrote:
>>>> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
>>>
>>> How did you arrive at this SHA1?
>> I think the legacy usage of hard_header_len in ipv6/sit.c was overseen in c54419321455.
>> Please correct me if I'm wrong.
> 
> I don't see anything in that patch assign or modify hard_header_len.
> 
It's not assigning or modifying it but changing expectations about how dev->hard_header_len is to be used.

The patch changed the MTU calculation from:
mtu = dst_mtu(&rt->dst) - dev->hard_header_len - tunnel->hlen;

to this:
mtu = dst_mtu(&rt->dst) - dev->hard_header_len - sizeof(struct iphdr);

Later is became this (in patch 23a3647. This is the current implementation.):
mtu = dst_mtu(&rt->dst) - dev->hard_header_len - sizeof(struct iphdr) - tunnel_hlen;

Apparently the initial usage of dev->hard_header_len was that it contains the length
of all headers before the tunnel payload. c54419321455 changed it to assuming dev->hard_header_len 
does not contain the tunnels outter IP header. Thus I think the bug was introduced by c54419321455.

Kind Regards
Oliver

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
  2020-11-09  9:05       ` Oliver Herms
@ 2020-11-09 15:10         ` Willem de Bruijn
  0 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2020-11-09 15:10 UTC (permalink / raw)
  To: Oliver Herms
  Cc: Network Development, David Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski

On Mon, Nov 9, 2020 at 4:05 AM Oliver Herms
<oliver.peter.herms@gmail.com> wrote:
>
>
> On 04.11.20 20:52, Willem de Bruijn wrote:
> >>>> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> >>>
> >>> How did you arrive at this SHA1?
> >> I think the legacy usage of hard_header_len in ipv6/sit.c was overseen in c54419321455.
> >> Please correct me if I'm wrong.
> >
> > I don't see anything in that patch assign or modify hard_header_len.
> >
> It's not assigning or modifying it but changing expectations about how dev->hard_header_len is to be used.
>
> The patch changed the MTU calculation from:
> mtu = dst_mtu(&rt->dst) - dev->hard_header_len - tunnel->hlen;
>
> to this:
> mtu = dst_mtu(&rt->dst) - dev->hard_header_len - sizeof(struct iphdr);
>
> Later is became this (in patch 23a3647. This is the current implementation.):
> mtu = dst_mtu(&rt->dst) - dev->hard_header_len - sizeof(struct iphdr) - tunnel_hlen;
>
> Apparently the initial usage of dev->hard_header_len was that it contains the length
> of all headers before the tunnel payload. c54419321455 changed it to assuming dev->hard_header_len
> does not contain the tunnels outter IP header. Thus I think the bug was introduced by c54419321455.

And the only header in the case of SIT is that outer ip header. Got it, thanks.

Overly conservative MTU calculation is one issue. Packet sockets also
expect read/write link layer access with SOCK_RAW, which does not work
correctly for sit. I'm not sure that it ever did.

The chosen commit predates all stable trees, which is the most important point.

Acked-by: Willem de Bruijn <willemb@google.com>

Could ip6 tunnels have the same issue? In ip6_tnl_dev_init_gen,

        dev->hard_header_len = LL_MAX_HEADER + t_hlen;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] IPv6: Set SIT tunnel hard_header_len to zero
  2020-11-03 10:41 [PATCH] IPv6: Set SIT tunnel hard_header_len to zero Oliver Herms
  2020-11-03 18:42 ` Willem de Bruijn
@ 2020-11-09 23:16 ` Jakub Kicinski
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-11-09 23:16 UTC (permalink / raw)
  To: Oliver Herms; +Cc: netdev, davem, kuznet, yoshfuji

On Tue, 3 Nov 2020 11:41:33 +0100 Oliver Herms wrote:
> Due to the legacy usage of hard_header_len for SIT tunnels while
> already using infrastructure from net/ipv4/ip_tunnel.c the
> calculation of the path MTU in tnl_update_pmtu is incorrect.
> This leads to unnecessary creation of MTU exceptions for any
> flow going over a SIT tunnel.
> 
> As SIT tunnels do not have a header themsevles other than their
> transport (L3, L2) headers we're leaving hard_header_len set to zero
> as tnl_update_pmtu is already taking care of the transport headers
> sizes.
> 
> This will also help avoiding unnecessary IPv6 GC runs and spinlock
> contention seen when using SIT tunnels and for more than
> net.ipv6.route.gc_thresh flows.
> 
> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> Signed-off-by: Oliver Herms <oliver.peter.herms@gmail.com>

Applied, thanks! Let's carry on the discussion about ipv6 tunnels,
though, it does look questionable.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-09 23:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03 10:41 [PATCH] IPv6: Set SIT tunnel hard_header_len to zero Oliver Herms
2020-11-03 18:42 ` Willem de Bruijn
2020-11-04 19:30   ` Oliver Herms
2020-11-04 19:52     ` Willem de Bruijn
2020-11-09  9:05       ` Oliver Herms
2020-11-09 15:10         ` Willem de Bruijn
2020-11-09 23:16 ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).