All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 0/2] net: ipv6: seg6: headroom fixes
@ 2020-02-04 17:30 Alexander Aring
  2020-02-04 17:30 ` [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero Alexander Aring
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Alexander Aring @ 2020-02-04 17:30 UTC (permalink / raw)
  To: davem
  Cc: kuznet, yoshfuji, kuba, netdev, andrea.mayer, dav.lebrun, mcr,
	stefan, Alexander Aring

Hi netdev,

This patch series fixes issues which I discovered while implementing RPL
source routing for 6LoWPAN interfaces. 6LoWPAN interfaces are using a MTU
of 1280 which is the IPv6 minimum MTU. I suppose this is the right fix to
do that according to my explanation that tunnels which acting before L3
need to set this headroom. So far I see only segmentation route is affected
to it. Maybe BPF tunnels, but it depends on the case... Maybe a comment
need to be added there as well to not getting confused. If wanted I can
send another patch for a comment for net-next or even net? May the
variable should be renamed to l2_headroom?

I splitted this fix in two patches, I only tested the first one and I
think the second one. The second one should fix issues as well.

Open to discuss my patches and if isn't the real fix... how we fix it?

I cc'ed all necessary people and Andrea which seems to have some net-next
patches around who should consider this fix in their new patches.

- Alex

Alexander Aring (2):
  net: ipv6: seg6_iptunnel: set tunnel headroom to zero
  net: ipv6: seg6_local: don't set headroom

 net/ipv6/seg6_iptunnel.c | 2 --
 net/ipv6/seg6_local.c    | 7 -------
 2 files changed, 9 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero
  2020-02-04 17:30 [PATCH net 0/2] net: ipv6: seg6: headroom fixes Alexander Aring
@ 2020-02-04 17:30 ` Alexander Aring
  2020-02-06 12:54   ` David Miller
  2020-02-04 17:30 ` [PATCH net 2/2] net: ipv6: seg6_local: don't set headroom Alexander Aring
  2020-02-05 12:21 ` [PATCH net 0/2] net: ipv6: seg6: headroom fixes Michael Richardson
  2 siblings, 1 reply; 7+ messages in thread
From: Alexander Aring @ 2020-02-04 17:30 UTC (permalink / raw)
  To: davem
  Cc: kuznet, yoshfuji, kuba, netdev, andrea.mayer, dav.lebrun, mcr,
	stefan, Alexander Aring

This patch sets headroom of segmentation route tunnel to zero. The
headroom setting was introduced for mpls in commit 14972cbd34ff
("net: lwtunnel: Handle fragmentation") which sits on layer 2.5. As the
Linux interface MTU value is Layer 3 and don't consider anything before
that it is misleading to set the headroom value to anything than 0.

Example setup to trigger this issue:

ip netns add foo
ip link add veth0 type veth peer name veth1
ip link set veth1 netns foo
ip link set mtu 1280 dev veth0

ip link set veth0 up
ip -n foo link set veth1 up

ip addr add beef::1/64 dev veth0
ip -6 route add beef::3 encap seg6 mode encap segs beef::2 dev veth0

then do a:

ping beef::3

You the sendmsg() will return -EINVAL because the packet doesn't fit
into the IPv6 minimum MTU anymore. It was consider the headroom value
in their destination mtu which substracts whatever headroom is from
the interface MTU 1280.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
---
 net/ipv6/seg6_iptunnel.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
index ab7f124ff5d7..5b6e88f16e2d 100644
--- a/net/ipv6/seg6_iptunnel.c
+++ b/net/ipv6/seg6_iptunnel.c
@@ -449,8 +449,6 @@ static int seg6_build_state(struct nlattr *nla,
 	if (tuninfo->mode != SEG6_IPTUN_MODE_L2ENCAP)
 		newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT;
 
-	newts->headroom = seg6_lwt_headroom(tuninfo);
-
 	*ts = newts;
 
 	return 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 2/2] net: ipv6: seg6_local: don't set headroom
  2020-02-04 17:30 [PATCH net 0/2] net: ipv6: seg6: headroom fixes Alexander Aring
  2020-02-04 17:30 ` [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero Alexander Aring
@ 2020-02-04 17:30 ` Alexander Aring
  2020-02-05 12:21 ` [PATCH net 0/2] net: ipv6: seg6: headroom fixes Michael Richardson
  2 siblings, 0 replies; 7+ messages in thread
From: Alexander Aring @ 2020-02-04 17:30 UTC (permalink / raw)
  To: davem
  Cc: kuznet, yoshfuji, kuba, netdev, andrea.mayer, dav.lebrun, mcr,
	stefan, Alexander Aring

The headroom value only need to be set on tunnels before Layer 3 like
mpls. Remove it from seg6_local because it's in Layer 3.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
---
 net/ipv6/seg6_local.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 7cbc19731997..3b77184edf2b 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -37,7 +37,6 @@ struct seg6_action_desc {
 	int action;
 	unsigned long attrs;
 	int (*input)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
-	int static_headroom;
 };
 
 struct bpf_lwt_prog {
@@ -55,7 +54,6 @@ struct seg6_local_lwt {
 	int oif;
 	struct bpf_lwt_prog bpf;
 
-	int headroom;
 	struct seg6_action_desc *desc;
 };
 
@@ -603,7 +601,6 @@ static struct seg6_action_desc seg6_action_table[] = {
 		.action		= SEG6_LOCAL_ACTION_END_B6_ENCAP,
 		.attrs		= (1 << SEG6_LOCAL_SRH),
 		.input		= input_action_end_b6_encap,
-		.static_headroom	= sizeof(struct ipv6hdr),
 	},
 	{
 		.action		= SEG6_LOCAL_ACTION_END_BPF,
@@ -677,8 +674,6 @@ static int parse_nla_srh(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 	if (!slwt->srh)
 		return -ENOMEM;
 
-	slwt->headroom += len;
-
 	return 0;
 }
 
@@ -952,7 +947,6 @@ static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 		return -EOPNOTSUPP;
 
 	slwt->desc = desc;
-	slwt->headroom += desc->static_headroom;
 
 	for (i = 0; i < SEG6_LOCAL_MAX + 1; i++) {
 		if (desc->attrs & (1 << i)) {
@@ -1004,7 +998,6 @@ static int seg6_local_build_state(struct nlattr *nla, unsigned int family,
 
 	newts->type = LWTUNNEL_ENCAP_SEG6_LOCAL;
 	newts->flags = LWTUNNEL_STATE_INPUT_REDIRECT;
-	newts->headroom = slwt->headroom;
 
 	*ts = newts;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/2] net: ipv6: seg6: headroom fixes
  2020-02-04 17:30 [PATCH net 0/2] net: ipv6: seg6: headroom fixes Alexander Aring
  2020-02-04 17:30 ` [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero Alexander Aring
  2020-02-04 17:30 ` [PATCH net 2/2] net: ipv6: seg6_local: don't set headroom Alexander Aring
@ 2020-02-05 12:21 ` Michael Richardson
  2020-02-06 18:22   ` Alexander Aring
  2 siblings, 1 reply; 7+ messages in thread
From: Michael Richardson @ 2020-02-05 12:21 UTC (permalink / raw)
  To: Alexander Aring; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 1771 bytes --]


Alexander Aring <alex.aring@gmail.com> wrote:
    > This patch series fixes issues which I discovered while implementing RPL
    > source routing for 6LoWPAN interfaces. 6LoWPAN interfaces are using a MTU
    > of 1280 which is the IPv6 minimum MTU. I suppose this is the right fix to
    > do that according to my explanation that tunnels which acting before L3
    > need to set this headroom. So far I see only segmentation route is affected
    > to it. Maybe BPF tunnels, but it depends on the case... Maybe a comment
    > need to be added there as well to not getting confused. If wanted I can
    > send another patch for a comment for net-next or even net? May the
    > variable should be renamed to l2_headroom?

I had discussed this with Alex over the past few days.
I had not looked closely at the code during that discussion, and maybe my
comments in chat were wrong.  So these patches don't look right to me.

I think that the issue we have here is that things are big vague when it
comes to layer-2.5's, and fatter layer-3s.  Maybe this is well established in
lore...

My understanding is that headroom is a general offset, usually set by the L2
which tells the L3/L4 how much to offset in the SKB before the ULP header is
inserted.   TCP/UDP/SCTP/ESP need to know this.

MPLS is a layer-2.5, and so it quite weird, because it creates a new L2
which lives upon other L2 and also other L3s.

Segment routing, and RPL RH3 headers involve a fatter L3 header.

Of course, one could mix all of these things together!

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        | network architect  [
]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero
  2020-02-04 17:30 ` [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero Alexander Aring
@ 2020-02-06 12:54   ` David Miller
  2020-02-08 17:34     ` Alexander Aring
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2020-02-06 12:54 UTC (permalink / raw)
  To: alex.aring
  Cc: kuznet, yoshfuji, kuba, netdev, andrea.mayer, dav.lebrun, mcr, stefan

From: Alexander Aring <alex.aring@gmail.com>
Date: Tue,  4 Feb 2020 12:30:18 -0500

> diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
> index ab7f124ff5d7..5b6e88f16e2d 100644
> --- a/net/ipv6/seg6_iptunnel.c
> +++ b/net/ipv6/seg6_iptunnel.c
> @@ -449,8 +449,6 @@ static int seg6_build_state(struct nlattr *nla,
>  	if (tuninfo->mode != SEG6_IPTUN_MODE_L2ENCAP)
>  		newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT;
>  
> -	newts->headroom = seg6_lwt_headroom(tuninfo);
> -
>  	*ts = newts;
>  
>  	return 0;

Even if this change is correct, you are eliminating the one and only
user of seg6_lwt_headroom() so you would have to kill that in this
patch as well.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/2] net: ipv6: seg6: headroom fixes
  2020-02-05 12:21 ` [PATCH net 0/2] net: ipv6: seg6: headroom fixes Michael Richardson
@ 2020-02-06 18:22   ` Alexander Aring
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Aring @ 2020-02-06 18:22 UTC (permalink / raw)
  To: Michael Richardson; +Cc: netdev, roopa, dav.lebrun

Hi,

Cc: Roopa Prabhu <roopa@cumulusnetworks.com>

as she introduced the mentioned patch about the headroom value in
lwtunnel.

On Wed, Feb 05, 2020 at 01:21:37PM +0100, Michael Richardson wrote:
> 
> Alexander Aring <alex.aring@gmail.com> wrote:
>     > This patch series fixes issues which I discovered while implementing RPL
>     > source routing for 6LoWPAN interfaces. 6LoWPAN interfaces are using a MTU
>     > of 1280 which is the IPv6 minimum MTU. I suppose this is the right fix to
>     > do that according to my explanation that tunnels which acting before L3
>     > need to set this headroom. So far I see only segmentation route is affected
>     > to it. Maybe BPF tunnels, but it depends on the case... Maybe a comment
>     > need to be added there as well to not getting confused. If wanted I can
>     > send another patch for a comment for net-next or even net? May the
>     > variable should be renamed to l2_headroom?
> 
> I had discussed this with Alex over the past few days.
> I had not looked closely at the code during that discussion, and maybe my
> comments in chat were wrong.  So these patches don't look right to me.
> 
> I think that the issue we have here is that things are big vague when it
> comes to layer-2.5's, and fatter layer-3s.  Maybe this is well established in
> lore...
> 

Yes, we know:

MPLS: Before Layer 3
SEG6: In Layer 3

To also keep in mind that the Linux interface MTU is a Layer 3 MTU.

Now what I think is going on here is that the ip6_route.c implementation
keeps track of "per destination MTU" [0] some lines below we will hit
the -EINVAL case [1] of sendmsg() because it doesn't fit into IPv6 min MTU
anymore (May EMSGSIZE) would be more correct here? Man pages says:

If  the  message  is too long to pass atomically through the underlying
protocol, the error EMSGSIZE is returned, and the message is not transmitted.

The whole _interface_ mtu per destination will be calculated during
runtime by a callback in ip_route.c [2]. The patch I mentioned in 1/2 by
commit 14972cbd34ff ("net: lwtunnel: Handle fragmentation") changed the
line so that the tunnel headroom value get respected for the tunnel
destination whatever you specified for:

return mtu - lwtunnel_headroom(dst->lwtstate, mtu);

As I see this MTU is the same considered as the interface MTU which is
Layer 3. It is correct to respected the size for tunnels before Layer 3
but not in Layer 3 or afterwards.

> My understanding is that headroom is a general offset, usually set by the L2
> which tells the L3/L4 how much to offset in the SKB before the ULP header is
> inserted.   TCP/UDP/SCTP/ESP need to know this.
> 

I am not sure if I understand that correctly. In receive path the skb is
handeld to the next layer to layer and even set some offset like
mac_header, network_header and transport_header. Each tunnel can/must?
need to set these offsets again after parsing.

In transmit, the skb will be framented at the lower layer if necessary.

So _far_ I see the headroom value takes only place in [2] but this gets
somehow hidden in some callback. That it is in route.c smells for me
it's only used in this context.

I could be wrong here because this callback is hard to track. But the
check of [1] and getting this value by [0] which invokes the callback of
[2] it is wrong to calculate the MTU by doing:

$INTERFACE_L3_MTU - $ROUTE_HEADER_IN_L3

but it is correct by doing (which is MPLS):

$INTERFACE_L3_MTU - $ROUTE_HEADER_BEFORE_L3

I am sure this is wrong, may there could be more additional side
effects which I have not tracked yet and to fix that the original
authors should mention why they set the headroom value.

> MPLS is a layer-2.5, and so it quite weird, because it creates a new L2
> which lives upon other L2 and also other L3s.
> 
> Segment routing, and RPL RH3 headers involve a fatter L3 header.
> 
> Of course, one could mix all of these things together!
>

As I see the headroom value for tunnels in Layer 3 should always be
zero. It's getting complicated when you have a tunnel before Layer 3 and
the length is determined during runtime... the whole code can only
handle "for this route this length" and the length need to be set at
build time. I don't see issues here for RPL RH3 (I think) except we need
to set the headroom value where in our case the length of it is really
determined during runtime because destination address.

- Alex

[0] https://elixir.bootlin.com/linux/v5.5.1/source/net/ipv6/ip6_output.c#L1285
[1] https://elixir.bootlin.com/linux/v5.5.1/source/net/ipv6/ip6_output.c#L1295
[2] https://elixir.bootlin.com/linux/v5.5.1/source/net/ipv6/route.c#L3063

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero
  2020-02-06 12:54   ` David Miller
@ 2020-02-08 17:34     ` Alexander Aring
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Aring @ 2020-02-08 17:34 UTC (permalink / raw)
  To: David Miller
  Cc: kuznet, yoshfuji, kuba, netdev, andrea.mayer, dav.lebrun, mcr, stefan

Hi,

On Thu, Feb 06, 2020 at 01:54:18PM +0100, David Miller wrote:
> From: Alexander Aring <alex.aring@gmail.com>
> Date: Tue,  4 Feb 2020 12:30:18 -0500
> 
> > diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
> > index ab7f124ff5d7..5b6e88f16e2d 100644
> > --- a/net/ipv6/seg6_iptunnel.c
> > +++ b/net/ipv6/seg6_iptunnel.c
> > @@ -449,8 +449,6 @@ static int seg6_build_state(struct nlattr *nla,
> >  	if (tuninfo->mode != SEG6_IPTUN_MODE_L2ENCAP)
> >  		newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT;
> >  
> > -	newts->headroom = seg6_lwt_headroom(tuninfo);
> > -
> >  	*ts = newts;
> >  
> >  	return 0;
> 
> Even if this change is correct, you are eliminating the one and only
> user of seg6_lwt_headroom() so you would have to kill that in this
> patch as well.

Okay, this is in include/uapi but surrounding by __KERNEL__ so I guess
it's still okay to remove it?

btw: why it is not static in seg6_iptunnel.c then?

Anyway I will wait until I hear something back what the use of headroom
exactly is and why the original authors of segmentation routing sets it.
In my case it will simple not work with a IPv6 min mtu so I will set it
to zero for possible net-next patches.

- Alex

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-02-08 17:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-04 17:30 [PATCH net 0/2] net: ipv6: seg6: headroom fixes Alexander Aring
2020-02-04 17:30 ` [PATCH net 1/2] net: ipv6: seg6_iptunnel: set tunnel headroom to zero Alexander Aring
2020-02-06 12:54   ` David Miller
2020-02-08 17:34     ` Alexander Aring
2020-02-04 17:30 ` [PATCH net 2/2] net: ipv6: seg6_local: don't set headroom Alexander Aring
2020-02-05 12:21 ` [PATCH net 0/2] net: ipv6: seg6: headroom fixes Michael Richardson
2020-02-06 18:22   ` Alexander Aring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.