All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
@ 2021-06-25 13:32 Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 1/6] bareudp: allow redirecting bareudp packets to eth devices Guillaume Nault
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:32 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Hideaki YOSHIFUJI, David Ahern, Simon Horman,
	Martin Varghese, Eli Cohen, Jiri Benc, Tom Herbert,
	Pablo Neira Ayuso, Harald Welte, Andreas Schultz, Jonas Bonn

Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
reset the MAC header pointer after they parsed the outer headers. This
accurately reflects the fact that the decapsulated packet is pure L3
packet, as that makes the MAC header 0 bytes long (the MAC and network
header pointers are equal).

However, many L3 devices only adjust the network header after
decapsulation and leave the MAC header pointer to its original value.
This can confuse other parts of the networking stack, like TC, which
then considers the outer headers as one big MAC header.

This patch series makes the following L3 tunnels behave like VXLAN-GPE:
bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.

The case of gre is a bit special. It already resets the MAC header
pointer in collect_md mode, so only the classical mode needs to be
adjusted. However, gre also has a special case that expects the MAC
header pointer to keep pointing to the outer header even after
decapsulation. Therefore, patch 4 keeps an exception for this case.

Ideally, we'd centralise the call to skb_reset_mac_header() in
ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
sit (patch 3) and gre (patch 4). That's unfortunately not feasible
currently, because of the gre special case discussed above that
precludes us from resetting the MAC header unconditionally.

The original motivation is to redirect bareudp packets to Ethernet
devices (as described in patch 1). The rest of this series aims at
bringing consistency across all L3 devices (apart from gre's special
case unfortunately).

Note: the gtp patch results from pure code inspection and has been
compiled tested only.


Guillaume Nault (6):
  bareudp: allow redirecting bareudp packets to eth devices
  ipip: allow redirecting ipip and mplsip packets to eth devices
  sit: allow redirecting ip6ip, ipip and mplsip packets to eth devices
  gre: let mac_header point to outer header only when necessary
  ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devices
  gtp: reset mac_header after decap

 drivers/net/bareudp.c | 1 +
 drivers/net/gtp.c     | 1 +
 net/ipv4/ip_gre.c     | 7 ++++++-
 net/ipv4/ipip.c       | 2 ++
 net/ipv6/ip6_tunnel.c | 1 +
 net/ipv6/sit.c        | 4 ++++
 6 files changed, 15 insertions(+), 1 deletion(-)

-- 
2.21.3


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next 1/6] bareudp: allow redirecting bareudp packets to eth devices
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
@ 2021-06-25 13:33 ` Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 2/6] ipip: allow redirecting ipip and mplsip " Guillaume Nault
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:33 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski; +Cc: netdev, Martin Varghese, Eli Cohen

Even though bareudp transports L3 data (typically IP or MPLS), it needs
to reset the mac_header pointer, so that other parts of the stack don't
mistakenly access the outer header after the packet has been
decapsulated.

This allows to push an Ethernet header to bareudp packets and redirect
them to an Ethernet device:

  $ tc filter add dev bareudp0 ingress matchall      \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0

Without this patch, push_eth refuses to add an ethernet header because
the skb appears to already have a MAC header.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 drivers/net/bareudp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index edfad93e7b68..a7ee0af1af90 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -133,6 +133,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
 	skb->dev = bareudp->dev;
 	oiph = skb_network_header(skb);
 	skb_reset_network_header(skb);
+	skb_reset_mac_header(skb);
 
 	if (!IS_ENABLED(CONFIG_IPV6) || family == AF_INET)
 		err = IP_ECN_decapsulate(oiph, skb);
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 2/6] ipip: allow redirecting ipip and mplsip packets to eth devices
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 1/6] bareudp: allow redirecting bareudp packets to eth devices Guillaume Nault
@ 2021-06-25 13:33 ` Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 3/6] sit: allow redirecting ip6ip, " Guillaume Nault
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:33 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Hideaki YOSHIFUJI, David Ahern, Simon Horman

Even though ipip transports IPv4 or MPLS packets, it needs to reset the
mac_header pointer, so that other parts of the stack don't mistakenly
access the outer header after the packet has been decapsulated.

This allows to push an Ethernet header to ipip or mplsip packets and
redirect them to an Ethernet device:

  $ tc filter add dev ipip0 ingress matchall         \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0

Without this patch, push_eth refuses to add an ethernet header because
the skb appears to already have a MAC header.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv4/ipip.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index d5bfa087c23a..266c65577ba6 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -242,6 +242,8 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
 			if (!tun_dst)
 				return 0;
 		}
+		skb_reset_mac_header(skb);
+
 		return ip_tunnel_rcv(tunnel, skb, tpi, tun_dst, log_ecn_error);
 	}
 
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 3/6] sit: allow redirecting ip6ip, ipip and mplsip packets to eth devices
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 1/6] bareudp: allow redirecting bareudp packets to eth devices Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 2/6] ipip: allow redirecting ipip and mplsip " Guillaume Nault
@ 2021-06-25 13:33 ` Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 4/6] gre: let mac_header point to outer header only when necessary Guillaume Nault
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:33 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Hideaki YOSHIFUJI, David Ahern, Simon Horman

Even though sit transports L3 data (IPv6, IPv4 or MPLS) packets, it
needs to reset the mac_header pointer, so that other parts of the stack
don't mistakenly access the outer header after the packet has been
decapsulated. There are two rx handlers to modify: ipip6_rcv() for the
ip6ip mode and sit_tunnel_rcv() which is used to re-implement the ipip
and mplsip modes of ipip.ko.

This allows to push an Ethernet header to sit packets and redirect
them to an Ethernet device:

  $ tc filter add dev sit0 ingress matchall          \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0

Without this patch, push_eth refuses to add an ethernet header because
the skb appears to already have a MAC header.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv6/sit.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index e0a39b0bb4c1..df5bea818410 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -710,6 +710,8 @@ static int ipip6_rcv(struct sk_buff *skb)
 		 * old iph is no longer valid
 		 */
 		iph = (const struct iphdr *)skb_mac_header(skb);
+		skb_reset_mac_header(skb);
+
 		err = IP_ECN_decapsulate(iph, skb);
 		if (unlikely(err)) {
 			if (log_ecn_error)
@@ -780,6 +782,8 @@ static int sit_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
 			tpi = &ipip_tpi;
 		if (iptunnel_pull_header(skb, 0, tpi->proto, false))
 			goto drop;
+		skb_reset_mac_header(skb);
+
 		return ip_tunnel_rcv(tunnel, skb, tpi, NULL, log_ecn_error);
 	}
 
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 4/6] gre: let mac_header point to outer header only when necessary
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
                   ` (2 preceding siblings ...)
  2021-06-25 13:33 ` [PATCH net-next 3/6] sit: allow redirecting ip6ip, " Guillaume Nault
@ 2021-06-25 13:33 ` Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 5/6] ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devices Guillaume Nault
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:33 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Hideaki YOSHIFUJI, David Ahern, Jiri Benc

Commit e271c7b4420d ("gre: do not keep the GRE header around in collect
medata mode") did reset the mac_header for the collect_md case. Let's
extend this behaviour to classical gre devices as well.

ipgre_header_parse() seems to be the only case that requires mac_header
to point to the outer header. We can detect this case accurately by
checking ->header_ops. For all other cases, we can reset mac_header.

This allows to push an Ethernet header to ipgre packets and redirect
them to an Ethernet device:

  $ tc filter add dev gre0 ingress matchall          \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0

Before this patch, this worked only for collect_md gre devices.
Now this works for regular gre devices as well. Only the special case
of gre devices that use ipgre_header_ops isn't supported.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv4/ip_gre.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index a68bf4c6fe9b..12dca0c85f3c 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -107,6 +107,8 @@ module_param(log_ecn_error, bool, 0644);
 MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
 
 static struct rtnl_link_ops ipgre_link_ops __read_mostly;
+static const struct header_ops ipgre_header_ops;
+
 static int ipgre_tunnel_init(struct net_device *dev);
 static void erspan_build_header(struct sk_buff *skb,
 				u32 id, u32 index,
@@ -364,7 +366,10 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi,
 					   raw_proto, false) < 0)
 			goto drop;
 
-		if (tunnel->dev->type != ARPHRD_NONE)
+		/* Special case for ipgre_header_parse(), which expects the
+		 * mac_header to point to the outer IP header.
+		 */
+		if (tunnel->dev->header_ops == &ipgre_header_ops)
 			skb_pop_mac_header(skb);
 		else
 			skb_reset_mac_header(skb);
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 5/6] ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devices
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
                   ` (3 preceding siblings ...)
  2021-06-25 13:33 ` [PATCH net-next 4/6] gre: let mac_header point to outer header only when necessary Guillaume Nault
@ 2021-06-25 13:33 ` Guillaume Nault
  2021-06-25 13:33 ` [PATCH net-next 6/6] gtp: reset mac_header after decap Guillaume Nault
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:33 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Hideaki YOSHIFUJI, David Ahern, Tom Herbert

Reset the mac_header pointer even when the tunnel transports only L3
data (in the ARPHRD_ETHER case, this is already done by eth_type_trans).
This prevents other parts of the stack from mistakenly accessing the
outer header after the packet has been decapsulated.

In practice, this allows to push an Ethernet header to ipip6, ip6ip6,
mplsip6 or ip6gre packets and redirect them to an Ethernet device:

  $ tc filter add dev ip6tnl0 ingress matchall       \
      action vlan push_eth dst_mac 00:00:5e:00:53:01 \
                           src_mac 00:00:5e:00:53:00 \
      action mirred egress redirect dev eth0

Without this patch, push_eth refuses to add an ethernet header because
the skb appears to already have a MAC header.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv6/ip6_tunnel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 288bafded998..0b8a38687ce4 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -837,6 +837,7 @@ static int __ip6_tnl_rcv(struct ip6_tnl *tunnel, struct sk_buff *skb,
 		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 	} else {
 		skb->dev = tunnel->dev;
+		skb_reset_mac_header(skb);
 	}
 
 	skb_reset_network_header(skb);
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 6/6] gtp: reset mac_header after decap
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
                   ` (4 preceding siblings ...)
  2021-06-25 13:33 ` [PATCH net-next 5/6] ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devices Guillaume Nault
@ 2021-06-25 13:33 ` Guillaume Nault
  2021-06-26 17:50 ` [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices David Ahern
  2021-06-28 20:10 ` patchwork-bot+netdevbpf
  7 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-25 13:33 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Pablo Neira Ayuso, Harald Welte, Andreas Schultz, Jonas Bonn

For consistency with other L3 tunnel devices, reset the mac_header
pointer after decapsulation. This makes the mac_header 0 bytes long,
thus making it clear that this skb has no mac_header.

Compile tested only.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 drivers/net/gtp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 1c9023d47e00..30e0a10595a1 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -201,6 +201,7 @@ static int gtp_rx(struct pdp_ctx *pctx, struct sk_buff *skb,
 	 * calculate the transport header.
 	 */
 	skb_reset_network_header(skb);
+	skb_reset_mac_header(skb);
 
 	skb->dev = pctx->dev;
 
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
                   ` (5 preceding siblings ...)
  2021-06-25 13:33 ` [PATCH net-next 6/6] gtp: reset mac_header after decap Guillaume Nault
@ 2021-06-26 17:50 ` David Ahern
  2021-06-26 20:53   ` Guillaume Nault
  2021-06-28 20:10 ` patchwork-bot+netdevbpf
  7 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2021-06-26 17:50 UTC (permalink / raw)
  To: Guillaume Nault, David Miller, Jakub Kicinski
  Cc: netdev, Hideaki YOSHIFUJI, David Ahern, Simon Horman,
	Martin Varghese, Eli Cohen, Jiri Benc, Tom Herbert,
	Pablo Neira Ayuso, Harald Welte, Andreas Schultz, Jonas Bonn

On 6/25/21 7:32 AM, Guillaume Nault wrote:
> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
> reset the MAC header pointer after they parsed the outer headers. This
> accurately reflects the fact that the decapsulated packet is pure L3
> packet, as that makes the MAC header 0 bytes long (the MAC and network
> header pointers are equal).
> 
> However, many L3 devices only adjust the network header after
> decapsulation and leave the MAC header pointer to its original value.
> This can confuse other parts of the networking stack, like TC, which
> then considers the outer headers as one big MAC header.
> 
> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
> 
> The case of gre is a bit special. It already resets the MAC header
> pointer in collect_md mode, so only the classical mode needs to be
> adjusted. However, gre also has a special case that expects the MAC
> header pointer to keep pointing to the outer header even after
> decapsulation. Therefore, patch 4 keeps an exception for this case.
> 
> Ideally, we'd centralise the call to skb_reset_mac_header() in
> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
> currently, because of the gre special case discussed above that
> precludes us from resetting the MAC header unconditionally.

What about adding a flag to ip_tunnel indicating if it can be done (or
should not be done since doing it is the most common)?

> 
> The original motivation is to redirect bareudp packets to Ethernet
> devices (as described in patch 1). The rest of this series aims at
> bringing consistency across all L3 devices (apart from gre's special
> case unfortunately).

Can you add a selftests that covers the use cases you mention in the
commit logs?



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-26 17:50 ` [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices David Ahern
@ 2021-06-26 20:53   ` Guillaume Nault
  2021-06-27 15:56     ` David Ahern
  0 siblings, 1 reply; 14+ messages in thread
From: Guillaume Nault @ 2021-06-26 20:53 UTC (permalink / raw)
  To: David Ahern
  Cc: David Miller, Jakub Kicinski, netdev, Hideaki YOSHIFUJI,
	David Ahern, Simon Horman, Martin Varghese, Eli Cohen, Jiri Benc,
	Tom Herbert, Pablo Neira Ayuso, Harald Welte, Andreas Schultz,
	Jonas Bonn

On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
> On 6/25/21 7:32 AM, Guillaume Nault wrote:
> > Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
> > reset the MAC header pointer after they parsed the outer headers. This
> > accurately reflects the fact that the decapsulated packet is pure L3
> > packet, as that makes the MAC header 0 bytes long (the MAC and network
> > header pointers are equal).
> > 
> > However, many L3 devices only adjust the network header after
> > decapsulation and leave the MAC header pointer to its original value.
> > This can confuse other parts of the networking stack, like TC, which
> > then considers the outer headers as one big MAC header.
> > 
> > This patch series makes the following L3 tunnels behave like VXLAN-GPE:
> > bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
> > 
> > The case of gre is a bit special. It already resets the MAC header
> > pointer in collect_md mode, so only the classical mode needs to be
> > adjusted. However, gre also has a special case that expects the MAC
> > header pointer to keep pointing to the outer header even after
> > decapsulation. Therefore, patch 4 keeps an exception for this case.
> > 
> > Ideally, we'd centralise the call to skb_reset_mac_header() in
> > ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
> > sit (patch 3) and gre (patch 4). That's unfortunately not feasible
> > currently, because of the gre special case discussed above that
> > precludes us from resetting the MAC header unconditionally.
> 
> What about adding a flag to ip_tunnel indicating if it can be done (or
> should not be done since doing it is the most common)?

That's feasible. I didn't do it here because I wanted to keep the
patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
prototype would also require updating erspan_rcv(), which isn't L3
(erspan carries Ethernet frames). I was feeling such consolidation
would be best done in a follow up patch series.

I can repost if you feel strongly about it. Otherwise, I'll follow up
with the ip_tunnel_rcv() consolidation in a later patch. Just let me
know if you have any preference.

> > The original motivation is to redirect bareudp packets to Ethernet
> > devices (as described in patch 1). The rest of this series aims at
> > bringing consistency across all L3 devices (apart from gre's special
> > case unfortunately).
> 
> Can you add a selftests that covers the use cases you mention in the
> commit logs?

I'm already having a selftests for most of the tunnels (and their
different operating modes), gtp being the main exception. But it's not
yet ready for upstream, as I'm trying to move the topology in its own
.sh file, to keep the main selftests as simple as possible.
I'll post it as soon as I get it in good shape.

Thanks for the rewiew.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-26 20:53   ` Guillaume Nault
@ 2021-06-27 15:56     ` David Ahern
  2021-06-28 15:04       ` Guillaume Nault
  0 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2021-06-27 15:56 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: David Miller, Jakub Kicinski, netdev, Hideaki YOSHIFUJI,
	David Ahern, Simon Horman, Martin Varghese, Eli Cohen, Jiri Benc,
	Tom Herbert, Pablo Neira Ayuso, Harald Welte, Andreas Schultz,
	Jonas Bonn

On 6/26/21 2:53 PM, Guillaume Nault wrote:
> On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
>> On 6/25/21 7:32 AM, Guillaume Nault wrote:
>>> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
>>> reset the MAC header pointer after they parsed the outer headers. This
>>> accurately reflects the fact that the decapsulated packet is pure L3
>>> packet, as that makes the MAC header 0 bytes long (the MAC and network
>>> header pointers are equal).
>>>
>>> However, many L3 devices only adjust the network header after
>>> decapsulation and leave the MAC header pointer to its original value.
>>> This can confuse other parts of the networking stack, like TC, which
>>> then considers the outer headers as one big MAC header.
>>>
>>> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
>>> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
>>>
>>> The case of gre is a bit special. It already resets the MAC header
>>> pointer in collect_md mode, so only the classical mode needs to be
>>> adjusted. However, gre also has a special case that expects the MAC
>>> header pointer to keep pointing to the outer header even after
>>> decapsulation. Therefore, patch 4 keeps an exception for this case.
>>>
>>> Ideally, we'd centralise the call to skb_reset_mac_header() in
>>> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
>>> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
>>> currently, because of the gre special case discussed above that
>>> precludes us from resetting the MAC header unconditionally.
>>
>> What about adding a flag to ip_tunnel indicating if it can be done (or
>> should not be done since doing it is the most common)?
> 
> That's feasible. I didn't do it here because I wanted to keep the
> patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
> prototype would also require updating erspan_rcv(), which isn't L3
> (erspan carries Ethernet frames). I was feeling such consolidation
> would be best done in a follow up patch series.

I was thinking a flag in 'struct ip_tunnel'. It's the private data for
those netdevices, so a per-instance setting. I haven't walked through
the details to know if it would work.

> 
> I can repost if you feel strongly about it. Otherwise, I'll follow up
> with the ip_tunnel_rcv() consolidation in a later patch. Just let me
> know if you have any preference.
> 
>>> The original motivation is to redirect bareudp packets to Ethernet
>>> devices (as described in patch 1). The rest of this series aims at
>>> bringing consistency across all L3 devices (apart from gre's special
>>> case unfortunately).
>>
>> Can you add a selftests that covers the use cases you mention in the
>> commit logs?
> 
> I'm already having a selftests for most of the tunnels (and their
> different operating modes), gtp being the main exception. But it's not
> yet ready for upstream, as I'm trying to move the topology in its own
> .sh file, to keep the main selftests as simple as possible.
> I'll post it as soon as I get it in good shape.
> 

That works. Thanks,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-27 15:56     ` David Ahern
@ 2021-06-28 15:04       ` Guillaume Nault
  2021-06-28 18:46         ` David Ahern
  0 siblings, 1 reply; 14+ messages in thread
From: Guillaume Nault @ 2021-06-28 15:04 UTC (permalink / raw)
  To: David Ahern
  Cc: David Miller, Jakub Kicinski, netdev, Hideaki YOSHIFUJI,
	David Ahern, Simon Horman, Martin Varghese, Eli Cohen, Jiri Benc,
	Tom Herbert, Pablo Neira Ayuso, Harald Welte, Andreas Schultz,
	Jonas Bonn

On Sun, Jun 27, 2021 at 09:56:53AM -0600, David Ahern wrote:
> On 6/26/21 2:53 PM, Guillaume Nault wrote:
> > On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
> >> On 6/25/21 7:32 AM, Guillaume Nault wrote:
> >>> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
> >>> reset the MAC header pointer after they parsed the outer headers. This
> >>> accurately reflects the fact that the decapsulated packet is pure L3
> >>> packet, as that makes the MAC header 0 bytes long (the MAC and network
> >>> header pointers are equal).
> >>>
> >>> However, many L3 devices only adjust the network header after
> >>> decapsulation and leave the MAC header pointer to its original value.
> >>> This can confuse other parts of the networking stack, like TC, which
> >>> then considers the outer headers as one big MAC header.
> >>>
> >>> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
> >>> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
> >>>
> >>> The case of gre is a bit special. It already resets the MAC header
> >>> pointer in collect_md mode, so only the classical mode needs to be
> >>> adjusted. However, gre also has a special case that expects the MAC
> >>> header pointer to keep pointing to the outer header even after
> >>> decapsulation. Therefore, patch 4 keeps an exception for this case.
> >>>
> >>> Ideally, we'd centralise the call to skb_reset_mac_header() in
> >>> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
> >>> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
> >>> currently, because of the gre special case discussed above that
> >>> precludes us from resetting the MAC header unconditionally.
> >>
> >> What about adding a flag to ip_tunnel indicating if it can be done (or
> >> should not be done since doing it is the most common)?
> > 
> > That's feasible. I didn't do it here because I wanted to keep the
> > patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
> > prototype would also require updating erspan_rcv(), which isn't L3
> > (erspan carries Ethernet frames). I was feeling such consolidation
> > would be best done in a follow up patch series.
> 
> I was thinking a flag in 'struct ip_tunnel'. It's the private data for
> those netdevices, so a per-instance setting. I haven't walked through
> the details to know if it would work.

I didn't think about that. Good idea, that looks perfectly doable. But
I'd still prefer to centralise the skb_reset_mac_header() call in a
dedicated patch set. I we did it here, we'd have to not reset the mac
header by default, to guarantee that unrelated tunnels wouldn't be
affected.
However, I think that the default behaviour should be to reset the mac
header and that only special cases, like the one in ip_gre, should
explicitely turn that off. Therefore, we'd need a follow up patch
anyway, to change the way this "reset_mac" flag would be set.

IMHO, the current approach has the advantage of clearly separating the
new feature from the refactoring. But if you feel strongly about using
a flag in struct ip_tunnel, I can rework this series.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-28 15:04       ` Guillaume Nault
@ 2021-06-28 18:46         ` David Ahern
  2021-06-28 20:40           ` Guillaume Nault
  0 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2021-06-28 18:46 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: David Miller, Jakub Kicinski, netdev, Hideaki YOSHIFUJI,
	David Ahern, Simon Horman, Martin Varghese, Eli Cohen, Jiri Benc,
	Tom Herbert, Pablo Neira Ayuso, Harald Welte, Andreas Schultz,
	Jonas Bonn

On 6/28/21 9:04 AM, Guillaume Nault wrote:
> On Sun, Jun 27, 2021 at 09:56:53AM -0600, David Ahern wrote:
>> On 6/26/21 2:53 PM, Guillaume Nault wrote:
>>> On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
>>>> On 6/25/21 7:32 AM, Guillaume Nault wrote:
>>>>> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
>>>>> reset the MAC header pointer after they parsed the outer headers. This
>>>>> accurately reflects the fact that the decapsulated packet is pure L3
>>>>> packet, as that makes the MAC header 0 bytes long (the MAC and network
>>>>> header pointers are equal).
>>>>>
>>>>> However, many L3 devices only adjust the network header after
>>>>> decapsulation and leave the MAC header pointer to its original value.
>>>>> This can confuse other parts of the networking stack, like TC, which
>>>>> then considers the outer headers as one big MAC header.
>>>>>
>>>>> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
>>>>> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
>>>>>
>>>>> The case of gre is a bit special. It already resets the MAC header
>>>>> pointer in collect_md mode, so only the classical mode needs to be
>>>>> adjusted. However, gre also has a special case that expects the MAC
>>>>> header pointer to keep pointing to the outer header even after
>>>>> decapsulation. Therefore, patch 4 keeps an exception for this case.
>>>>>
>>>>> Ideally, we'd centralise the call to skb_reset_mac_header() in
>>>>> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
>>>>> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
>>>>> currently, because of the gre special case discussed above that
>>>>> precludes us from resetting the MAC header unconditionally.
>>>>
>>>> What about adding a flag to ip_tunnel indicating if it can be done (or
>>>> should not be done since doing it is the most common)?
>>>
>>> That's feasible. I didn't do it here because I wanted to keep the
>>> patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
>>> prototype would also require updating erspan_rcv(), which isn't L3
>>> (erspan carries Ethernet frames). I was feeling such consolidation
>>> would be best done in a follow up patch series.
>>
>> I was thinking a flag in 'struct ip_tunnel'. It's the private data for
>> those netdevices, so a per-instance setting. I haven't walked through
>> the details to know if it would work.
> 
> I didn't think about that. Good idea, that looks perfectly doable. But
> I'd still prefer to centralise the skb_reset_mac_header() call in a
> dedicated patch set. I we did it here, we'd have to not reset the mac
> header by default, to guarantee that unrelated tunnels wouldn't be
> affected.
> However, I think that the default behaviour should be to reset the mac
> header and that only special cases, like the one in ip_gre, should
> explicitely turn that off. Therefore, we'd need a follow up patch
> anyway, to change the way this "reset_mac" flag would be set.
> 
> IMHO, the current approach has the advantage of clearly separating the
> new feature from the refactoring. But if you feel strongly about using
> a flag in struct ip_tunnel, I can rework this series.
> 

I am accustomed to doing refactoring first to add some new feature in
the simplest and clearest way.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
                   ` (6 preceding siblings ...)
  2021-06-26 17:50 ` [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices David Ahern
@ 2021-06-28 20:10 ` patchwork-bot+netdevbpf
  7 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-06-28 20:10 UTC (permalink / raw)
  To: Guillaume Nault
  Cc: davem, kuba, netdev, yoshfuji, dsahern, simon.horman,
	martin.varghese, elic, jbenc, tom, pablo, laforge, aschultz,
	jonas

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Fri, 25 Jun 2021 15:32:59 +0200 you wrote:
> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
> reset the MAC header pointer after they parsed the outer headers. This
> accurately reflects the fact that the decapsulated packet is pure L3
> packet, as that makes the MAC header 0 bytes long (the MAC and network
> header pointers are equal).
> 
> However, many L3 devices only adjust the network header after
> decapsulation and leave the MAC header pointer to its original value.
> This can confuse other parts of the networking stack, like TC, which
> then considers the outer headers as one big MAC header.
> 
> [...]

Here is the summary with links:
  - [net-next,1/6] bareudp: allow redirecting bareudp packets to eth devices
    https://git.kernel.org/netdev/net-next/c/99c8719b7981
  - [net-next,2/6] ipip: allow redirecting ipip and mplsip packets to eth devices
    https://git.kernel.org/netdev/net-next/c/7ad136fd288c
  - [net-next,3/6] sit: allow redirecting ip6ip, ipip and mplsip packets to eth devices
    https://git.kernel.org/netdev/net-next/c/730eed2772e7
  - [net-next,4/6] gre: let mac_header point to outer header only when necessary
    https://git.kernel.org/netdev/net-next/c/aab1e898c26c
  - [net-next,5/6] ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devices
    https://git.kernel.org/netdev/net-next/c/da5a2e49f064
  - [net-next,6/6] gtp: reset mac_header after decap
    https://git.kernel.org/netdev/net-next/c/b2d898c8a523

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices
  2021-06-28 18:46         ` David Ahern
@ 2021-06-28 20:40           ` Guillaume Nault
  0 siblings, 0 replies; 14+ messages in thread
From: Guillaume Nault @ 2021-06-28 20:40 UTC (permalink / raw)
  To: David Ahern
  Cc: David Miller, Jakub Kicinski, netdev, Hideaki YOSHIFUJI,
	David Ahern, Simon Horman, Martin Varghese, Eli Cohen, Jiri Benc,
	Tom Herbert, Pablo Neira Ayuso, Harald Welte, Andreas Schultz,
	Jonas Bonn

On Mon, Jun 28, 2021 at 12:46:24PM -0600, David Ahern wrote:
> On 6/28/21 9:04 AM, Guillaume Nault wrote:
> > On Sun, Jun 27, 2021 at 09:56:53AM -0600, David Ahern wrote:
> >> On 6/26/21 2:53 PM, Guillaume Nault wrote:
> >>> On Sat, Jun 26, 2021 at 11:50:19AM -0600, David Ahern wrote:
> >>>> On 6/25/21 7:32 AM, Guillaume Nault wrote:
> >>>>> Some virtual L3 devices, like vxlan-gpe and gre (in collect_md mode),
> >>>>> reset the MAC header pointer after they parsed the outer headers. This
> >>>>> accurately reflects the fact that the decapsulated packet is pure L3
> >>>>> packet, as that makes the MAC header 0 bytes long (the MAC and network
> >>>>> header pointers are equal).
> >>>>>
> >>>>> However, many L3 devices only adjust the network header after
> >>>>> decapsulation and leave the MAC header pointer to its original value.
> >>>>> This can confuse other parts of the networking stack, like TC, which
> >>>>> then considers the outer headers as one big MAC header.
> >>>>>
> >>>>> This patch series makes the following L3 tunnels behave like VXLAN-GPE:
> >>>>> bareudp, ipip, sit, gre, ip6gre, ip6tnl, gtp.
> >>>>>
> >>>>> The case of gre is a bit special. It already resets the MAC header
> >>>>> pointer in collect_md mode, so only the classical mode needs to be
> >>>>> adjusted. However, gre also has a special case that expects the MAC
> >>>>> header pointer to keep pointing to the outer header even after
> >>>>> decapsulation. Therefore, patch 4 keeps an exception for this case.
> >>>>>
> >>>>> Ideally, we'd centralise the call to skb_reset_mac_header() in
> >>>>> ip_tunnel_rcv(), to avoid manual calls in ipip (patch 2),
> >>>>> sit (patch 3) and gre (patch 4). That's unfortunately not feasible
> >>>>> currently, because of the gre special case discussed above that
> >>>>> precludes us from resetting the MAC header unconditionally.
> >>>>
> >>>> What about adding a flag to ip_tunnel indicating if it can be done (or
> >>>> should not be done since doing it is the most common)?
> >>>
> >>> That's feasible. I didn't do it here because I wanted to keep the
> >>> patch series focused on L3 tunnels. Modifying ip_tunnel_rcv()'s
> >>> prototype would also require updating erspan_rcv(), which isn't L3
> >>> (erspan carries Ethernet frames). I was feeling such consolidation
> >>> would be best done in a follow up patch series.
> >>
> >> I was thinking a flag in 'struct ip_tunnel'. It's the private data for
> >> those netdevices, so a per-instance setting. I haven't walked through
> >> the details to know if it would work.
> > 
> > I didn't think about that. Good idea, that looks perfectly doable. But
> > I'd still prefer to centralise the skb_reset_mac_header() call in a
> > dedicated patch set. I we did it here, we'd have to not reset the mac
> > header by default, to guarantee that unrelated tunnels wouldn't be
> > affected.
> > However, I think that the default behaviour should be to reset the mac
> > header and that only special cases, like the one in ip_gre, should
> > explicitely turn that off. Therefore, we'd need a follow up patch
> > anyway, to change the way this "reset_mac" flag would be set.
> > 
> > IMHO, the current approach has the advantage of clearly separating the
> > new feature from the refactoring. But if you feel strongly about using
> > a flag in struct ip_tunnel, I can rework this series.
> > 
> 
> I am accustomed to doing refactoring first to add some new feature in
> the simplest and clearest way.

I agree on the general statement. Anyway, I've just received the
patchwork-bot message saying the series has been applied. I'll follow
up with the struct ip_tunnel flag once net-next reopens (as I guess it's
about to close).


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-06-28 20:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-25 13:32 [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices Guillaume Nault
2021-06-25 13:33 ` [PATCH net-next 1/6] bareudp: allow redirecting bareudp packets to eth devices Guillaume Nault
2021-06-25 13:33 ` [PATCH net-next 2/6] ipip: allow redirecting ipip and mplsip " Guillaume Nault
2021-06-25 13:33 ` [PATCH net-next 3/6] sit: allow redirecting ip6ip, " Guillaume Nault
2021-06-25 13:33 ` [PATCH net-next 4/6] gre: let mac_header point to outer header only when necessary Guillaume Nault
2021-06-25 13:33 ` [PATCH net-next 5/6] ip6_tunnel: allow redirecting ip6gre and ipxip6 packets to eth devices Guillaume Nault
2021-06-25 13:33 ` [PATCH net-next 6/6] gtp: reset mac_header after decap Guillaume Nault
2021-06-26 17:50 ` [PATCH net-next 0/6] net: reset MAC header consistently across L3 virtual devices David Ahern
2021-06-26 20:53   ` Guillaume Nault
2021-06-27 15:56     ` David Ahern
2021-06-28 15:04       ` Guillaume Nault
2021-06-28 18:46         ` David Ahern
2021-06-28 20:40           ` Guillaume Nault
2021-06-28 20:10 ` patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.