netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute()
@ 2020-11-26 18:09 Guillaume Nault
  2020-11-28 17:03 ` David Ahern
  0 siblings, 1 reply; 5+ messages in thread
From: Guillaume Nault @ 2020-11-26 18:09 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski; +Cc: netdev, David Ahern, Russell Strong

When inet_rtm_getroute() was converted to use the RCU variants of
ip_route_input() and ip_route_output_key(), the TOS parameters
stopped being masked with IPTOS_RT_MASK before doing the route lookup.

As a result, "ip route get" can return a different route than what
would be used when sending real packets.

For example:

    $ ip route add 192.0.2.11/32 dev eth0
    $ ip route add unreachable 192.0.2.11/32 tos 2
    $ ip route get 192.0.2.11 tos 2
    RTNETLINK answers: No route to host

But, packets with TOS 2 (ECT(0) if interpreted as an ECN bit) would
actually be routed using the first route:

    $ ping -c 1 -Q 2 192.0.2.11
    PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
    64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.173 ms

    --- 192.0.2.11 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.173/0.173/0.173/0.000 ms

This patch re-applies IPTOS_RT_MASK in inet_rtm_getroute(), to
return results consistent with real route lookups.

Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 net/ipv4/route.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index dc2a399cd9f4..9f43abeac3a8 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -3222,7 +3222,7 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 
 	fl4.daddr = dst;
 	fl4.saddr = src;
-	fl4.flowi4_tos = rtm->rtm_tos;
+	fl4.flowi4_tos = rtm->rtm_tos & IPTOS_RT_MASK;
 	fl4.flowi4_oif = tb[RTA_OIF] ? nla_get_u32(tb[RTA_OIF]) : 0;
 	fl4.flowi4_mark = mark;
 	fl4.flowi4_uid = uid;
@@ -3246,8 +3246,9 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 		fl4.flowi4_iif = iif; /* for rt_fill_info */
 		skb->dev	= dev;
 		skb->mark	= mark;
-		err = ip_route_input_rcu(skb, dst, src, rtm->rtm_tos,
-					 dev, &res);
+		err = ip_route_input_rcu(skb, dst, src,
+					 rtm->rtm_tos & IPTOS_RT_MASK, dev,
+					 &res);
 
 		rt = skb_rtable(skb);
 		if (err == 0 && rt->dst.error)
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute()
  2020-11-26 18:09 [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute() Guillaume Nault
@ 2020-11-28 17:03 ` David Ahern
  2020-11-28 21:17   ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: David Ahern @ 2020-11-28 17:03 UTC (permalink / raw)
  To: Guillaume Nault, David Miller, Jakub Kicinski; +Cc: netdev, Russell Strong

On 11/26/20 11:09 AM, Guillaume Nault wrote:
> When inet_rtm_getroute() was converted to use the RCU variants of
> ip_route_input() and ip_route_output_key(), the TOS parameters
> stopped being masked with IPTOS_RT_MASK before doing the route lookup.
> 
> As a result, "ip route get" can return a different route than what
> would be used when sending real packets.
> 
> For example:
> 
>     $ ip route add 192.0.2.11/32 dev eth0
>     $ ip route add unreachable 192.0.2.11/32 tos 2
>     $ ip route get 192.0.2.11 tos 2
>     RTNETLINK answers: No route to host
> 
> But, packets with TOS 2 (ECT(0) if interpreted as an ECN bit) would
> actually be routed using the first route:
> 
>     $ ping -c 1 -Q 2 192.0.2.11
>     PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
>     64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.173 ms
> 
>     --- 192.0.2.11 ping statistics ---
>     1 packets transmitted, 1 received, 0% packet loss, time 0ms
>     rtt min/avg/max/mdev = 0.173/0.173/0.173/0.000 ms
> 
> This patch re-applies IPTOS_RT_MASK in inet_rtm_getroute(), to
> return results consistent with real route lookups.
> 
> Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
> ---
>  net/ipv4/route.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute()
  2020-11-28 17:03 ` David Ahern
@ 2020-11-28 21:17   ` Jakub Kicinski
  2020-11-29 12:54     ` Guillaume Nault
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-11-28 21:17 UTC (permalink / raw)
  To: David Ahern; +Cc: Guillaume Nault, David Miller, netdev, Russell Strong

On Sat, 28 Nov 2020 10:03:42 -0700 David Ahern wrote:
> On 11/26/20 11:09 AM, Guillaume Nault wrote:
> > When inet_rtm_getroute() was converted to use the RCU variants of
> > ip_route_input() and ip_route_output_key(), the TOS parameters
> > stopped being masked with IPTOS_RT_MASK before doing the route lookup.
> > 
> > As a result, "ip route get" can return a different route than what
> > would be used when sending real packets.
> > 
> > For example:
> > 
> >     $ ip route add 192.0.2.11/32 dev eth0
> >     $ ip route add unreachable 192.0.2.11/32 tos 2
> >     $ ip route get 192.0.2.11 tos 2
> >     RTNETLINK answers: No route to host
> > 
> > But, packets with TOS 2 (ECT(0) if interpreted as an ECN bit) would
> > actually be routed using the first route:
> > 
> >     $ ping -c 1 -Q 2 192.0.2.11
> >     PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
> >     64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.173 ms
> > 
> >     --- 192.0.2.11 ping statistics ---
> >     1 packets transmitted, 1 received, 0% packet loss, time 0ms
> >     rtt min/avg/max/mdev = 0.173/0.173/0.173/0.000 ms
> > 
> > This patch re-applies IPTOS_RT_MASK in inet_rtm_getroute(), to
> > return results consistent with real route lookups.
> > 
> > Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
> > Signed-off-by: Guillaume Nault <gnault@redhat.com>
> 
> Reviewed-by: David Ahern <dsahern@kernel.org>

Applied, thanks!

Should the discrepancy between the behavior of ip_route_input_rcu() and
ip_route_input() be addressed, possibly? 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute()
  2020-11-28 21:17   ` Jakub Kicinski
@ 2020-11-29 12:54     ` Guillaume Nault
  2020-11-30 16:51       ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Guillaume Nault @ 2020-11-29 12:54 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David Ahern, David Miller, netdev, Russell Strong

On Sat, Nov 28, 2020 at 01:17:16PM -0800, Jakub Kicinski wrote:
> On Sat, 28 Nov 2020 10:03:42 -0700 David Ahern wrote:
> > On 11/26/20 11:09 AM, Guillaume Nault wrote:
> > > When inet_rtm_getroute() was converted to use the RCU variants of
> > > ip_route_input() and ip_route_output_key(), the TOS parameters
> > > stopped being masked with IPTOS_RT_MASK before doing the route lookup.
> > > 
> > > As a result, "ip route get" can return a different route than what
> > > would be used when sending real packets.
> > > 
> > > For example:
> > > 
> > >     $ ip route add 192.0.2.11/32 dev eth0
> > >     $ ip route add unreachable 192.0.2.11/32 tos 2
> > >     $ ip route get 192.0.2.11 tos 2
> > >     RTNETLINK answers: No route to host
> > > 
> > > But, packets with TOS 2 (ECT(0) if interpreted as an ECN bit) would
> > > actually be routed using the first route:
> > > 
> > >     $ ping -c 1 -Q 2 192.0.2.11
> > >     PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
> > >     64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.173 ms
> > > 
> > >     --- 192.0.2.11 ping statistics ---
> > >     1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > >     rtt min/avg/max/mdev = 0.173/0.173/0.173/0.000 ms
> > > 
> > > This patch re-applies IPTOS_RT_MASK in inet_rtm_getroute(), to
> > > return results consistent with real route lookups.
> > > 
> > > Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
> > > Signed-off-by: Guillaume Nault <gnault@redhat.com>
> > 
> > Reviewed-by: David Ahern <dsahern@kernel.org>
> 
> Applied, thanks!
> 
> Should the discrepancy between the behavior of ip_route_input_rcu() and
> ip_route_input() be addressed, possibly?

Do you mean masking TOS with IPTOS_RT_MASK directly in
ip_route_input_rcu(), instead of in the callers?

After this patch, all callers apply IPTOS_RT_MASK before calling
ip_route_input_rcu(). So, yes, that could be easily consolidated there,
and I'll do that after net merges into net-next.

More generally, my long term plan is indeed to do mask the TOS in
central places, to get consistent behaviour across the networking
stack. However, generally speaking, I need to be careful not to break
any established behaviour.

I'm mostly worried about the ECN bits. I guess that any caller that
doesn't mask these bits has a bug (as that may break ECN, which is
there since a long time). However, there are many code paths to audit
before we can be sure.

The end goal is to fully support DSCP. Once we'll be sure that no
code path can possibly intreprete an ECN bit as TOS, we'll can safely
drop all those obsolete TOS* masks and macros from the kernel code and
simply mask out the ECN bits (thus preserving the whole DSCP space).

Please note that this is background work for me. Expect slow (but
hopefully regular) progress from me.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute()
  2020-11-29 12:54     ` Guillaume Nault
@ 2020-11-30 16:51       ` Jakub Kicinski
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2020-11-30 16:51 UTC (permalink / raw)
  To: Guillaume Nault; +Cc: David Ahern, David Miller, netdev, Russell Strong

On Sun, 29 Nov 2020 13:54:16 +0100 Guillaume Nault wrote:
> On Sat, Nov 28, 2020 at 01:17:16PM -0800, Jakub Kicinski wrote:
> > On Sat, 28 Nov 2020 10:03:42 -0700 David Ahern wrote:  
> > > On 11/26/20 11:09 AM, Guillaume Nault wrote:  
> > > > When inet_rtm_getroute() was converted to use the RCU variants of
> > > > ip_route_input() and ip_route_output_key(), the TOS parameters
> > > > stopped being masked with IPTOS_RT_MASK before doing the route lookup.
> > > > 
> > > > As a result, "ip route get" can return a different route than what
> > > > would be used when sending real packets.
> > > > 
> > > > For example:
> > > > 
> > > >     $ ip route add 192.0.2.11/32 dev eth0
> > > >     $ ip route add unreachable 192.0.2.11/32 tos 2
> > > >     $ ip route get 192.0.2.11 tos 2
> > > >     RTNETLINK answers: No route to host
> > > > 
> > > > But, packets with TOS 2 (ECT(0) if interpreted as an ECN bit) would
> > > > actually be routed using the first route:
> > > > 
> > > >     $ ping -c 1 -Q 2 192.0.2.11
> > > >     PING 192.0.2.11 (192.0.2.11) 56(84) bytes of data.
> > > >     64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.173 ms
> > > > 
> > > >     --- 192.0.2.11 ping statistics ---
> > > >     1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > > >     rtt min/avg/max/mdev = 0.173/0.173/0.173/0.000 ms
> > > > 
> > > > This patch re-applies IPTOS_RT_MASK in inet_rtm_getroute(), to
> > > > return results consistent with real route lookups.
> > > > 
> > > > Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
> > > > Signed-off-by: Guillaume Nault <gnault@redhat.com>  
> > > 
> > > Reviewed-by: David Ahern <dsahern@kernel.org>  
> > 
> > Applied, thanks!
> > 
> > Should the discrepancy between the behavior of ip_route_input_rcu() and
> > ip_route_input() be addressed, possibly?  
> 
> Do you mean masking TOS with IPTOS_RT_MASK directly in
> ip_route_input_rcu(), instead of in the callers?
> 
> After this patch, all callers apply IPTOS_RT_MASK before calling
> ip_route_input_rcu(). So, yes, that could be easily consolidated there,
> and I'll do that after net merges into net-next.
> 
> More generally, my long term plan is indeed to do mask the TOS in
> central places, to get consistent behaviour across the networking
> stack. However, generally speaking, I need to be careful not to break
> any established behaviour.
> 
> I'm mostly worried about the ECN bits. I guess that any caller that
> doesn't mask these bits has a bug (as that may break ECN, which is
> there since a long time). However, there are many code paths to audit
> before we can be sure.
> 
> The end goal is to fully support DSCP. Once we'll be sure that no
> code path can possibly intreprete an ECN bit as TOS, we'll can safely
> drop all those obsolete TOS* masks and macros from the kernel code and
> simply mask out the ECN bits (thus preserving the whole DSCP space).

Sounds great!

> Please note that this is background work for me. Expect slow (but
> hopefully regular) progress from me.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-30 16:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-26 18:09 [PATCH net] ipv4: Fix tos mask in inet_rtm_getroute() Guillaume Nault
2020-11-28 17:03 ` David Ahern
2020-11-28 21:17   ` Jakub Kicinski
2020-11-29 12:54     ` Guillaume Nault
2020-11-30 16:51       ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).