All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
@ 2020-05-08 23:42 Maciej Żenczykowski
  2020-05-09 19:15 ` Ido Schimmel
  0 siblings, 1 reply; 10+ messages in thread
From: Maciej Żenczykowski @ 2020-05-08 23:42 UTC (permalink / raw)
  To: Maciej Żenczykowski, David S . Miller
  Cc: Linux Network Development Mailing List

From: Maciej Żenczykowski <maze@google.com>

This makes 'ping' 'ping6' and icmp based traceroute no longer
require any suid or file capabilities.

These sockets have baked long enough that the restriction
to make them unusable by default is no longer necessary.

The concerns were around exploits.  However there are now
major distros that default to enabling this.

This is already the default on Fedora 31:
  [root@f31vm ~]# cat /proc/sys/net/ipv4/ping_group_range
  0       2147483647
  [root@f31vm ~]# cat /usr/lib/sysctl.d/50-default.conf | egrep -B6 ping_group_range
  # ping(8) without CAP_NET_ADMIN and CAP_NET_RAW
  # The upper limit is set to 2^31-1. Values greater than that get rejected by
  # the kernel because of this definition in linux/include/net/ping.h:
  #   #define GID_T_MAX (((gid_t)~0U) >> 1)
  # That's not so bad because values between 2^31 and 2^32-1 are reserved on
  # systemd-based systems anyway: https://systemd.io/UIDS-GIDS.html#summary
  -net.ipv4.ping_group_range = 0 2147483647

And in general is super useful for any network namespace container
based setup.  See for example: https://docs.docker.com/engine/security/rootless/

This is one less thing you need to configure when you creare a new network
namespace.

Before:
  vm:~# unshare -n
  vm:~# cat /proc/sys/net/ipv4/ping_group_range
  1       0

After:
  vm:~# unshare -n
  vm:~# cat /proc/sys/net/ipv4/ping_group_range
  0       2147483647

Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
 net/ipv4/af_inet.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index cf58e29cf746..1a8cb6f3ee38 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1819,12 +1819,8 @@ static __net_init int inet_init_net(struct net *net)
 	net->ipv4.ip_local_ports.range[1] =  60999;
 
 	seqlock_init(&net->ipv4.ping_group_range.lock);
-	/*
-	 * Sane defaults - nobody may create ping sockets.
-	 * Boot scripts should set this to distro-specific group.
-	 */
-	net->ipv4.ping_group_range.range[0] = make_kgid(&init_user_ns, 1);
-	net->ipv4.ping_group_range.range[1] = make_kgid(&init_user_ns, 0);
+	net->ipv4.ping_group_range.range[0] = GLOBAL_ROOT_GID;
+	net->ipv4.ping_group_range.range[1] = KGIDT_INIT(0x7FFFFFFF);
 
 	/* Default values for sysctl-controlled parameters.
 	 * We set them here, in case sysctl is not compiled.
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-08 23:42 [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default Maciej Żenczykowski
@ 2020-05-09 19:15 ` Ido Schimmel
  2020-05-09 19:17   ` Maciej Żenczykowski
  2020-05-09 21:20   ` David Ahern
  0 siblings, 2 replies; 10+ messages in thread
From: Ido Schimmel @ 2020-05-09 19:15 UTC (permalink / raw)
  To: Maciej Żenczykowski, dsahern
  Cc: Maciej Żenczykowski, David S . Miller,
	Linux Network Development Mailing List

On Fri, May 08, 2020 at 04:42:23PM -0700, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
> 
> This makes 'ping' 'ping6' and icmp based traceroute no longer
> require any suid or file capabilities.
> 
> These sockets have baked long enough that the restriction
> to make them unusable by default is no longer necessary.
> 
> The concerns were around exploits.  However there are now
> major distros that default to enabling this.
> 
> This is already the default on Fedora 31:
>   [root@f31vm ~]# cat /proc/sys/net/ipv4/ping_group_range
>   0       2147483647
>   [root@f31vm ~]# cat /usr/lib/sysctl.d/50-default.conf | egrep -B6 ping_group_range
>   # ping(8) without CAP_NET_ADMIN and CAP_NET_RAW
>   # The upper limit is set to 2^31-1. Values greater than that get rejected by
>   # the kernel because of this definition in linux/include/net/ping.h:
>   #   #define GID_T_MAX (((gid_t)~0U) >> 1)
>   # That's not so bad because values between 2^31 and 2^32-1 are reserved on
>   # systemd-based systems anyway: https://systemd.io/UIDS-GIDS.html#summary
>   -net.ipv4.ping_group_range = 0 2147483647
> 
> And in general is super useful for any network namespace container
> based setup.  See for example: https://docs.docker.com/engine/security/rootless/
> 
> This is one less thing you need to configure when you creare a new network
> namespace.
> 
> Before:
>   vm:~# unshare -n
>   vm:~# cat /proc/sys/net/ipv4/ping_group_range
>   1       0
> 
> After:
>   vm:~# unshare -n
>   vm:~# cat /proc/sys/net/ipv4/ping_group_range
>   0       2147483647
> 
> Signed-off-by: Maciej Żenczykowski <maze@google.com>

This will unfortunately cause regressions with VRFs because they don't
work correctly with ping sockets. Simple example:

ip link add name vrf-red up type vrf table 10
ip link add name vrf-blue up type vrf table 20
ip rule add pref 32765 table local
ip rule del pref 0

ip link add name veth-red type veth peer name veth-blue
ip link set dev veth-red up master vrf-red
ip link set dev veth-blue up master vrf-red
ip address add 192.0.2.1/24 dev veth-red
ip address add 192.0.2.2/24 dev veth-blue

ip vrf exec vrf-red ping -I 192.0.2.1 192.0.2.2 -c 1 -w 1

Before the patch:

PING 192.0.2.2 (192.0.2.2) from 192.0.2.1 : 56(84) bytes of data.
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.053 ms

After the patch:

bind: Cannot assign requested address

This specific test case is fixed by following patch, but at least a
similar change is needed for IPv6. Other changes might also be required.
Sorry for not providing a complete solution, but I'm busy with other
tasks at the moment. :/

diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 535427292194..8463b0e9e811 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -297,6 +297,7 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
        struct net *net = sock_net(sk);
        if (sk->sk_family == AF_INET) {
                struct sockaddr_in *addr = (struct sockaddr_in *) uaddr;
+               u32 tb_id = RT_TABLE_LOCAL;
                int chk_addr_ret;
 
                if (addr_len < sizeof(*addr))
@@ -310,7 +311,15 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
                pr_debug("ping_check_bind_addr(sk=%p,addr=%pI4,port=%d)\n",
                         sk, &addr->sin_addr.s_addr, ntohs(addr->sin_port));
 
-               chk_addr_ret = inet_addr_type(net, addr->sin_addr.s_addr);
+               if (sk->sk_bound_dev_if) {
+                       tb_id = l3mdev_fib_table_by_index(net,
+                                                         sk->sk_bound_dev_if);
+                       if (!tb_id)
+                               tb_id = RT_TABLE_LOCAL;
+               }
+
+               chk_addr_ret = inet_addr_type_table(net, addr->sin_addr.s_addr,
+                                                   tb_id);
 
                if (addr->sin_addr.s_addr == htonl(INADDR_ANY))
                        chk_addr_ret = RTN_LOCAL;

> ---
>  net/ipv4/af_inet.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index cf58e29cf746..1a8cb6f3ee38 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1819,12 +1819,8 @@ static __net_init int inet_init_net(struct net *net)
>  	net->ipv4.ip_local_ports.range[1] =  60999;
>  
>  	seqlock_init(&net->ipv4.ping_group_range.lock);
> -	/*
> -	 * Sane defaults - nobody may create ping sockets.
> -	 * Boot scripts should set this to distro-specific group.
> -	 */
> -	net->ipv4.ping_group_range.range[0] = make_kgid(&init_user_ns, 1);
> -	net->ipv4.ping_group_range.range[1] = make_kgid(&init_user_ns, 0);
> +	net->ipv4.ping_group_range.range[0] = GLOBAL_ROOT_GID;
> +	net->ipv4.ping_group_range.range[1] = KGIDT_INIT(0x7FFFFFFF);
>  
>  	/* Default values for sysctl-controlled parameters.
>  	 * We set them here, in case sysctl is not compiled.
> -- 
> 2.26.2.645.ge9eca65c58-goog
> 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-09 19:15 ` Ido Schimmel
@ 2020-05-09 19:17   ` Maciej Żenczykowski
  2020-05-09 19:32     ` Ido Schimmel
  2020-05-09 21:20   ` David Ahern
  1 sibling, 1 reply; 10+ messages in thread
From: Maciej Żenczykowski @ 2020-05-09 19:17 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: David Ahern, David S . Miller, Linux Network Development Mailing List

Argh.  I've never understood the faintest thing about VRF's.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-09 19:17   ` Maciej Żenczykowski
@ 2020-05-09 19:32     ` Ido Schimmel
  2020-05-09 21:09       ` David Ahern
  0 siblings, 1 reply; 10+ messages in thread
From: Ido Schimmel @ 2020-05-09 19:32 UTC (permalink / raw)
  To: Maciej Żenczykowski
  Cc: David Ahern, David S . Miller, Linux Network Development Mailing List

On Sat, May 09, 2020 at 12:17:47PM -0700, Maciej Żenczykowski wrote:
> Argh.  I've never understood the faintest thing about VRF's.

:)

There are many resources that David created over the years. For example:

https://www.kernel.org/doc/Documentation/networking/vrf.txt
https://netdevconf.info/1.1/proceedings/slides/ahern-vrf-tutorial.pdf
https://netdevconf.info/1.2/session.html?david-ahern-talk

BTW, in the example it should be:

ip link set dev veth-blue up master vrf-blue

Instead of:

ip link set dev veth-blue up master vrf-red

(Obviously)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-09 19:32     ` Ido Schimmel
@ 2020-05-09 21:09       ` David Ahern
  0 siblings, 0 replies; 10+ messages in thread
From: David Ahern @ 2020-05-09 21:09 UTC (permalink / raw)
  To: Ido Schimmel, Maciej Żenczykowski
  Cc: David S . Miller, Linux Network Development Mailing List

On 5/9/20 1:32 PM, Ido Schimmel wrote:
> On Sat, May 09, 2020 at 12:17:47PM -0700, Maciej Żenczykowski wrote:
>> Argh.  I've never understood the faintest thing about VRF's.
> 
> :)
> 
> There are many resources that David created over the years. For example:
> 
> https://www.kernel.org/doc/Documentation/networking/vrf.txt
> https://netdevconf.info/1.1/proceedings/slides/ahern-vrf-tutorial.pdf
> https://netdevconf.info/1.2/session.html?david-ahern-talk

best one is from OSS N.A. 2017:
    http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf

Maciej: Basically, VRF narrows the scope to an L3 domain - devices
enslaved to the VRF device, their addresses and FIB entries in the VRF
table.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-09 19:15 ` Ido Schimmel
  2020-05-09 19:17   ` Maciej Żenczykowski
@ 2020-05-09 21:20   ` David Ahern
  2020-05-09 21:35     ` Maciej Żenczykowski
  1 sibling, 1 reply; 10+ messages in thread
From: David Ahern @ 2020-05-09 21:20 UTC (permalink / raw)
  To: Ido Schimmel, Maciej Żenczykowski
  Cc: Maciej Żenczykowski, David S . Miller,
	Linux Network Development Mailing List

On 5/9/20 1:15 PM, Ido Schimmel wrote:
> 
> This will unfortunately cause regressions with VRFs because they don't
> work correctly with ping sockets. Simple example:

Thanks for catching this, Ido.

> 
> diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
> index 535427292194..8463b0e9e811 100644
> --- a/net/ipv4/ping.c
> +++ b/net/ipv4/ping.c
> @@ -297,6 +297,7 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
>         struct net *net = sock_net(sk);
>         if (sk->sk_family == AF_INET) {
>                 struct sockaddr_in *addr = (struct sockaddr_in *) uaddr;
> +               u32 tb_id = RT_TABLE_LOCAL;
>                 int chk_addr_ret;
>  
>                 if (addr_len < sizeof(*addr))
> @@ -310,7 +311,15 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
>                 pr_debug("ping_check_bind_addr(sk=%p,addr=%pI4,port=%d)\n",
>                          sk, &addr->sin_addr.s_addr, ntohs(addr->sin_port));
>  
> -               chk_addr_ret = inet_addr_type(net, addr->sin_addr.s_addr);
> +               if (sk->sk_bound_dev_if) {

that's the key point - checking sk->sk_bound_dev_if.

> +                       tb_id = l3mdev_fib_table_by_index(net,
> +                                                         sk->sk_bound_dev_if);
> +                       if (!tb_id)
> +                               tb_id = RT_TABLE_LOCAL;
> +               }
> +
> +               chk_addr_ret = inet_addr_type_table(net, addr->sin_addr.s_addr,
> +                                                   tb_id);
>  
>                 if (addr->sin_addr.s_addr == htonl(INADDR_ANY))
>                         chk_addr_ret = RTN_LOCAL;
> 

From a scan of the ipv4/ping.c code, ping_v4_sendmsg also does not
acknowledge sk->sk_bound_dev_if.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-09 21:20   ` David Ahern
@ 2020-05-09 21:35     ` Maciej Żenczykowski
  2020-05-10  1:24       ` David Ahern
  0 siblings, 1 reply; 10+ messages in thread
From: Maciej Żenczykowski @ 2020-05-09 21:35 UTC (permalink / raw)
  To: David Ahern
  Cc: Ido Schimmel, David S . Miller, Linux Network Development Mailing List

Do we have some sort of beginner's introduction to Linux VRF somewhere?
What they are? How to use them?

Currently the concept simply doesn't fit into my mental model of networking...

We've actually talked about maybe possibly using VRF's in Android (for
our multi network support)...
but no-one on our team has the faintest idea about how they work...
(and there's rumours that they don't work with ipv6 link local)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-09 21:35     ` Maciej Żenczykowski
@ 2020-05-10  1:24       ` David Ahern
  2020-05-10  5:15         ` Maciej Żenczykowski
  0 siblings, 1 reply; 10+ messages in thread
From: David Ahern @ 2020-05-10  1:24 UTC (permalink / raw)
  To: Maciej Żenczykowski
  Cc: Ido Schimmel, David S . Miller, Linux Network Development Mailing List

On 5/9/20 3:35 PM, Maciej Żenczykowski wrote:
> Do we have some sort of beginner's introduction to Linux VRF somewhere?
> What they are? How to use them?

Ido's response gave introductory commands which can also be found here:
    https://www.kernel.org/doc/Documentation/networking/vrf.txt

This should answer most questions about more advanced topics:
    http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf

Lately, I am putting blogs on https://people.kernel.org/dsahern for
recurring questions.

> 
> Currently the concept simply doesn't fit into my mental model of networking...

network namespaces = device level separation and up
VRF = Layer 3 and up separation

> 
> We've actually talked about maybe possibly using VRF's in Android (for
> our multi network support)...
> but no-one on our team has the faintest idea about how they work...
> (and there's rumours that they don't work with ipv6 link local)
> 

Rumors are ugly. If in doubt, ask. LLA with VRF is a primary requirement
from the beginning.

With 5.3 and up, you can have IPv4 routes with IPv6 LLA gateways with
and without VRFs.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-10  1:24       ` David Ahern
@ 2020-05-10  5:15         ` Maciej Żenczykowski
  2020-05-10 15:29           ` David Ahern
  0 siblings, 1 reply; 10+ messages in thread
From: Maciej Żenczykowski @ 2020-05-10  5:15 UTC (permalink / raw)
  To: David Ahern
  Cc: Ido Schimmel, David S . Miller, Linux Network Development Mailing List

> Ido's response gave introductory commands which can also be found here:
>     https://www.kernel.org/doc/Documentation/networking/vrf.txt
>
> This should answer most questions about more advanced topics:
>     http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf
>
> Lately, I am putting blogs on https://people.kernel.org/dsahern for
> recurring questions.

Thanks for that - I'll give it a look.

> Rumors are ugly. If in doubt, ask. LLA with VRF is a primary requirement
> from the beginning.

LLA? Link Level Aggregation?
Not sure what that has to do with VRF though... that's just bonding/teaming??
and seems to work fine without VRF at a (at least to me conceptually)
even lower layer.

> With 5.3 and up, you can have IPv4 routes with IPv6 LLA gateways with
> and without VRFs.

Ah, see... the latest phone hardware that I still don't have access to
(mostly because of covid),
is only 4.19 based (as such doing dev work on 4.14 or VMs, but getting
wifi/cell emulation
and ipv6 working right in VMs is very very hard, though we're making
slow progress).

5.4 hardware is probably 10 months out (assuming things return to normal).
We're always ~2 years behind, and playing catchup... :-(

And I recently switched some of our servers (those were stuck on 4.3
until very recently)
to use IPv4 egress routing via IPv6 ND, that's a great improvement
(and came at just the right time),
but again I guess I don't see how VRF fits in to the picture - this
seems to be just a use NDv6 for foo
instead of ARP for bar to figure out dst mac type of thing...

Obviously I haven't read your links, so perhaps all my questions will
be answered.
(I'm rambling, mostly writing this email to thank you for the links)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
  2020-05-10  5:15         ` Maciej Żenczykowski
@ 2020-05-10 15:29           ` David Ahern
  0 siblings, 0 replies; 10+ messages in thread
From: David Ahern @ 2020-05-10 15:29 UTC (permalink / raw)
  To: Maciej Żenczykowski
  Cc: Ido Schimmel, David S . Miller, Linux Network Development Mailing List

On 5/9/20 11:15 PM, Maciej Żenczykowski wrote:
>> Rumors are ugly. If in doubt, ask. LLA with VRF is a primary requirement
>> from the beginning.
> 
> LLA? Link Level Aggregation?

Link Local Address

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-05-10 15:29 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-08 23:42 [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default Maciej Żenczykowski
2020-05-09 19:15 ` Ido Schimmel
2020-05-09 19:17   ` Maciej Żenczykowski
2020-05-09 19:32     ` Ido Schimmel
2020-05-09 21:09       ` David Ahern
2020-05-09 21:20   ` David Ahern
2020-05-09 21:35     ` Maciej Żenczykowski
2020-05-10  1:24       ` David Ahern
2020-05-10  5:15         ` Maciej Żenczykowski
2020-05-10 15:29           ` David Ahern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.