All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ido Schimmel <idosch@idosch.org>
To: "Maciej Żenczykowski" <zenczykowski@gmail.com>, dsahern@gmail.com
Cc: "Maciej Żenczykowski" <maze@google.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Linux Network Development Mailing List" <netdev@vger.kernel.org>
Subject: Re: [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default
Date: Sat, 9 May 2020 22:15:36 +0300	[thread overview]
Message-ID: <20200509191536.GA370521@splinter> (raw)
In-Reply-To: <20200508234223.118254-1-zenczykowski@gmail.com>

On Fri, May 08, 2020 at 04:42:23PM -0700, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
> 
> This makes 'ping' 'ping6' and icmp based traceroute no longer
> require any suid or file capabilities.
> 
> These sockets have baked long enough that the restriction
> to make them unusable by default is no longer necessary.
> 
> The concerns were around exploits.  However there are now
> major distros that default to enabling this.
> 
> This is already the default on Fedora 31:
>   [root@f31vm ~]# cat /proc/sys/net/ipv4/ping_group_range
>   0       2147483647
>   [root@f31vm ~]# cat /usr/lib/sysctl.d/50-default.conf | egrep -B6 ping_group_range
>   # ping(8) without CAP_NET_ADMIN and CAP_NET_RAW
>   # The upper limit is set to 2^31-1. Values greater than that get rejected by
>   # the kernel because of this definition in linux/include/net/ping.h:
>   #   #define GID_T_MAX (((gid_t)~0U) >> 1)
>   # That's not so bad because values between 2^31 and 2^32-1 are reserved on
>   # systemd-based systems anyway: https://systemd.io/UIDS-GIDS.html#summary
>   -net.ipv4.ping_group_range = 0 2147483647
> 
> And in general is super useful for any network namespace container
> based setup.  See for example: https://docs.docker.com/engine/security/rootless/
> 
> This is one less thing you need to configure when you creare a new network
> namespace.
> 
> Before:
>   vm:~# unshare -n
>   vm:~# cat /proc/sys/net/ipv4/ping_group_range
>   1       0
> 
> After:
>   vm:~# unshare -n
>   vm:~# cat /proc/sys/net/ipv4/ping_group_range
>   0       2147483647
> 
> Signed-off-by: Maciej Żenczykowski <maze@google.com>

This will unfortunately cause regressions with VRFs because they don't
work correctly with ping sockets. Simple example:

ip link add name vrf-red up type vrf table 10
ip link add name vrf-blue up type vrf table 20
ip rule add pref 32765 table local
ip rule del pref 0

ip link add name veth-red type veth peer name veth-blue
ip link set dev veth-red up master vrf-red
ip link set dev veth-blue up master vrf-red
ip address add 192.0.2.1/24 dev veth-red
ip address add 192.0.2.2/24 dev veth-blue

ip vrf exec vrf-red ping -I 192.0.2.1 192.0.2.2 -c 1 -w 1

Before the patch:

PING 192.0.2.2 (192.0.2.2) from 192.0.2.1 : 56(84) bytes of data.
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.053 ms

After the patch:

bind: Cannot assign requested address

This specific test case is fixed by following patch, but at least a
similar change is needed for IPv6. Other changes might also be required.
Sorry for not providing a complete solution, but I'm busy with other
tasks at the moment. :/

diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 535427292194..8463b0e9e811 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -297,6 +297,7 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
        struct net *net = sock_net(sk);
        if (sk->sk_family == AF_INET) {
                struct sockaddr_in *addr = (struct sockaddr_in *) uaddr;
+               u32 tb_id = RT_TABLE_LOCAL;
                int chk_addr_ret;
 
                if (addr_len < sizeof(*addr))
@@ -310,7 +311,15 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
                pr_debug("ping_check_bind_addr(sk=%p,addr=%pI4,port=%d)\n",
                         sk, &addr->sin_addr.s_addr, ntohs(addr->sin_port));
 
-               chk_addr_ret = inet_addr_type(net, addr->sin_addr.s_addr);
+               if (sk->sk_bound_dev_if) {
+                       tb_id = l3mdev_fib_table_by_index(net,
+                                                         sk->sk_bound_dev_if);
+                       if (!tb_id)
+                               tb_id = RT_TABLE_LOCAL;
+               }
+
+               chk_addr_ret = inet_addr_type_table(net, addr->sin_addr.s_addr,
+                                                   tb_id);
 
                if (addr->sin_addr.s_addr == htonl(INADDR_ANY))
                        chk_addr_ret = RTN_LOCAL;

> ---
>  net/ipv4/af_inet.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index cf58e29cf746..1a8cb6f3ee38 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1819,12 +1819,8 @@ static __net_init int inet_init_net(struct net *net)
>  	net->ipv4.ip_local_ports.range[1] =  60999;
>  
>  	seqlock_init(&net->ipv4.ping_group_range.lock);
> -	/*
> -	 * Sane defaults - nobody may create ping sockets.
> -	 * Boot scripts should set this to distro-specific group.
> -	 */
> -	net->ipv4.ping_group_range.range[0] = make_kgid(&init_user_ns, 1);
> -	net->ipv4.ping_group_range.range[1] = make_kgid(&init_user_ns, 0);
> +	net->ipv4.ping_group_range.range[0] = GLOBAL_ROOT_GID;
> +	net->ipv4.ping_group_range.range[1] = KGIDT_INIT(0x7FFFFFFF);
>  
>  	/* Default values for sysctl-controlled parameters.
>  	 * We set them here, in case sysctl is not compiled.
> -- 
> 2.26.2.645.ge9eca65c58-goog
> 

  reply	other threads:[~2020-05-09 19:15 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-08 23:42 [PATCH] net-icmp: make icmp{,v6} (ping) sockets available to all by default Maciej Żenczykowski
2020-05-09 19:15 ` Ido Schimmel [this message]
2020-05-09 19:17   ` Maciej Żenczykowski
2020-05-09 19:32     ` Ido Schimmel
2020-05-09 21:09       ` David Ahern
2020-05-09 21:20   ` David Ahern
2020-05-09 21:35     ` Maciej Żenczykowski
2020-05-10  1:24       ` David Ahern
2020-05-10  5:15         ` Maciej Żenczykowski
2020-05-10 15:29           ` David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200509191536.GA370521@splinter \
    --to=idosch@idosch.org \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=zenczykowski@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.