* [PATCH net-next 1/2] net: retrieve netns cookie via getsocketopt
@ 2021-06-22 20:26 Martynas Pumputis
From: Martynas Pumputis @ 2021-06-22 20:26 UTC
  To: netdev; +Cc: edumazet, daniel, ast, Martynas Pumputis, Lorenz Bauer

It's getting more common to run nested container environments for
testing cloud software. One such example is Kind [1], which runs a
Kubernetes cluster in Docker containers on a single host. Each container
acts as a Kubernetes node, and thus can run any Pod (aka container)
inside itself. This approach simplifies testing a lot, as it
eliminates complicated VM setups.

Unfortunately, such a setup breaks some functionality when cgroupv2 BPF
programs are used for load-balancing. The load-balancer BPF program
needs to detect whether a request originates from the host netns or a
container netns in order to allow some access, e.g. to a service via a
loopback IP address. Typically, the programs detect this by comparing
the netns cookie with that of the init ns, obtained via a call to
bpf_get_netns_cookie(NULL). However, in nested environments the latter
cannot be used because the Kubernetes node's netns is not the init ns.
To fix this, we need to pass the Kubernetes node netns cookie to the
program in a different way: by extending getsockopt() with a
SO_NETNS_COOKIE option, the orchestrator which runs in the Kubernetes
node netns can retrieve the cookie and pass it to the program instead.

Thus, this follows up on Eric's commit 3d368ab87cf6 ("net:
initialize net->net_cookie at netns setup") to allow retrieval via
SO_NETNS_COOKIE.  This is also in line with how we retrieve the socket
cookie via SO_COOKIE.

  [1] https://kind.sigs.k8s.io/

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Cc: Eric Dumazet <edumazet@google.com>
---
 arch/alpha/include/uapi/asm/socket.h  | 2 ++
 arch/mips/include/uapi/asm/socket.h   | 2 ++
 arch/parisc/include/uapi/asm/socket.h | 2 ++
 arch/sparc/include/uapi/asm/socket.h  | 2 ++
 include/uapi/asm-generic/socket.h     | 2 ++
 net/core/sock.c                       | 9 +++++++++
 6 files changed, 19 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 57420356ce4c..6b3daba60987 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -127,6 +127,8 @@
 #define SO_PREFER_BUSY_POLL	69
 #define SO_BUSY_POLL_BUDGET	70
 
+#define SO_NETNS_COOKIE		71
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 2d949969313b..cdf404a831b2 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -138,6 +138,8 @@
 #define SO_PREFER_BUSY_POLL	69
 #define SO_BUSY_POLL_BUDGET	70
 
+#define SO_NETNS_COOKIE		71
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index f60904329bbc..5b5351cdcb33 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -119,6 +119,8 @@
 #define SO_PREFER_BUSY_POLL	0x4043
 #define SO_BUSY_POLL_BUDGET	0x4044
 
+#define SO_NETNS_COOKIE		0x4045
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 848a22fbac20..92675dc380fa 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -120,6 +120,8 @@
 #define SO_PREFER_BUSY_POLL	 0x0048
 #define SO_BUSY_POLL_BUDGET	 0x0049
 
+#define SO_NETNS_COOKIE          0x0050
+
 #if !defined(__KERNEL__)
 
 
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 4dcd13d097a9..d588c244ec2f 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -122,6 +122,8 @@
 #define SO_PREFER_BUSY_POLL	69
 #define SO_BUSY_POLL_BUDGET	70
 
+#define SO_NETNS_COOKIE		71
+
 #if !defined(__KERNEL__)
 
 #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/net/core/sock.c b/net/core/sock.c
index ddfa88082a2b..462fe1fb2056 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1635,6 +1635,15 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 		v.val = sk->sk_bound_dev_if;
 		break;
 
+#ifdef CONFIG_NET_NS
+	case SO_NETNS_COOKIE:
+		lv = sizeof(u64);
+		if (len != lv)
+			return -EINVAL;
+		v.val64 = sock_net(sk)->net_cookie;
+		break;
+#endif
+
 	default:
 		/* We implement the SO_SNDLOWAT etc to not be settable
 		 * (1003.1g 7).
-- 
2.32.0



* [PATCH net-next 2/2] tools/testing: add a selftest for SO_NETNS_COOKIE
@ 2021-06-22 20:26 ` Martynas Pumputis
From: Martynas Pumputis @ 2021-06-22 20:26 UTC
  To: netdev; +Cc: edumazet, daniel, ast, Lorenz Bauer

From: Lorenz Bauer <lmb@cloudflare.com>

Make sure that SO_NETNS_COOKIE returns a non-zero value, and
that sockets from different namespaces have distinct cookie
values.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 tools/testing/selftests/net/.gitignore        |  1 +
 tools/testing/selftests/net/Makefile          |  2 +-
 tools/testing/selftests/net/config            |  1 +
 tools/testing/selftests/net/so_netns_cookie.c | 61 +++++++++++++++++++
 4 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/net/so_netns_cookie.c

diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
index 61ae899cfc17..19deb9cdf72f 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -30,3 +30,4 @@ hwtstamp_config
 rxtimestamp
 timestamping
 txtimestamp
+so_netns_cookie
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 3915bb7bfc39..79c9eb0034d5 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -30,7 +30,7 @@ TEST_GEN_FILES =  socket nettest
 TEST_GEN_FILES += psock_fanout psock_tpacket msg_zerocopy reuseport_addr_any
 TEST_GEN_FILES += tcp_mmap tcp_inq psock_snd txring_overwrite
 TEST_GEN_FILES += udpgso udpgso_bench_tx udpgso_bench_rx ip_defrag
-TEST_GEN_FILES += so_txtime ipv6_flowlabel ipv6_flowlabel_mgr
+TEST_GEN_FILES += so_txtime ipv6_flowlabel ipv6_flowlabel_mgr so_netns_cookie
 TEST_GEN_FILES += tcp_fastopen_backup_key
 TEST_GEN_FILES += fin_ack_lat
 TEST_GEN_FILES += reuseaddr_ports_exhausted
diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
index 614d5477365a..6f905b53904f 100644
--- a/tools/testing/selftests/net/config
+++ b/tools/testing/selftests/net/config
@@ -1,4 +1,5 @@
 CONFIG_USER_NS=y
+CONFIG_NET_NS=y
 CONFIG_BPF_SYSCALL=y
 CONFIG_TEST_BPF=m
 CONFIG_NUMA=y
diff --git a/tools/testing/selftests/net/so_netns_cookie.c b/tools/testing/selftests/net/so_netns_cookie.c
new file mode 100644
index 000000000000..b39e87e967cd
--- /dev/null
+++ b/tools/testing/selftests/net/so_netns_cookie.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <sched.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+
+#ifndef SO_NETNS_COOKIE
+#define SO_NETNS_COOKIE 71
+#endif
+
+#define pr_err(fmt, ...) \
+	({ \
+		fprintf(stderr, "%s:%d:" fmt ": %m\n", \
+			__func__, __LINE__, ##__VA_ARGS__); \
+		1; \
+	})
+
+int main(int argc, char *argvp[])
+{
+	uint64_t cookie1, cookie2;
+	socklen_t vallen;
+	int sock1, sock2;
+
+	sock1 = socket(AF_INET, SOCK_STREAM, 0);
+	if (sock1 < 0)
+		return pr_err("Unable to create TCP socket");
+
+	vallen = sizeof(cookie1);
+	if (getsockopt(sock1, SOL_SOCKET, SO_NETNS_COOKIE, &cookie1, &vallen) != 0)
+		return pr_err("getsockopt(SOL_SOCKET, SO_NETNS_COOKIE)");
+
+	if (!cookie1)
+		return pr_err("SO_NETNS_COOKIE returned zero cookie");
+
+	if (unshare(CLONE_NEWNET))
+		return pr_err("unshare");
+
+	sock2 = socket(AF_INET, SOCK_STREAM, 0);
+	if (sock2 < 0)
+		return pr_err("Unable to create TCP socket");
+
+	vallen = sizeof(cookie2);
+	if (getsockopt(sock2, SOL_SOCKET, SO_NETNS_COOKIE, &cookie2, &vallen) != 0)
+		return pr_err("getsockopt(SOL_SOCKET, SO_NETNS_COOKIE)");
+
+	if (!cookie2)
+		return pr_err("SO_NETNS_COOKIE returned zero cookie");
+
+	if (cookie1 == cookie2)
+		return pr_err("SO_NETNS_COOKIE returned identical cookies for distinct ns");
+
+	close(sock1);
+	close(sock2);
+	return 0;
+}
-- 
2.32.0



* Re: [PATCH net-next 1/2] net: retrieve netns cookie via getsocketopt
@ 2021-06-23  7:45 ` Lorenz Bauer
From: Lorenz Bauer @ 2021-06-23  7:45 UTC
  To: Martynas Pumputis
  Cc: Networking, Eric Dumazet, Daniel Borkmann, Alexei Starovoitov

On Tue, 22 Jun 2021 at 21:24, Martynas Pumputis <m@lambda.lt> wrote:
>
> [...]
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index ddfa88082a2b..462fe1fb2056 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1635,6 +1635,15 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>                 v.val = sk->sk_bound_dev_if;
>                 break;
>
> +#ifdef CONFIG_NET_NS

Nit: sock_net() already takes care of CONFIG_NET_NS and returns the root
ns if !NET_NS, so this #ifdef is not necessary. I think this behaviour
is nicer: uapi stays consistent even if kernel config changes.

> +       case SO_NETNS_COOKIE:
> +               lv = sizeof(u64);
> +               if (len != lv)
> +                       return -EINVAL;
> +               v.val64 = sock_net(sk)->net_cookie;
> +               break;
> +#endif
> +
>         default:
>                 /* We implement the SO_SNDLOWAT etc to not be settable
>                  * (1003.1g 7).
> --
> 2.32.0
>


-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com
