lvs-devel.vger.kernel.org archive mirror
* [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks
@ 2023-09-26 20:05 Jordan Rife
  2023-09-26 20:05 ` [PATCH net v6 1/3] net: replace calls to sock->ops->connect() with kernel_connect() Jordan Rife
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Jordan Rife @ 2023-09-26 20:05 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, netdev
  Cc: dborkman, horms, pablo, kadlec, fw, santosh.shilimkar, ast, rdna,
	linux-rdma, rds-devel, coreteam, netfilter-devel, ja, lvs-devel,
	kafai, daniel, daan.j.demeyer, Jordan Rife

==OVERVIEW==

The sock_sendmsg(), kernel_connect(), and kernel_bind() functions
provide kernel space equivalents to the sendmsg(), connect(), and bind()
system calls.

When used in conjunction with BPF SOCK_ADDR hooks that rewrite the send,
connect, or bind address, callers may observe that the address passed to
the call is modified. This is a problem not just in theory, but in
practice, with uninsulated calls to kernel_connect() causing issues with
broken NFS and CIFS mounts.
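
As a rough userspace illustration of the failure mode (not kernel code),
the sketch below models a call path where the hook operates directly on
the caller's buffer. Every model_* name and both port numbers are
hypothetical stand-ins chosen for the example; a real SOCK_ADDR hook is a
BPF program attached to a cgroup, not a C function.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define MODEL_ORIG_PORT    2049	/* port the caller asks for (illustrative) */
#define MODEL_REWRITE_PORT 9999	/* port the model hook substitutes */

/* Hypothetical stand-in for a connect hook that rewrites the address in place. */
void model_connect_hook(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)addr;

	assert(addrlen >= (int)sizeof(*sin));
	if (ntohs(sin->sin_port) == MODEL_ORIG_PORT)
		sin->sin_port = htons(MODEL_REWRITE_PORT);
}

/* Models an uninsulated call path: the hook sees the caller's own buffer. */
int model_connect_uninsulated(struct sockaddr *addr, int addrlen)
{
	model_connect_hook(addr, addrlen);
	return 0;
}

/* Returns the port left in the caller's address after one such call. */
unsigned short model_port_seen_by_caller(void)
{
	struct sockaddr_in server;

	memset(&server, 0, sizeof(server));
	server.sin_family = AF_INET;
	server.sin_port = htons(MODEL_ORIG_PORT);
	server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	model_connect_uninsulated((struct sockaddr *)&server, sizeof(server));
	return ntohs(server.sin_port);
}
```

In this model the caller is left holding the rewritten port. A caller
that keeps its address around and reuses it later, for a reconnect for
instance, would then be working with the rewritten value rather than the
one it originally supplied.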

commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
ensured that callers to kernel_connect() are insulated from such effects
by passing a copy of the address parameter down the stack, but did not
go far enough:

- There remain many instances of direct calls to sock->ops->connect()
  throughout the kernel which do not benefit from the change to
  kernel_connect().
- sock_sendmsg() and kernel_bind() remain uninsulated from address
  rewrites and there exist many direct calls to sock->ops->bind()
  throughout the kernel.

This patch series is the first step to ensuring all socket operations in
kernel space are safe to use with BPF SOCK_ADDR hooks. It

1) Wraps direct calls to sock->ops->connect() with kernel_connect() to
   insulate them.
2) Introduces an address copy to sock_sendmsg() to insulate both calls
   to kernel_sendmsg() and sock_sendmsg() in kernel space.
3) Introduces an address copy to kernel_bind() and wraps direct calls to
   sock->ops->bind() to insulate them.

Earlier versions of this patch series wrapped all calls to
sock->ops->connect() and sock->ops->bind() throughout the kernel, but
this was pared down to instances occurring only in net to avoid merge
conflicts. A set of patches to various trees will be made as a follow up
to this series to address this gap.

==CHANGELOG==

V5->V6
------
- Preserve original value of msg->msg_namelen in sock_sendmsg() in
  anticipation of this patch that adds support for SOCK_ADDR hooks to
  Unix sockets and the ability to modify msg->msg_namelen:
  - https://lore.kernel.org/bpf/202309231339.L2O0CrMU-lkp@intel.com/T/#m181770af51156bdaa70fd4a4cb013ba11f28e101

V4->V5
------
- Removed non-net changes to avoid potential merge conflicts.

V3->V4
------
- Removed address length precondition checks from kernel_connect() and
  kernel_bind().
- Reordered variable declarations in sock_sendmsg() to maintain reverse
  xmas tree order.

V2->V3
------
- Added "Fixes" tags
- Added address length precondition checks to kernel_connect() and
  kernel_bind().

V1->V2
------
- Split up single patch into patch series.
- Wrapped all direct calls to sock->ops->connect() with kernel_connect()
  instead of pushing the address deeper into the stack to avoid
  duplication of address copy logic and to encourage a consistent
  interface.
- Moved address copy up the stack to sock_sendmsg() to avoid duplication
  of address copy logic.
- Introduced address copy to kernel_bind() and insulated direct calls to
  sock->ops->bind().

Jordan Rife (3):
  net: replace calls to sock->ops->connect() with kernel_connect()
  net: prevent rewrite of msg_name and msg_namelen in sock_sendmsg()
  net: prevent address rewrite in kernel_bind()

 net/netfilter/ipvs/ip_vs_sync.c |  8 ++++----
 net/rds/tcp_connect.c           |  4 ++--
 net/rds/tcp_listen.c            |  2 +-
 net/socket.c                    | 36 ++++++++++++++++++++++++++-------
 4 files changed, 36 insertions(+), 14 deletions(-)

-- 
2.42.0.515.g380fc7ccd1-goog



* [PATCH net v6 1/3] net: replace calls to sock->ops->connect() with kernel_connect()
  2023-09-26 20:05 [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks Jordan Rife
@ 2023-09-26 20:05 ` Jordan Rife
  2023-09-26 20:05 ` [PATCH net v6 2/3] net: prevent rewrite of msg_name and msg_namelen in sock_sendmsg() Jordan Rife
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Jordan Rife @ 2023-09-26 20:05 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, netdev
  Cc: dborkman, horms, pablo, kadlec, fw, santosh.shilimkar, ast, rdna,
	linux-rdma, rds-devel, coreteam, netfilter-devel, ja, lvs-devel,
	kafai, daniel, daan.j.demeyer, Jordan Rife, stable,
	Willem de Bruijn

commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
ensured that kernel_connect() will not overwrite the address parameter
in cases where BPF connect hooks perform an address rewrite. This change
replaces direct calls to sock->ops->connect() in net with kernel_connect()
to make these calls safe.

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jordan Rife <jrife@google.com>
---
 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 net/rds/tcp_connect.c           | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index da5af28ff57b5..6e4ed1e11a3b7 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1505,8 +1505,8 @@ static int make_send_sock(struct netns_ipvs *ipvs, int id,
 	}
 
 	get_mcast_sockaddr(&mcast_addr, &salen, &ipvs->mcfg, id);
-	result = sock->ops->connect(sock, (struct sockaddr *) &mcast_addr,
-				    salen, 0);
+	result = kernel_connect(sock, (struct sockaddr *)&mcast_addr,
+				salen, 0);
 	if (result < 0) {
 		pr_err("Error connecting to the multicast addr\n");
 		goto error;
diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index f0c477c5d1db4..d788c6d28986f 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -173,7 +173,7 @@ int rds_tcp_conn_path_connect(struct rds_conn_path *cp)
 	 * own the socket
 	 */
 	rds_tcp_set_callbacks(sock, cp);
-	ret = sock->ops->connect(sock, addr, addrlen, O_NONBLOCK);
+	ret = kernel_connect(sock, addr, addrlen, O_NONBLOCK);
 
 	rdsdebug("connect to address %pI6c returned %d\n", &conn->c_faddr, ret);
 	if (ret == -EINPROGRESS)
-- 
2.42.0.515.g380fc7ccd1-goog



* [PATCH net v6 2/3] net: prevent rewrite of msg_name and msg_namelen in sock_sendmsg()
  2023-09-26 20:05 [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks Jordan Rife
  2023-09-26 20:05 ` [PATCH net v6 1/3] net: replace calls to sock->ops->connect() with kernel_connect() Jordan Rife
@ 2023-09-26 20:05 ` Jordan Rife
  2023-09-26 20:05 ` [PATCH net v6 3/3] net: prevent address rewrite in kernel_bind() Jordan Rife
  2023-10-01 18:40 ` [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Jordan Rife @ 2023-09-26 20:05 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, netdev
  Cc: dborkman, horms, pablo, kadlec, fw, santosh.shilimkar, ast, rdna,
	linux-rdma, rds-devel, coreteam, netfilter-devel, ja, lvs-devel,
	kafai, daniel, daan.j.demeyer, Jordan Rife, stable

Callers of sock_sendmsg(), and similarly kernel_sendmsg(), in kernel
space may observe their value of msg_name change in cases where BPF
sendmsg hooks rewrite the send address. This has been confirmed to break
NFS mounts running in UDP mode and has the potential to break other
systems.

Soon, support will be added for BPF sockaddr hooks for Unix sockets,
which will introduce the ability to modify the msg->msg_namelen value.

This patch:

1) Creates a new function called __sock_sendmsg() with the same logic as
   the old sock_sendmsg() function.
2) Replaces calls to sock_sendmsg() made by __sys_sendto() and
   __sys_sendmsg() with __sock_sendmsg() to avoid an unnecessary copy,
   as these system calls are already protected.
3) Makes a copy of msg->msg_name to insulate callers.
4) Makes a copy of msg->msg_namelen to insulate callers in anticipation
   of the aforementioned change to support Unix sockets.
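
The save/copy/restore pattern in steps 3 and 4 can be sketched in
userspace roughly as follows. This is a model under stated assumptions,
not the kernel implementation: the model_* functions are hypothetical,
the hook is an ordinary C function standing in for a BPF sendmsg hook,
and the assert() on the length is a model-only guard.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical hook: rewrites the destination port and shrinks msg_namelen. */
void model_sendmsg_hook(struct msghdr *msg)
{
	struct sockaddr_in *sin = msg->msg_name;

	if (!sin || msg->msg_namelen < sizeof(*sin))
		return;
	sin->sin_port = htons(9999);
	msg->msg_namelen = sizeof(sin->sin_family);
}

/* Models the inner, uninsulated send path (the role of __sock_sendmsg()). */
int model_inner_sendmsg(struct msghdr *msg)
{
	model_sendmsg_hook(msg);
	return 0;
}

/* Models the insulated wrapper: copy the address, call down, restore fields. */
int model_sock_sendmsg(struct msghdr *msg)
{
	void *save_addr = msg->msg_name;
	socklen_t save_addrlen = msg->msg_namelen;
	struct sockaddr_storage address;
	int ret;

	if (msg->msg_name) {
		assert(msg->msg_namelen <= sizeof(address)); /* model-only guard */
		memcpy(&address, msg->msg_name, msg->msg_namelen);
		msg->msg_name = &address;
	}

	ret = model_inner_sendmsg(msg);
	msg->msg_name = save_addr;
	msg->msg_namelen = save_addrlen;

	return ret;
}

/* Returns 1 if the caller's msghdr and address are unchanged after send_fn. */
int model_caller_unchanged(int (*send_fn)(struct msghdr *))
{
	struct sockaddr_in dst;
	struct msghdr msg;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(2049);

	memset(&msg, 0, sizeof(msg));
	msg.msg_name = &dst;
	msg.msg_namelen = sizeof(dst);

	send_fn(&msg);
	return msg.msg_name == &dst && msg.msg_namelen == sizeof(dst) &&
	       ntohs(dst.sin_port) == 2049;
}
```

Passing model_inner_sendmsg to model_caller_unchanged() shows the
rewrite leaking back to the caller, while model_sock_sendmsg hides it.
Restoring msg_name matters as much as copying it: the local copy lives on
the wrapper's stack, so the caller must not be left pointing at it.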

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Link: https://lore.kernel.org/bpf/202309231339.L2O0CrMU-lkp@intel.com/T/#m181770af51156bdaa70fd4a4cb013ba11f28e101
Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg")
Cc: stable@vger.kernel.org
Signed-off-by: Jordan Rife <jrife@google.com>
---
 net/socket.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index c8b08b32f097e..107a257a75186 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -737,6 +737,14 @@ static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 	return ret;
 }
 
+static int __sock_sendmsg(struct socket *sock, struct msghdr *msg)
+{
+	int err = security_socket_sendmsg(sock, msg,
+					  msg_data_left(msg));
+
+	return err ?: sock_sendmsg_nosec(sock, msg);
+}
+
 /**
  *	sock_sendmsg - send a message through @sock
  *	@sock: socket
@@ -747,10 +755,21 @@ static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
  */
 int sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	int err = security_socket_sendmsg(sock, msg,
-					  msg_data_left(msg));
+	struct sockaddr_storage *save_addr = (struct sockaddr_storage *)msg->msg_name;
+	int save_addrlen = msg->msg_namelen;
+	struct sockaddr_storage address;
+	int ret;
 
-	return err ?: sock_sendmsg_nosec(sock, msg);
+	if (msg->msg_name) {
+		memcpy(&address, msg->msg_name, msg->msg_namelen);
+		msg->msg_name = &address;
+	}
+
+	ret = __sock_sendmsg(sock, msg);
+	msg->msg_name = save_addr;
+	msg->msg_namelen = save_addrlen;
+
+	return ret;
 }
 EXPORT_SYMBOL(sock_sendmsg);
 
@@ -1138,7 +1157,7 @@ static ssize_t sock_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	if (sock->type == SOCK_SEQPACKET)
 		msg.msg_flags |= MSG_EOR;
 
-	res = sock_sendmsg(sock, &msg);
+	res = __sock_sendmsg(sock, &msg);
 	*from = msg.msg_iter;
 	return res;
 }
@@ -2174,7 +2193,7 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags,
 	if (sock->file->f_flags & O_NONBLOCK)
 		flags |= MSG_DONTWAIT;
 	msg.msg_flags = flags;
-	err = sock_sendmsg(sock, &msg);
+	err = __sock_sendmsg(sock, &msg);
 
 out_put:
 	fput_light(sock->file, fput_needed);
@@ -2538,7 +2557,7 @@ static int ____sys_sendmsg(struct socket *sock, struct msghdr *msg_sys,
 		err = sock_sendmsg_nosec(sock, msg_sys);
 		goto out_freectl;
 	}
-	err = sock_sendmsg(sock, msg_sys);
+	err = __sock_sendmsg(sock, msg_sys);
 	/*
 	 * If this is sendmmsg() and sending to current destination address was
 	 * successful, remember it.
-- 
2.42.0.515.g380fc7ccd1-goog



* [PATCH net v6 3/3] net: prevent address rewrite in kernel_bind()
  2023-09-26 20:05 [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks Jordan Rife
  2023-09-26 20:05 ` [PATCH net v6 1/3] net: replace calls to sock->ops->connect() with kernel_connect() Jordan Rife
  2023-09-26 20:05 ` [PATCH net v6 2/3] net: prevent rewrite of msg_name and msg_namelen in sock_sendmsg() Jordan Rife
@ 2023-09-26 20:05 ` Jordan Rife
  2023-10-01 18:40 ` [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Jordan Rife @ 2023-09-26 20:05 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, netdev
  Cc: dborkman, horms, pablo, kadlec, fw, santosh.shilimkar, ast, rdna,
	linux-rdma, rds-devel, coreteam, netfilter-devel, ja, lvs-devel,
	kafai, daniel, daan.j.demeyer, Jordan Rife, stable,
	Willem de Bruijn

Similar to the change in commit 0bdf399342c5 ("net: Avoid address
overwrite in kernel_connect"), BPF hooks run on bind may rewrite the
address passed to kernel_bind(). This change

1) Makes a copy of the bind address in kernel_bind() to insulate
   callers.
2) Replaces direct calls to sock->ops->bind() in net with kernel_bind().
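
A minimal userspace sketch of the copy in step 1, assuming a
hypothetical hook that rewrites the bind address in place. The model_*
names are illustrative only, and the length guard in model_kernel_bind()
is specific to this model; the kernel_bind() in this patch performs the
copy without such a check.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical bind hook: forces the bind address to loopback, in place. */
void model_bind_hook(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)addr;

	assert(addrlen >= (int)sizeof(*sin));
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
}

/* Models a direct sock->ops->bind() call: the hook sees the caller's buffer. */
int model_ops_bind(struct sockaddr *addr, int addrlen)
{
	model_bind_hook(addr, addrlen);
	return 0;
}

/* Models the insulated path: the hook only ever sees a stack-local copy. */
int model_kernel_bind(struct sockaddr *addr, int addrlen)
{
	struct sockaddr_storage address;

	/* Model-only guard so the memcpy below cannot overrun the copy. */
	if (addrlen < 0 || (size_t)addrlen > sizeof(address))
		return -EINVAL;

	memcpy(&address, addr, addrlen);
	return model_ops_bind((struct sockaddr *)&address, addrlen);
}

/* Returns the IPv4 address (host order) the caller holds after bind_fn. */
unsigned int model_bind_addr_after(int (*bind_fn)(struct sockaddr *, int))
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = 0;

	bind_fn((struct sockaddr *)&sin, sizeof(sin));
	return ntohl(sin.sin_addr.s_addr);
}
```

With the direct call the caller's address comes back as loopback; with
the copying wrapper it stays INADDR_ANY, since the rewrite lands only in
the local sockaddr_storage.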

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind")
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jordan Rife <jrife@google.com>
---
 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 net/rds/tcp_connect.c           | 2 +-
 net/rds/tcp_listen.c            | 2 +-
 net/socket.c                    | 7 ++++++-
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 6e4ed1e11a3b7..4174076c66fa7 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1439,7 +1439,7 @@ static int bind_mcastif_addr(struct socket *sock, struct net_device *dev)
 	sin.sin_addr.s_addr  = addr;
 	sin.sin_port         = 0;
 
-	return sock->ops->bind(sock, (struct sockaddr*)&sin, sizeof(sin));
+	return kernel_bind(sock, (struct sockaddr *)&sin, sizeof(sin));
 }
 
 static void get_mcast_sockaddr(union ipvs_sockaddr *sa, int *salen,
@@ -1546,7 +1546,7 @@ static int make_receive_sock(struct netns_ipvs *ipvs, int id,
 
 	get_mcast_sockaddr(&mcast_addr, &salen, &ipvs->bcfg, id);
 	sock->sk->sk_bound_dev_if = dev->ifindex;
-	result = sock->ops->bind(sock, (struct sockaddr *)&mcast_addr, salen);
+	result = kernel_bind(sock, (struct sockaddr *)&mcast_addr, salen);
 	if (result < 0) {
 		pr_err("Error binding to the multicast addr\n");
 		goto error;
diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index d788c6d28986f..a0046e99d6df7 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -145,7 +145,7 @@ int rds_tcp_conn_path_connect(struct rds_conn_path *cp)
 		addrlen = sizeof(sin);
 	}
 
-	ret = sock->ops->bind(sock, addr, addrlen);
+	ret = kernel_bind(sock, addr, addrlen);
 	if (ret) {
 		rdsdebug("bind failed with %d at address %pI6c\n",
 			 ret, &conn->c_laddr);
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 014fa24418c12..53b3535a1e4a8 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -306,7 +306,7 @@ struct socket *rds_tcp_listen_init(struct net *net, bool isv6)
 		addr_len = sizeof(*sin);
 	}
 
-	ret = sock->ops->bind(sock, (struct sockaddr *)&ss, addr_len);
+	ret = kernel_bind(sock, (struct sockaddr *)&ss, addr_len);
 	if (ret < 0) {
 		rdsdebug("could not bind %s listener socket: %d\n",
 			 isv6 ? "IPv6" : "IPv4", ret);
diff --git a/net/socket.c b/net/socket.c
index 107a257a75186..3408bd6bb1e5a 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3518,7 +3518,12 @@ static long compat_sock_ioctl(struct file *file, unsigned int cmd,
 
 int kernel_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
 {
-	return READ_ONCE(sock->ops)->bind(sock, addr, addrlen);
+	struct sockaddr_storage address;
+
+	memcpy(&address, addr, addrlen);
+
+	return READ_ONCE(sock->ops)->bind(sock, (struct sockaddr *)&address,
+					  addrlen);
 }
 EXPORT_SYMBOL(kernel_bind);
 
-- 
2.42.0.515.g380fc7ccd1-goog



* Re: [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks
  2023-09-26 20:05 [PATCH net v6 0/3] Insulate Kernel Space From SOCK_ADDR Hooks Jordan Rife
                   ` (2 preceding siblings ...)
  2023-09-26 20:05 ` [PATCH net v6 3/3] net: prevent address rewrite in kernel_bind() Jordan Rife
@ 2023-10-01 18:40 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-10-01 18:40 UTC (permalink / raw)
  To: Jordan Rife
  Cc: davem, edumazet, kuba, pabeni, willemdebruijn.kernel, netdev,
	dborkman, horms, pablo, kadlec, fw, santosh.shilimkar, ast, rdna,
	linux-rdma, rds-devel, coreteam, netfilter-devel, ja, lvs-devel,
	kafai, daniel, daan.j.demeyer

Hello:

This series was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Tue, 26 Sep 2023 15:05:02 -0500 you wrote:
> ==OVERVIEW==
> 
> The sock_sendmsg(), kernel_connect(), and kernel_bind() functions
> provide kernel space equivalents to the sendmsg(), connect(), and bind()
> system calls.
> 
> When used in conjunction with BPF SOCK_ADDR hooks that rewrite the send,
> connect, or bind address, callers may observe that the address passed to
> the call is modified. This is a problem not just in theory, but in
> practice, with uninsulated calls to kernel_connect() causing issues with
> broken NFS and CIFS mounts.
> 
> [...]

Here is the summary with links:
  - [net,v6,1/3] net: replace calls to sock->ops->connect() with kernel_connect()
    https://git.kernel.org/netdev/net/c/26297b4ce1ce
  - [net,v6,2/3] net: prevent rewrite of msg_name and msg_namelen in sock_sendmsg()
    (no matching commit)
  - [net,v6,3/3] net: prevent address rewrite in kernel_bind()
    https://git.kernel.org/netdev/net/c/c889a99a21bf

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


