linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM.
@ 2022-09-28  0:27 Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  0:27 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

This series fixes some memory leaks and data races caused in the
same scenario where one thread converts an IPv6 socket into IPv4
with IPV6_ADDRFORM and another accesses the socket concurrently.

Note patch 1 and 5 conflict with these commits in net-next, respectively.

  * 24426654ed3a ("bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf")
  * 34704ef024ae ("bpf: net: Change do_tcp_getsockopt() to take the sockptr_t argument")


Changes:
  v2:
    * Patch 3:
      * Move inet6_destroy_sock() from sk_prot->destroy()
        to sk->sk_destruct() and fix CONFIG_IPV6=m build failure
    * Patch 5:
      * Add WRITE_ONCE()s in tcp_v6_connect()
      * Add Reported-by tags and KCSAN log in changelog

  v1: https://lore.kernel.org/netdev/20220927161209.32939-1-kuniyu@amazon.com/


Kuniyuki Iwashima (5):
  tcp/udp: Fix memory leak in ipv6_renew_options().
  udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  ipv6: Fix data races around sk->sk_prot.
  tcp: Fix data races around icsk->icsk_af_ops.

 include/net/ipv6.h       |  1 +
 include/net/udp.h        |  2 +-
 net/core/sock.c          |  6 ++++--
 net/ipv4/af_inet.c       | 23 ++++++++++++++++-------
 net/ipv4/tcp.c           | 10 ++++++----
 net/ipv4/udp.c           |  8 ++++++--
 net/ipv6/af_inet6.c      |  9 ++++++++-
 net/ipv6/ipv6_sockglue.c | 29 ++++++++++++++---------------
 net/ipv6/tcp_ipv6.c      |  6 ++++--
 net/ipv6/udp.c           | 15 ++++++++++++++-
 10 files changed, 74 insertions(+), 35 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options().
  2022-09-28  0:27 [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
@ 2022-09-28  0:27 ` Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  0:27 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, syzbot

syzbot reported a memory leak [0] related to IPV6_ADDRFORM.

The scenario is that while one thread is converting an IPv6 socket into
IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and
allocates memory to inet6_sk(sk)->XXX after conversion.

Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources,
which inet6_destroy_sock() should have cleaned up.

setsockopt(IPV6_ADDRFORM)                 setsockopt(IPV6_DSTOPTS)
+-----------------------+                 +----------------------+
- do_ipv6_setsockopt(sk, ...)
  - lock_sock(sk)                         - do_ipv6_setsockopt(sk, ...)
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)      ^._ called via tcpv6_prot
  - xchg(&np->opt, NULL)                        before WRITE_ONCE()
  - txopt_put(opt)
  - release_sock(sk)
                                            - lock_sock(sk)
                                            - ipv6_set_opt_hdr(sk, ...)
                                              - ipv6_update_options(sk, opt)
                                                - xchg(&inet6_sk(sk)->opt, opt)
                                                  ^._ opt is never freed.

                                            - release_sock(sk)

Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this
memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after
acquiring the lock.

This issue exists from the initial commit between IPV6_ADDRFORM and
IPV6_PKTOPTIONS.

[0]:
BUG: memory leak
unreferenced object 0xffff888009ab9f80 (size 96):
  comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s)
  hex dump (first 32 bytes):
    01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00  ....H...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline]
    [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566
    [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318
    [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline]
    [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668
    [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021
    [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789
    [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252
    [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline]
    [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline]
    [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260
    [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
    [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ipv6_sockglue.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index e0dcc7a193df..b61066ac8648 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -419,6 +419,12 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		rtnl_lock();
 	lock_sock(sk);
 
+	/* Another thread has converted the socket into IPv4 with
+	 * IPV6_ADDRFORM concurrently.
+	 */
+	if (unlikely(sk->sk_family != AF_INET6))
+		goto unlock;
+
 	switch (optname) {
 
 	case IPV6_ADDRFORM:
@@ -994,6 +1000,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 	}
 
+unlock:
 	release_sock(sk);
 	if (needs_rtnl)
 		rtnl_unlock();
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  2022-09-28  0:27 [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
@ 2022-09-28  0:27 ` Kuniyuki Iwashima
  2022-09-28  5:05   ` Eric Dumazet
  2022-09-28  0:27 ` [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  0:27 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

Commit 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") forgot
to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
changed to udp_prot and ->destroy() never cleans it up, resulting in
a memory leak.

This is due to the discrepancy between inet6_destroy_sock() and
IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
to remove the difference.

However, this is not enough for now because rxpmtu can be changed
without lock_sock() after commit 03485f2adcde ("udpv6: Add lockless
sendmsg() support").  We will fix this case in the following patch.

Fixes: 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ipv6_sockglue.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index b61066ac8648..030a4cf23ceb 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -431,9 +431,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		if (optlen < sizeof(int))
 			goto e_inval;
 		if (val == PF_INET) {
-			struct ipv6_txoptions *opt;
-			struct sk_buff *pktopt;
-
 			if (sk->sk_type == SOCK_RAW)
 				break;
 
@@ -464,7 +461,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				break;
 			}
 
-			fl6_free_socklist(sk);
 			__ipv6_sock_mc_close(sk);
 			__ipv6_sock_ac_close(sk);
 
@@ -501,14 +497,9 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sk->sk_socket->ops = &inet_dgram_ops;
 				sk->sk_family = PF_INET;
 			}
-			opt = xchg((__force struct ipv6_txoptions **)&np->opt,
-				   NULL);
-			if (opt) {
-				atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
-				txopt_put(opt);
-			}
-			pktopt = xchg(&np->pktoptions, NULL);
-			kfree_skb(pktopt);
+
+			np->rxopt.all = 0;
+			inet6_destroy_sock(sk);
 
 			/*
 			 * ... and add it to the refcnt debug socks count
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-09-28  0:27 [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
@ 2022-09-28  0:27 ` Kuniyuki Iwashima
  2022-09-28  3:43   ` Eric Dumazet
  2022-09-28  0:27 ` [PATCH v2 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
  4 siblings, 1 reply; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  0:27 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
Add lockless sendmsg() support") added a lockless memory allocation path,
which could cause a memory leak:

setsockopt(IPV6_ADDRFORM)                 sendmsg()
+-----------------------+                 +-------+
- do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
  - lock_sock(sk)                           ^._ called via udpv6_prot
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
  - inet6_destroy_sock()
  - release_sock(sk)                        - ip6_make_skb(sk, ...)
                                              ^._ lockless fast path for
                                                  the non-corking case

                                              - __ip6_append_data(sk, ...)
                                                - ipv6_local_rxpmtu(sk, ...)
                                                  - xchg(&np->rxpmtu, skb)
                                                    ^._ rxpmtu is never freed.

                                            - lock_sock(sk)

For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
in IPv6 sk->sk_destruct() not to miss the future change and a similar
bug fixed in commit e27326009a3d ("net: ping6: Fix memleak in
ipv6_renew_options().")

We can now remove all inet6_destroy_sock() calls from IPv6 protocol
specific ->destroy() functions, but such changes are invasive to
backport.  So they can be posted as a follow-up later for net-next.

Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/ipv6.h  |  1 +
 include/net/udp.h   |  2 +-
 net/ipv4/udp.c      |  8 ++++++--
 net/ipv6/af_inet6.c |  9 ++++++++-
 net/ipv6/udp.c      | 15 ++++++++++++++-
 5 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index de9dcc5652c4..11f1a9a8b066 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1178,6 +1178,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
 void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
 void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
 
+void inet6_sock_destruct(struct sock *sk);
 int inet6_release(struct socket *sock);
 int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
 int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
diff --git a/include/net/udp.h b/include/net/udp.h
index 5ee88ddf79c3..fee053bcd17c 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
 }
 
 /* net/ipv4/udp.c */
-void udp_destruct_sock(struct sock *sk);
+void udp_destruct_common(struct sock *sk);
 void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
 int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
 void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 560d9eadeaa5..a84ae44db7e2 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
 
-void udp_destruct_sock(struct sock *sk)
+void udp_destruct_common(struct sock *sk)
 {
 	/* reclaim completely the forward allocated memory */
 	struct udp_sock *up = udp_sk(sk);
@@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
 		kfree_skb(skb);
 	}
 	udp_rmem_release(sk, total, 0, true);
+}
+EXPORT_SYMBOL_GPL(udp_destruct_common);
 
+static void udp_destruct_sock(struct sock *sk)
+{
+	udp_destruct_common(sk);
 	inet_sock_destruct(sk);
 }
-EXPORT_SYMBOL_GPL(udp_destruct_sock);
 
 int udp_init_sock(struct sock *sk)
 {
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index dbb1430d6cc2..0774cff62f2d 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
 	return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
 }
 
+void inet6_sock_destruct(struct sock *sk)
+{
+	inet6_destroy_sock(sk);
+	inet_sock_destruct(sk);
+}
+EXPORT_SYMBOL_GPL(inet6_sock_destruct);
+
 static int inet6_create(struct net *net, struct socket *sock, int protocol,
 			int kern)
 {
@@ -201,7 +208,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 			inet->hdrincl = 1;
 	}
 
-	sk->sk_destruct		= inet_sock_destruct;
+	sk->sk_destruct		= inet6_sock_destruct;
 	sk->sk_family		= PF_INET6;
 	sk->sk_protocol		= protocol;
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3366d6a77ff2..a5256f7184ab 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -56,6 +56,19 @@
 #include <trace/events/skb.h>
 #include "udp_impl.h"
 
+static void udpv6_destruct_sock(struct sock *sk)
+{
+	udp_destruct_common(sk);
+	inet6_sock_destruct(sk);
+}
+
+static int udpv6_init_sock(struct sock *sk)
+{
+	skb_queue_head_init(&udp_sk(sk)->reader_queue);
+	sk->sk_destruct = udpv6_destruct_sock;
+	return 0;
+}
+
 static u32 udp6_ehashfn(const struct net *net,
 			const struct in6_addr *laddr,
 			const u16 lport,
@@ -1723,7 +1736,7 @@ struct proto udpv6_prot = {
 	.connect		= ip6_datagram_connect,
 	.disconnect		= udp_disconnect,
 	.ioctl			= udp_ioctl,
-	.init			= udp_init_sock,
+	.init			= udpv6_init_sock,
 	.destroy		= udpv6_destroy_sock,
 	.setsockopt		= udpv6_setsockopt,
 	.getsockopt		= udpv6_getsockopt,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 net 4/5] ipv6: Fix data races around sk->sk_prot.
  2022-09-28  0:27 [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2022-09-28  0:27 ` [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
@ 2022-09-28  0:27 ` Kuniyuki Iwashima
  2022-09-28  0:27 ` [PATCH v2 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
  4 siblings, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  0:27 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

Commit 086d49058cd8 ("ipv6: annotate some data-races around sk->sk_prot")
fixed some data-races around sk->sk_prot but it was not enough.

Some functions in inet6_(stream|dgram)_ops still access sk->sk_prot
without lock_sock() or rtnl_lock(), so they need READ_ONCE() to avoid
load tearing.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/sock.c          |  6 ++++--
 net/ipv4/af_inet.c       | 23 ++++++++++++++++-------
 net/ipv6/ipv6_sockglue.c |  4 ++--
 3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 788c1372663c..9c05637663bf 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3556,7 +3556,8 @@ int sock_common_getsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->getsockopt(sk, level, optname, optval, optlen);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	return READ_ONCE(sk->sk_prot)->getsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(sock_common_getsockopt);
 
@@ -3582,7 +3583,8 @@ int sock_common_setsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->setsockopt(sk, level, optname, optval, optlen);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	return READ_ONCE(sk->sk_prot)->setsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(sock_common_setsockopt);
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 3ca0cc467886..405fbad998df 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -558,22 +558,27 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
 		       int addr_len, int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 	int err;
 
 	if (addr_len < sizeof(uaddr->sa_family))
 		return -EINVAL;
+
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	prot = READ_ONCE(sk->sk_prot);
+
 	if (uaddr->sa_family == AF_UNSPEC)
-		return sk->sk_prot->disconnect(sk, flags);
+		return prot->disconnect(sk, flags);
 
 	if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) {
-		err = sk->sk_prot->pre_connect(sk, uaddr, addr_len);
+		err = prot->pre_connect(sk, uaddr, addr_len);
 		if (err)
 			return err;
 	}
 
 	if (data_race(!inet_sk(sk)->inet_num) && inet_autobind(sk))
 		return -EAGAIN;
-	return sk->sk_prot->connect(sk, uaddr, addr_len);
+	return prot->connect(sk, uaddr, addr_len);
 }
 EXPORT_SYMBOL(inet_dgram_connect);
 
@@ -734,10 +739,11 @@ EXPORT_SYMBOL(inet_stream_connect);
 int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern)
 {
-	struct sock *sk1 = sock->sk;
+	struct sock *sk1 = sock->sk, *sk2;
 	int err = -EINVAL;
-	struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err, kern);
 
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	sk2 = READ_ONCE(sk1->sk_prot)->accept(sk1, flags, &err, kern);
 	if (!sk2)
 		goto do_err;
 
@@ -825,12 +831,15 @@ ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 		      size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 
 	if (unlikely(inet_send_prepare(sk)))
 		return -EAGAIN;
 
-	if (sk->sk_prot->sendpage)
-		return sk->sk_prot->sendpage(sk, page, offset, size, flags);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	prot = READ_ONCE(sk->sk_prot);
+	if (prot->sendpage)
+		return prot->sendpage(sk, page, offset, size, flags);
 	return sock_no_sendpage(sock, page, offset, size, flags);
 }
 EXPORT_SYMBOL(inet_sendpage);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 030a4cf23ceb..a89db5872dc3 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -477,7 +477,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sock_prot_inuse_add(net, sk->sk_prot, -1);
 				sock_prot_inuse_add(net, &tcp_prot, 1);
 
-				/* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */
+				/* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
 				WRITE_ONCE(sk->sk_prot, &tcp_prot);
 				icsk->icsk_af_ops = &ipv4_specific;
 				sk->sk_socket->ops = &inet_stream_ops;
@@ -492,7 +492,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sock_prot_inuse_add(net, sk->sk_prot, -1);
 				sock_prot_inuse_add(net, prot, 1);
 
-				/* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */
+				/* Paired with READ_ONCE(sk->sk_prot) in inet6_dgram_ops */
 				WRITE_ONCE(sk->sk_prot, prot);
 				sk->sk_socket->ops = &inet_dgram_ops;
 				sk->sk_family = PF_INET;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-28  0:27 [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2022-09-28  0:27 ` [PATCH v2 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
@ 2022-09-28  0:27 ` Kuniyuki Iwashima
  4 siblings, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  0:27 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, syzbot

setsockopt(IPV6_ADDRFORM) and tcp_v6_connect() change icsk->icsk_af_ops
under lock_sock(), but tcp_(get|set)sockopt() read it locklessly.  To
avoid load/store tearing, we need to add READ_ONCE() and WRITE_ONCE()
for the reads and writes.

Thanks to Eric Dumazet for providing the syzbot report:

BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect

write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0:
tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240
__inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660
inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724
__sys_connect_file net/socket.c:1976 [inline]
__sys_connect+0x197/0x1b0 net/socket.c:1993
__do_sys_connect net/socket.c:2003 [inline]
__se_sys_connect net/socket.c:2000 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:2000
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1:
tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789
sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585
__sys_setsockopt+0x212/0x2b0 net/socket.c:2252
__do_sys_setsockopt net/socket.c:2263 [inline]
__se_sys_setsockopt net/socket.c:2260 [inline]
__x64_sys_setsockopt+0x62/0x70 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted
6.0.0-rc4-syzkaller-00331-g4ed9c1e971b1-dirty #0

Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 08/26/2022

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/tcp.c           | 10 ++++++----
 net/ipv6/ipv6_sockglue.c |  3 ++-
 net/ipv6/tcp_ipv6.c      |  6 ++++--
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e373dde1f46f..76b80d81fd6a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 
 	if (level != SOL_TCP)
-		return icsk->icsk_af_ops->setsockopt(sk, level, optname,
-						     optval, optlen);
+		/* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */
+		return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
+								optval, optlen);
 	return do_tcp_setsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(tcp_setsockopt);
@@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
 	if (level != SOL_TCP)
-		return icsk->icsk_af_ops->getsockopt(sk, level, optname,
-						     optval, optlen);
+		/* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */
+		return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
+								optval, optlen);
 	return do_tcp_getsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(tcp_getsockopt);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index a89db5872dc3..726d95859898 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 
 				/* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
 				WRITE_ONCE(sk->sk_prot, &tcp_prot);
-				icsk->icsk_af_ops = &ipv4_specific;
+				/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+				WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
 				sk->sk_socket->ops = &inet_stream_ops;
 				sk->sk_family = PF_INET;
 				tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e54eee80ce5f..8680aa83f0b9 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -237,7 +237,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 		sin.sin_port = usin->sin6_port;
 		sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
 
-		icsk->icsk_af_ops = &ipv6_mapped;
+		/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+		WRITE_ONCE(icsk->icsk_af_ops, &ipv6_mapped);
 		if (sk_is_mptcp(sk))
 			mptcpv6_handle_mapped(sk, true);
 		sk->sk_backlog_rcv = tcp_v4_do_rcv;
@@ -249,7 +250,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 		if (err) {
 			icsk->icsk_ext_hdr_len = exthdrlen;
-			icsk->icsk_af_ops = &ipv6_specific;
+			/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+			WRITE_ONCE(icsk->icsk_af_ops, &ipv6_specific);
 			if (sk_is_mptcp(sk))
 				mptcpv6_handle_mapped(sk, false);
 			sk->sk_backlog_rcv = tcp_v6_do_rcv;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-09-28  0:27 ` [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
@ 2022-09-28  3:43   ` Eric Dumazet
  2022-09-28  4:00     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2022-09-28  3:43 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	Kuniyuki Iwashima, netdev, syzkaller-bugs, LKML

On Tue, Sep 27, 2022 at 5:29 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> Add lockless sendmsg() support") added a lockless memory allocation path,
> which could cause a memory leak:
>
> setsockopt(IPV6_ADDRFORM)                 sendmsg()
> +-----------------------+                 +-------+
> - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
>   - lock_sock(sk)                           ^._ called via udpv6_prot
>   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
>   - inet6_destroy_sock()
>   - release_sock(sk)                        - ip6_make_skb(sk, ...)
>                                               ^._ lockless fast path for
>                                                   the non-corking case
>
>                                               - __ip6_append_data(sk, ...)
>                                                 - ipv6_local_rxpmtu(sk, ...)
>                                                   - xchg(&np->rxpmtu, skb)
>                                                     ^._ rxpmtu is never freed.
>
>                                             - lock_sock(sk)
>
> For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
> in IPv6 sk->sk_destruct() not to miss the future change and a similar
> bug fixed in commit e27326009a3d ("net: ping6: Fix memleak in
> ipv6_renew_options().")

I do not see how your patches prevent rxpmtu to be created at the time
of IPV6_ADDRFROM ?

There seem to be races.

lockless UDP sendmsg() is a disaster really.

>
> We can now remove all inet6_destroy_sock() calls from IPv6 protocol
> specific ->destroy() functions, but such changes are invasive to
> backport.  So they can be posted as a follow-up later for net-next.
>
> Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  include/net/ipv6.h  |  1 +
>  include/net/udp.h   |  2 +-
>  net/ipv4/udp.c      |  8 ++++++--
>  net/ipv6/af_inet6.c |  9 ++++++++-
>  net/ipv6/udp.c      | 15 ++++++++++++++-
>  5 files changed, 30 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index de9dcc5652c4..11f1a9a8b066 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -1178,6 +1178,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
>  void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
>  void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
>
> +void inet6_sock_destruct(struct sock *sk);
>  int inet6_release(struct socket *sock);
>  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
>  int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 5ee88ddf79c3..fee053bcd17c 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
>  }
>
>  /* net/ipv4/udp.c */
> -void udp_destruct_sock(struct sock *sk);
> +void udp_destruct_common(struct sock *sk);
>  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
>  int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
>  void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 560d9eadeaa5..a84ae44db7e2 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
>
> -void udp_destruct_sock(struct sock *sk)
> +void udp_destruct_common(struct sock *sk)
>  {
>         /* reclaim completely the forward allocated memory */
>         struct udp_sock *up = udp_sk(sk);
> @@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
>                 kfree_skb(skb);
>         }
>         udp_rmem_release(sk, total, 0, true);
> +}
> +EXPORT_SYMBOL_GPL(udp_destruct_common);
>
> +static void udp_destruct_sock(struct sock *sk)
> +{
> +       udp_destruct_common(sk);
>         inet_sock_destruct(sk);
>  }
> -EXPORT_SYMBOL_GPL(udp_destruct_sock);
>
>  int udp_init_sock(struct sock *sk)
>  {
> diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> index dbb1430d6cc2..0774cff62f2d 100644
> --- a/net/ipv6/af_inet6.c
> +++ b/net/ipv6/af_inet6.c
> @@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
>         return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
>  }
>
> +void inet6_sock_destruct(struct sock *sk)
> +{
> +       inet6_destroy_sock(sk);
> +       inet_sock_destruct(sk);
> +}
> +EXPORT_SYMBOL_GPL(inet6_sock_destruct);
> +
>  static int inet6_create(struct net *net, struct socket *sock, int protocol,
>                         int kern)
>  {
> @@ -201,7 +208,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
>                         inet->hdrincl = 1;
>         }
>
> -       sk->sk_destruct         = inet_sock_destruct;
> +       sk->sk_destruct         = inet6_sock_destruct;
>         sk->sk_family           = PF_INET6;
>         sk->sk_protocol         = protocol;
>
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 3366d6a77ff2..a5256f7184ab 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -56,6 +56,19 @@
>  #include <trace/events/skb.h>
>  #include "udp_impl.h"
>
> +static void udpv6_destruct_sock(struct sock *sk)
> +{
> +       udp_destruct_common(sk);
> +       inet6_sock_destruct(sk);
> +}
> +
> +static int udpv6_init_sock(struct sock *sk)
> +{
> +       skb_queue_head_init(&udp_sk(sk)->reader_queue);
> +       sk->sk_destruct = udpv6_destruct_sock;
> +       return 0;
> +}
> +
>  static u32 udp6_ehashfn(const struct net *net,
>                         const struct in6_addr *laddr,
>                         const u16 lport,
> @@ -1723,7 +1736,7 @@ struct proto udpv6_prot = {
>         .connect                = ip6_datagram_connect,
>         .disconnect             = udp_disconnect,
>         .ioctl                  = udp_ioctl,
> -       .init                   = udp_init_sock,
> +       .init                   = udpv6_init_sock,
>         .destroy                = udpv6_destroy_sock,
>         .setsockopt             = udpv6_setsockopt,
>         .getsockopt             = udpv6_getsockopt,
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-09-28  3:43   ` Eric Dumazet
@ 2022-09-28  4:00     ` Kuniyuki Iwashima
  2022-09-28  4:10       ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  4:00 UTC (permalink / raw)
  To: edumazet
  Cc: davem, dsahern, kuba, kuni1840, kuniyu, linux-kernel, netdev,
	pabeni, syzkaller-bugs

From:   Eric Dumazet <edumazet@google.com>
Date:   Tue, 27 Sep 2022 20:43:51 -0700
> On Tue, Sep 27, 2022 at 5:29 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> > able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> > IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> > Add lockless sendmsg() support") added a lockless memory allocation path,
> > which could cause a memory leak:
> >
> > setsockopt(IPV6_ADDRFORM)                 sendmsg()
> > +-----------------------+                 +-------+
> > - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
> >   - lock_sock(sk)                           ^._ called via udpv6_prot
> >   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
> >   - inet6_destroy_sock()
> >   - release_sock(sk)                        - ip6_make_skb(sk, ...)
> >                                               ^._ lockless fast path for
> >                                                   the non-corking case
> >
> >                                               - __ip6_append_data(sk, ...)
> >                                                 - ipv6_local_rxpmtu(sk, ...)
> >                                                   - xchg(&np->rxpmtu, skb)
> >                                                     ^._ rxpmtu is never freed.
> >
> >                                             - lock_sock(sk)
> >
> > For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
> > in IPv6 sk->sk_destruct() not to miss the future change and a similar
> > bug fixed in commit e27326009a3d ("net: ping6: Fix memleak in
> > ipv6_renew_options().")
> 
> I do not see how your patches prevent rxpmtu to be created at the time
> of IPV6_ADDRFROM ?
> 
> There seem to be races.
> 
> lockless UDP sendmsg() is a disaster really.

I think we are never able to prevent it and races exist unless we remove
the lockless path itself, so the patch makes sure to free rxpmtu at least
when we close() the socket.  Currently, we can not even free it.


> > We can now remove all inet6_destroy_sock() calls from IPv6 protocol
> > specific ->destroy() functions, but such changes are invasive to
> > backport.  So they can be posted as a follow-up later for net-next.
> >
> > Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  include/net/ipv6.h  |  1 +
> >  include/net/udp.h   |  2 +-
> >  net/ipv4/udp.c      |  8 ++++++--
> >  net/ipv6/af_inet6.c |  9 ++++++++-
> >  net/ipv6/udp.c      | 15 ++++++++++++++-
> >  5 files changed, 30 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> > index de9dcc5652c4..11f1a9a8b066 100644
> > --- a/include/net/ipv6.h
> > +++ b/include/net/ipv6.h
> > @@ -1178,6 +1178,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
> >  void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
> >  void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
> >
> > +void inet6_sock_destruct(struct sock *sk);
> >  int inet6_release(struct socket *sock);
> >  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
> >  int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
> > diff --git a/include/net/udp.h b/include/net/udp.h
> > index 5ee88ddf79c3..fee053bcd17c 100644
> > --- a/include/net/udp.h
> > +++ b/include/net/udp.h
> > @@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
> >  }
> >
> >  /* net/ipv4/udp.c */
> > -void udp_destruct_sock(struct sock *sk);
> > +void udp_destruct_common(struct sock *sk);
> >  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
> >  int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
> >  void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 560d9eadeaa5..a84ae44db7e2 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
> >  }
> >  EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
> >
> > -void udp_destruct_sock(struct sock *sk)
> > +void udp_destruct_common(struct sock *sk)
> >  {
> >         /* reclaim completely the forward allocated memory */
> >         struct udp_sock *up = udp_sk(sk);
> > @@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
> >                 kfree_skb(skb);
> >         }
> >         udp_rmem_release(sk, total, 0, true);
> > +}
> > +EXPORT_SYMBOL_GPL(udp_destruct_common);
> >
> > +static void udp_destruct_sock(struct sock *sk)
> > +{
> > +       udp_destruct_common(sk);
> >         inet_sock_destruct(sk);
> >  }
> > -EXPORT_SYMBOL_GPL(udp_destruct_sock);
> >
> >  int udp_init_sock(struct sock *sk)
> >  {
> > diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> > index dbb1430d6cc2..0774cff62f2d 100644
> > --- a/net/ipv6/af_inet6.c
> > +++ b/net/ipv6/af_inet6.c
> > @@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
> >         return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
> >  }
> >
> > +void inet6_sock_destruct(struct sock *sk)
> > +{
> > +       inet6_destroy_sock(sk);
> > +       inet_sock_destruct(sk);
> > +}
> > +EXPORT_SYMBOL_GPL(inet6_sock_destruct);
> > +
> >  static int inet6_create(struct net *net, struct socket *sock, int protocol,
> >                         int kern)
> >  {
> > @@ -201,7 +208,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
> >                         inet->hdrincl = 1;
> >         }
> >
> > -       sk->sk_destruct         = inet_sock_destruct;
> > +       sk->sk_destruct         = inet6_sock_destruct;
> >         sk->sk_family           = PF_INET6;
> >         sk->sk_protocol         = protocol;
> >
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 3366d6a77ff2..a5256f7184ab 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -56,6 +56,19 @@
> >  #include <trace/events/skb.h>
> >  #include "udp_impl.h"
> >
> > +static void udpv6_destruct_sock(struct sock *sk)
> > +{
> > +       udp_destruct_common(sk);
> > +       inet6_sock_destruct(sk);
> > +}
> > +
> > +static int udpv6_init_sock(struct sock *sk)
> > +{
> > +       skb_queue_head_init(&udp_sk(sk)->reader_queue);
> > +       sk->sk_destruct = udpv6_destruct_sock;
> > +       return 0;
> > +}
> > +
> >  static u32 udp6_ehashfn(const struct net *net,
> >                         const struct in6_addr *laddr,
> >                         const u16 lport,
> > @@ -1723,7 +1736,7 @@ struct proto udpv6_prot = {
> >         .connect                = ip6_datagram_connect,
> >         .disconnect             = udp_disconnect,
> >         .ioctl                  = udp_ioctl,
> > -       .init                   = udp_init_sock,
> > +       .init                   = udpv6_init_sock,
> >         .destroy                = udpv6_destroy_sock,
> >         .setsockopt             = udpv6_setsockopt,
> >         .getsockopt             = udpv6_getsockopt,
> > --
> > 2.30.2

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-09-28  4:00     ` Kuniyuki Iwashima
@ 2022-09-28  4:10       ` Eric Dumazet
  2022-09-28  4:31         ` Kuniyuki Iwashima
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2022-09-28  4:10 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David Miller, David Ahern, Jakub Kicinski, Kuniyuki Iwashima,
	LKML, netdev, Paolo Abeni, syzkaller-bugs

On Tue, Sep 27, 2022 at 9:00 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From:   Eric Dumazet <edumazet@google.com>
> Date:   Tue, 27 Sep 2022 20:43:51 -0700
> > On Tue, Sep 27, 2022 at 5:29 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> > > able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> > > IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> > > Add lockless sendmsg() support") added a lockless memory allocation path,
> > > which could cause a memory leak:
> > >
> > > setsockopt(IPV6_ADDRFORM)                 sendmsg()
> > > +-----------------------+                 +-------+
> > > - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
> > >   - lock_sock(sk)                           ^._ called via udpv6_prot
> > >   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
> > >   - inet6_destroy_sock()
> > >   - release_sock(sk)                        - ip6_make_skb(sk, ...)
> > >                                               ^._ lockless fast path for
> > >                                                   the non-corking case
> > >
> > >                                               - __ip6_append_data(sk, ...)
> > >                                                 - ipv6_local_rxpmtu(sk, ...)
> > >                                                   - xchg(&np->rxpmtu, skb)
> > >                                                     ^._ rxpmtu is never freed.
> > >
> > >                                             - lock_sock(sk)
> > >
> > > For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
> > > in IPv6 sk->sk_destruct() not to miss the future change and a similar
> > > bug fixed in commit e27326009a3d ("net: ping6: Fix memleak in
> > > ipv6_renew_options().")
> >
> > I do not see how your patches prevent rxpmtu to be created at the time
> > of IPV6_ADDRFROM ?
> >
> > There seem to be races.
> >
> > lockless UDP sendmsg() is a disaster really.
>
> I think we are never able to prevent it and races exist unless we remove
> the lockless path itself, so the patch makes sure to free rxpmtu at least
> when we close() the socket.  Currently, we can not even free it.

I am saying your patches do not guarantee the rxpmtu is freed at close() time.

Once the v6 socket has been transformed to IPv4 one,
inet6_sock_destruct() is not going to be called.



>
>
> > > We can now remove all inet6_destroy_sock() calls from IPv6 protocol
> > > specific ->destroy() functions, but such changes are invasive to
> > > backport.  So they can be posted as a follow-up later for net-next.
> > >
> > > Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > ---
> > >  include/net/ipv6.h  |  1 +
> > >  include/net/udp.h   |  2 +-
> > >  net/ipv4/udp.c      |  8 ++++++--
> > >  net/ipv6/af_inet6.c |  9 ++++++++-
> > >  net/ipv6/udp.c      | 15 ++++++++++++++-
> > >  5 files changed, 30 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> > > index de9dcc5652c4..11f1a9a8b066 100644
> > > --- a/include/net/ipv6.h
> > > +++ b/include/net/ipv6.h
> > > @@ -1178,6 +1178,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
> > >  void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
> > >  void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
> > >
> > > +void inet6_sock_destruct(struct sock *sk);
> > >  int inet6_release(struct socket *sock);
> > >  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
> > >  int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
> > > diff --git a/include/net/udp.h b/include/net/udp.h
> > > index 5ee88ddf79c3..fee053bcd17c 100644
> > > --- a/include/net/udp.h
> > > +++ b/include/net/udp.h
> > > @@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
> > >  }
> > >
> > >  /* net/ipv4/udp.c */
> > > -void udp_destruct_sock(struct sock *sk);
> > > +void udp_destruct_common(struct sock *sk);
> > >  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
> > >  int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
> > >  void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
> > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > > index 560d9eadeaa5..a84ae44db7e2 100644
> > > --- a/net/ipv4/udp.c
> > > +++ b/net/ipv4/udp.c
> > > @@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
> > >  }
> > >  EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
> > >
> > > -void udp_destruct_sock(struct sock *sk)
> > > +void udp_destruct_common(struct sock *sk)
> > >  {
> > >         /* reclaim completely the forward allocated memory */
> > >         struct udp_sock *up = udp_sk(sk);
> > > @@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
> > >                 kfree_skb(skb);
> > >         }
> > >         udp_rmem_release(sk, total, 0, true);
> > > +}
> > > +EXPORT_SYMBOL_GPL(udp_destruct_common);
> > >
> > > +static void udp_destruct_sock(struct sock *sk)
> > > +{
> > > +       udp_destruct_common(sk);
> > >         inet_sock_destruct(sk);
> > >  }
> > > -EXPORT_SYMBOL_GPL(udp_destruct_sock);
> > >
> > >  int udp_init_sock(struct sock *sk)
> > >  {
> > > diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> > > index dbb1430d6cc2..0774cff62f2d 100644
> > > --- a/net/ipv6/af_inet6.c
> > > +++ b/net/ipv6/af_inet6.c
> > > @@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
> > >         return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
> > >  }
> > >
> > > +void inet6_sock_destruct(struct sock *sk)
> > > +{
> > > +       inet6_destroy_sock(sk);
> > > +       inet_sock_destruct(sk);
> > > +}
> > > +EXPORT_SYMBOL_GPL(inet6_sock_destruct);
> > > +
> > >  static int inet6_create(struct net *net, struct socket *sock, int protocol,
> > >                         int kern)
> > >  {
> > > @@ -201,7 +208,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
> > >                         inet->hdrincl = 1;
> > >         }
> > >
> > > -       sk->sk_destruct         = inet_sock_destruct;
> > > +       sk->sk_destruct         = inet6_sock_destruct;
> > >         sk->sk_family           = PF_INET6;
> > >         sk->sk_protocol         = protocol;
> > >
> > > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > > index 3366d6a77ff2..a5256f7184ab 100644
> > > --- a/net/ipv6/udp.c
> > > +++ b/net/ipv6/udp.c
> > > @@ -56,6 +56,19 @@
> > >  #include <trace/events/skb.h>
> > >  #include "udp_impl.h"
> > >
> > > +static void udpv6_destruct_sock(struct sock *sk)
> > > +{
> > > +       udp_destruct_common(sk);
> > > +       inet6_sock_destruct(sk);
> > > +}
> > > +
> > > +static int udpv6_init_sock(struct sock *sk)
> > > +{
> > > +       skb_queue_head_init(&udp_sk(sk)->reader_queue);
> > > +       sk->sk_destruct = udpv6_destruct_sock;
> > > +       return 0;
> > > +}
> > > +
> > >  static u32 udp6_ehashfn(const struct net *net,
> > >                         const struct in6_addr *laddr,
> > >                         const u16 lport,
> > > @@ -1723,7 +1736,7 @@ struct proto udpv6_prot = {
> > >         .connect                = ip6_datagram_connect,
> > >         .disconnect             = udp_disconnect,
> > >         .ioctl                  = udp_ioctl,
> > > -       .init                   = udp_init_sock,
> > > +       .init                   = udpv6_init_sock,
> > >         .destroy                = udpv6_destroy_sock,
> > >         .setsockopt             = udpv6_setsockopt,
> > >         .getsockopt             = udpv6_getsockopt,
> > > --
> > > 2.30.2

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-09-28  4:10       ` Eric Dumazet
@ 2022-09-28  4:31         ` Kuniyuki Iwashima
  0 siblings, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  4:31 UTC (permalink / raw)
  To: edumazet
  Cc: davem, dsahern, kuba, kuni1840, kuniyu, linux-kernel, netdev,
	pabeni, syzkaller-bugs

From:   Eric Dumazet <edumazet@google.com>
Date:   Tue, 27 Sep 2022 21:10:22 -0700
> On Tue, Sep 27, 2022 at 9:00 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From:   Eric Dumazet <edumazet@google.com>
> > Date:   Tue, 27 Sep 2022 20:43:51 -0700
> > > On Tue, Sep 27, 2022 at 5:29 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > >
> > > > Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> > > > able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> > > > IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> > > > Add lockless sendmsg() support") added a lockless memory allocation path,
> > > > which could cause a memory leak:
> > > >
> > > > setsockopt(IPV6_ADDRFORM)                 sendmsg()
> > > > +-----------------------+                 +-------+
> > > > - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
> > > >   - lock_sock(sk)                           ^._ called via udpv6_prot
> > > >   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
> > > >   - inet6_destroy_sock()
> > > >   - release_sock(sk)                        - ip6_make_skb(sk, ...)
> > > >                                               ^._ lockless fast path for
> > > >                                                   the non-corking case
> > > >
> > > >                                               - __ip6_append_data(sk, ...)
> > > >                                                 - ipv6_local_rxpmtu(sk, ...)
> > > >                                                   - xchg(&np->rxpmtu, skb)
> > > >                                                     ^._ rxpmtu is never freed.
> > > >
> > > >                                             - lock_sock(sk)
> > > >
> > > > For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
> > > > in IPv6 sk->sk_destruct() not to miss the future change and a similar
> > > > bug fixed in commit e27326009a3d ("net: ping6: Fix memleak in
> > > > ipv6_renew_options().")
> > >
> > > I do not see how your patches prevent rxpmtu to be created at the time
> > > of IPV6_ADDRFROM ?
> > >
> > > There seem to be races.
> > >
> > > lockless UDP sendmsg() is a disaster really.
> >
> > I think we are never able to prevent it and races exist unless we remove
> > the lockless path itself, so the patch makes sure to free rxpmtu at least
> > when we close() the socket.  Currently, we can not even free it.
> 
> I am saying your patches do not guarantee the rxpmtu is freed at close() time.
> 
> Once the v6 socket has been transformed to IPv4 one,
> inet6_sock_destruct() is not going to be called.

inet6_sock_destruct() is set to sk->sk_destruct(), which is not changed
by the transformation and will be called from __sk_destruct().

But I might be missing something.
The lockless sendmsg() keeps sk_wmem_alloc > 0 after the conversion and
prevents sk_free() ?


> > > > We can now remove all inet6_destroy_sock() calls from IPv6 protocol
> > > > specific ->destroy() functions, but such changes are invasive to
> > > > backport.  So they can be posted as a follow-up later for net-next.
> > > >
> > > > Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > > ---
> > > >  include/net/ipv6.h  |  1 +
> > > >  include/net/udp.h   |  2 +-
> > > >  net/ipv4/udp.c      |  8 ++++++--
> > > >  net/ipv6/af_inet6.c |  9 ++++++++-
> > > >  net/ipv6/udp.c      | 15 ++++++++++++++-
> > > >  5 files changed, 30 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> > > > index de9dcc5652c4..11f1a9a8b066 100644
> > > > --- a/include/net/ipv6.h
> > > > +++ b/include/net/ipv6.h
> > > > @@ -1178,6 +1178,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
> > > >  void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
> > > >  void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
> > > >
> > > > +void inet6_sock_destruct(struct sock *sk);
> > > >  int inet6_release(struct socket *sock);
> > > >  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
> > > >  int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
> > > > diff --git a/include/net/udp.h b/include/net/udp.h
> > > > index 5ee88ddf79c3..fee053bcd17c 100644
> > > > --- a/include/net/udp.h
> > > > +++ b/include/net/udp.h
> > > > @@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
> > > >  }
> > > >
> > > >  /* net/ipv4/udp.c */
> > > > -void udp_destruct_sock(struct sock *sk);
> > > > +void udp_destruct_common(struct sock *sk);
> > > >  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
> > > >  int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
> > > >  void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
> > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > > > index 560d9eadeaa5..a84ae44db7e2 100644
> > > > --- a/net/ipv4/udp.c
> > > > +++ b/net/ipv4/udp.c
> > > > @@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
> > > >
> > > > -void udp_destruct_sock(struct sock *sk)
> > > > +void udp_destruct_common(struct sock *sk)
> > > >  {
> > > >         /* reclaim completely the forward allocated memory */
> > > >         struct udp_sock *up = udp_sk(sk);
> > > > @@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
> > > >                 kfree_skb(skb);
> > > >         }
> > > >         udp_rmem_release(sk, total, 0, true);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(udp_destruct_common);
> > > >
> > > > +static void udp_destruct_sock(struct sock *sk)
> > > > +{
> > > > +       udp_destruct_common(sk);
> > > >         inet_sock_destruct(sk);
> > > >  }
> > > > -EXPORT_SYMBOL_GPL(udp_destruct_sock);
> > > >
> > > >  int udp_init_sock(struct sock *sk)
> > > >  {
> > > > diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> > > > index dbb1430d6cc2..0774cff62f2d 100644
> > > > --- a/net/ipv6/af_inet6.c
> > > > +++ b/net/ipv6/af_inet6.c
> > > > @@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
> > > >         return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
> > > >  }
> > > >
> > > > +void inet6_sock_destruct(struct sock *sk)
> > > > +{
> > > > +       inet6_destroy_sock(sk);
> > > > +       inet_sock_destruct(sk);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(inet6_sock_destruct);
> > > > +
> > > >  static int inet6_create(struct net *net, struct socket *sock, int protocol,
> > > >                         int kern)
> > > >  {
> > > > @@ -201,7 +208,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
> > > >                         inet->hdrincl = 1;
> > > >         }
> > > >
> > > > -       sk->sk_destruct         = inet_sock_destruct;
> > > > +       sk->sk_destruct         = inet6_sock_destruct;
> > > >         sk->sk_family           = PF_INET6;
> > > >         sk->sk_protocol         = protocol;
> > > >
> > > > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > > > index 3366d6a77ff2..a5256f7184ab 100644
> > > > --- a/net/ipv6/udp.c
> > > > +++ b/net/ipv6/udp.c
> > > > @@ -56,6 +56,19 @@
> > > >  #include <trace/events/skb.h>
> > > >  #include "udp_impl.h"
> > > >
> > > > +static void udpv6_destruct_sock(struct sock *sk)
> > > > +{
> > > > +       udp_destruct_common(sk);
> > > > +       inet6_sock_destruct(sk);
> > > > +}
> > > > +
> > > > +static int udpv6_init_sock(struct sock *sk)
> > > > +{
> > > > +       skb_queue_head_init(&udp_sk(sk)->reader_queue);
> > > > +       sk->sk_destruct = udpv6_destruct_sock;
> > > > +       return 0;
> > > > +}
> > > > +
> > > >  static u32 udp6_ehashfn(const struct net *net,
> > > >                         const struct in6_addr *laddr,
> > > >                         const u16 lport,
> > > > @@ -1723,7 +1736,7 @@ struct proto udpv6_prot = {
> > > >         .connect                = ip6_datagram_connect,
> > > >         .disconnect             = udp_disconnect,
> > > >         .ioctl                  = udp_ioctl,
> > > > -       .init                   = udp_init_sock,
> > > > +       .init                   = udpv6_init_sock,
> > > >         .destroy                = udpv6_destroy_sock,
> > > >         .setsockopt             = udpv6_setsockopt,
> > > >         .getsockopt             = udpv6_getsockopt,
> > > > --
> > > > 2.30.2

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  2022-09-28  0:27 ` [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
@ 2022-09-28  5:05   ` Eric Dumazet
  2022-09-28  5:29     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2022-09-28  5:05 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	Kuniyuki Iwashima, netdev, syzkaller-bugs, LKML

On Tue, Sep 27, 2022 at 5:28 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Commit 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") forgot
> to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
> socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
> changed to udp_prot and ->destroy() never cleans it up, resulting in
> a memory leak.
>
> This is due to the discrepancy between inet6_destroy_sock() and
> IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
> to remove the difference.
>
> However, this is not enough for now because rxpmtu can be changed
> without lock_sock() after commit 03485f2adcde ("udpv6: Add lockless
> sendmsg() support").  We will fix this case in the following patch.
>
> Fixes: 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  net/ipv6/ipv6_sockglue.c | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index b61066ac8648..030a4cf23ceb 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -431,9 +431,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
>                 if (optlen < sizeof(int))
>                         goto e_inval;
>                 if (val == PF_INET) {
> -                       struct ipv6_txoptions *opt;
> -                       struct sk_buff *pktopt;
> -
>                         if (sk->sk_type == SOCK_RAW)
>                                 break;
>
> @@ -464,7 +461,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
>                                 break;
>                         }
>
> -                       fl6_free_socklist(sk);
>                         __ipv6_sock_mc_close(sk);
>                         __ipv6_sock_ac_close(sk);
>
> @@ -501,14 +497,9 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
>                                 sk->sk_socket->ops = &inet_dgram_ops;
>                                 sk->sk_family = PF_INET;
>                         }
> -                       opt = xchg((__force struct ipv6_txoptions **)&np->opt,
> -                                  NULL);
> -                       if (opt) {
> -                               atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
> -                               txopt_put(opt);
> -                       }
> -                       pktopt = xchg(&np->pktoptions, NULL);
> -                       kfree_skb(pktopt);
> +

 Why is this needed ? I think a comment could help.
> +                       np->rxopt.all = 0;

> +                       inet6_destroy_sock(sk);

This name is unfortunate. This really is an inet6_cleanup_sock() in
this context.

>
>                         /*
>                          * ... and add it to the refcnt debug socks count
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  2022-09-28  5:05   ` Eric Dumazet
@ 2022-09-28  5:29     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-28  5:29 UTC (permalink / raw)
  To: edumazet
  Cc: davem, dsahern, kuba, kuni1840, kuniyu, linux-kernel, netdev,
	pabeni, syzkaller-bugs

From:   Eric Dumazet <edumazet@google.com>
Date:   Tue, 27 Sep 2022 22:05:09 -0700
> On Tue, Sep 27, 2022 at 5:28 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > Commit 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") forgot
> > to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
> > socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
> > changed to udp_prot and ->destroy() never cleans it up, resulting in
> > a memory leak.
> >
> > This is due to the discrepancy between inet6_destroy_sock() and
> > IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
> > to remove the difference.
> >
> > However, this is not enough for now because rxpmtu can be changed
> > without lock_sock() after commit 03485f2adcde ("udpv6: Add lockless
> > sendmsg() support").  We will fix this case in the following patch.
> >
> > Fixes: 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  net/ipv6/ipv6_sockglue.c | 15 +++------------
> >  1 file changed, 3 insertions(+), 12 deletions(-)
> >
> > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > index b61066ac8648..030a4cf23ceb 100644
> > --- a/net/ipv6/ipv6_sockglue.c
> > +++ b/net/ipv6/ipv6_sockglue.c
> > @@ -431,9 +431,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> >                 if (optlen < sizeof(int))
> >                         goto e_inval;
> >                 if (val == PF_INET) {
> > -                       struct ipv6_txoptions *opt;
> > -                       struct sk_buff *pktopt;
> > -
> >                         if (sk->sk_type == SOCK_RAW)
> >                                 break;
> >
> > @@ -464,7 +461,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> >                                 break;
> >                         }
> >
> > -                       fl6_free_socklist(sk);
> >                         __ipv6_sock_mc_close(sk);
> >                         __ipv6_sock_ac_close(sk);
> >
> > @@ -501,14 +497,9 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> >                                 sk->sk_socket->ops = &inet_dgram_ops;
> >                                 sk->sk_family = PF_INET;
> >                         }
> > -                       opt = xchg((__force struct ipv6_txoptions **)&np->opt,
> > -                                  NULL);
> > -                       if (opt) {
> > -                               atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
> > -                               txopt_put(opt);
> > -                       }
> > -                       pktopt = xchg(&np->pktoptions, NULL);
> > -                       kfree_skb(pktopt);
> > +
> 
>  Why is this needed ? I think a comment could help.
> > +                       np->rxopt.all = 0;

I added it to reduce the possibility of lockless allocation in
ipv6_local_rxpmtu(), which checks np->rxopt.bits.rxpmtu first.

But, lockless sendmsg() even races with this, so it might not be
needed or should be added in the next patch?


> 
> > +                       inet6_destroy_sock(sk);
> 
> This name is unfortunate. This really is an inet6_cleanup_sock() in
> this context.

Ok, I'll add inet6_cleanup_sock() as a wrapper of inet6_destroy_sock()
in v3.

And, I'll post a cleanup patch later which renames inet6_destroy_sock()
to inet6_cleanup_sock() and removes all unnecessary inet6_destroy_sock()
calls in each sk->sk_prot->destroy().


> 
> >
> >                         /*
> >                          * ... and add it to the refcnt debug socks count
> > --
> > 2.30.2

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-09-28  5:29 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-28  0:27 [PATCH v2 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
2022-09-28  0:27 ` [PATCH v2 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
2022-09-28  0:27 ` [PATCH v2 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
2022-09-28  5:05   ` Eric Dumazet
2022-09-28  5:29     ` Kuniyuki Iwashima
2022-09-28  0:27 ` [PATCH v2 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
2022-09-28  3:43   ` Eric Dumazet
2022-09-28  4:00     ` Kuniyuki Iwashima
2022-09-28  4:10       ` Eric Dumazet
2022-09-28  4:31         ` Kuniyuki Iwashima
2022-09-28  0:27 ` [PATCH v2 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
2022-09-28  0:27 ` [PATCH v2 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).