linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM.
@ 2022-10-04 17:17 Kuniyuki Iwashima
  2022-10-04 17:17 ` [PATCH v4 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-04 17:17 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

This series fixes some memory leaks and data races caused in the
same scenario where one thread converts an IPv6 socket into IPv4
with IPV6_ADDRFORM and another accesses the socket concurrently.

Note patch 1 and 5 conflict with these commits in net-next, respectively.

  * 24426654ed3a ("bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf")
  * 34704ef024ae ("bpf: net: Change do_tcp_getsockopt() to take the sockptr_t argument")


Changes:
  v4:
    * Patch 3:
      * Change UDPv6 Lite's sk->sk_prot->init() and sk->destruct() as well.
      * Move udplite_sk_init() from udplite.h to udplite.c.

  v3 (Resend): https://lore.kernel.org/netdev/20221003154425.49458-1-kuniyu@amazon.com/
    * CC blamed commits' EHOSTUNREACH authors to make patchwork happy

  v3: https://lore.kernel.org/netdev/20220929012542.55424-1-kuniyu@amazon.com/
    * Patch 2:
      * Add comment for np->rxopt.all = 0
      * Add inet6_cleanup_sock()
    * Patch 3:
      * Call inet6_cleanup_sock() instead of inet6_destroy_sock()

  v2: https://lore.kernel.org/netdev/20220928002741.64237-1-kuniyu@amazon.com/
    * Patch 3:
      * Move inet6_destroy_sock() from sk_prot->destroy()
        to sk->sk_destruct() and fix CONFIG_IPV6=m build failure
    * Patch 5:
      * Add WRITE_ONCE()s in tcp_v6_connect()
      * Add Reported-by tags and KCSAN log in changelog

  v1: https://lore.kernel.org/netdev/20220927161209.32939-1-kuniyu@amazon.com/


Kuniyuki Iwashima (5):
  tcp/udp: Fix memory leak in ipv6_renew_options().
  udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  ipv6: Fix data races around sk->sk_prot.
  tcp: Fix data races around icsk->icsk_af_ops.

 include/net/ipv6.h       |  2 ++
 include/net/udp.h        |  2 +-
 include/net/udplite.h    |  8 --------
 net/core/sock.c          |  6 ++++--
 net/ipv4/af_inet.c       | 23 ++++++++++++++++-------
 net/ipv4/tcp.c           | 10 ++++++----
 net/ipv4/udp.c           |  9 ++++++---
 net/ipv4/udplite.c       |  8 ++++++++
 net/ipv6/af_inet6.c      | 15 ++++++++++++++-
 net/ipv6/ipv6_sockglue.c | 34 +++++++++++++++++++---------------
 net/ipv6/tcp_ipv6.c      |  6 ++++--
 net/ipv6/udp.c           | 15 ++++++++++++++-
 net/ipv6/udp_impl.h      |  1 +
 net/ipv6/udplite.c       |  9 ++++++++-
 14 files changed, 103 insertions(+), 45 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options().
  2022-10-04 17:17 [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
@ 2022-10-04 17:17 ` Kuniyuki Iwashima
  2022-10-04 17:17 ` [PATCH v4 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-04 17:17 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, syzbot

syzbot reported a memory leak [0] related to IPV6_ADDRFORM.

The scenario is that while one thread is converting an IPv6 socket into
IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and
allocates memory to inet6_sk(sk)->XXX after conversion.

Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources,
which inet6_destroy_sock() should have cleaned up.

setsockopt(IPV6_ADDRFORM)                 setsockopt(IPV6_DSTOPTS)
+-----------------------+                 +----------------------+
- do_ipv6_setsockopt(sk, ...)
  - lock_sock(sk)                         - do_ipv6_setsockopt(sk, ...)
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)      ^._ called via tcpv6_prot
  - xchg(&np->opt, NULL)                        before WRITE_ONCE()
  - txopt_put(opt)
  - release_sock(sk)
                                            - lock_sock(sk)
                                            - ipv6_set_opt_hdr(sk, ...)
                                              - ipv6_update_options(sk, opt)
                                                - xchg(&inet6_sk(sk)->opt, opt)
                                                  ^._ opt is never freed.

                                            - release_sock(sk)

Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this
memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after
acquiring the lock.

This issue exists from the initial commit between IPV6_ADDRFORM and
IPV6_PKTOPTIONS.

[0]:
BUG: memory leak
unreferenced object 0xffff888009ab9f80 (size 96):
  comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s)
  hex dump (first 32 bytes):
    01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00  ....H...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline]
    [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566
    [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318
    [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline]
    [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668
    [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021
    [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789
    [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252
    [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline]
    [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline]
    [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260
    [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
    [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv6/ipv6_sockglue.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index e0dcc7a193df..b61066ac8648 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -419,6 +419,12 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		rtnl_lock();
 	lock_sock(sk);
 
+	/* Another thread has converted the socket into IPv4 with
+	 * IPV6_ADDRFORM concurrently.
+	 */
+	if (unlikely(sk->sk_family != AF_INET6))
+		goto unlock;
+
 	switch (optname) {
 
 	case IPV6_ADDRFORM:
@@ -994,6 +1000,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 	}
 
+unlock:
 	release_sock(sk);
 	if (needs_rtnl)
 		rtnl_unlock();
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  2022-10-04 17:17 [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
  2022-10-04 17:17 ` [PATCH v4 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
@ 2022-10-04 17:17 ` Kuniyuki Iwashima
  2022-10-04 17:18 ` [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-04 17:17 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, Brian Haley

Commit 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") forgot
to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
changed to udp_prot and ->destroy() never cleans it up, resulting in
a memory leak.

This is due to the discrepancy between inet6_destroy_sock() and
IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
to remove the difference.

However, this is not enough for now because rxpmtu can be changed
without lock_sock() after commit 03485f2adcde ("udpv6: Add lockless
sendmsg() support").  We will fix this case in the following patch.

Note we will rename inet6_destroy_sock() to inet6_cleanup_sock() and
remove unnecessary inet6_destroy_sock() calls in sk_prot->destroy()
in the future.

Fixes: 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Cc: Brian Haley <brian.haley@hp.com>
---
 include/net/ipv6.h       |  1 +
 net/ipv6/af_inet6.c      |  6 ++++++
 net/ipv6/ipv6_sockglue.c | 20 ++++++++------------
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index de9dcc5652c4..dfa70789b771 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1178,6 +1178,7 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
 void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
 void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
 
+void inet6_cleanup_sock(struct sock *sk);
 int inet6_release(struct socket *sock);
 int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
 int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index dbb1430d6cc2..83b9e432f3df 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -510,6 +510,12 @@ void inet6_destroy_sock(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(inet6_destroy_sock);
 
+void inet6_cleanup_sock(struct sock *sk)
+{
+	inet6_destroy_sock(sk);
+}
+EXPORT_SYMBOL_GPL(inet6_cleanup_sock);
+
 /*
  *	This does both peername and sockname.
  */
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index b61066ac8648..d258c94745ca 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -431,9 +431,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		if (optlen < sizeof(int))
 			goto e_inval;
 		if (val == PF_INET) {
-			struct ipv6_txoptions *opt;
-			struct sk_buff *pktopt;
-
 			if (sk->sk_type == SOCK_RAW)
 				break;
 
@@ -464,7 +461,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				break;
 			}
 
-			fl6_free_socklist(sk);
 			__ipv6_sock_mc_close(sk);
 			__ipv6_sock_ac_close(sk);
 
@@ -501,14 +497,14 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sk->sk_socket->ops = &inet_dgram_ops;
 				sk->sk_family = PF_INET;
 			}
-			opt = xchg((__force struct ipv6_txoptions **)&np->opt,
-				   NULL);
-			if (opt) {
-				atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
-				txopt_put(opt);
-			}
-			pktopt = xchg(&np->pktoptions, NULL);
-			kfree_skb(pktopt);
+
+			/* Disable all options not to allocate memory anymore,
+			 * but there is still a race.  See the lockless path
+			 * in udpv6_sendmsg() and ipv6_local_rxpmtu().
+			 */
+			np->rxopt.all = 0;
+
+			inet6_cleanup_sock(sk);
 
 			/*
 			 * ... and add it to the refcnt debug socks count
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-10-04 17:17 [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
  2022-10-04 17:17 ` [PATCH v4 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
  2022-10-04 17:17 ` [PATCH v4 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
@ 2022-10-04 17:18 ` Kuniyuki Iwashima
  2022-10-06  9:19   ` Paolo Abeni
  2022-10-04 17:18 ` [PATCH v4 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
  2022-10-04 17:18 ` [PATCH v4 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
  4 siblings, 1 reply; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-04 17:18 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, Vladislav Yasevich

Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
Add lockless sendmsg() support") added a lockless memory allocation path,
which could cause a memory leak:

setsockopt(IPV6_ADDRFORM)                 sendmsg()
+-----------------------+                 +-------+
- do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
  - lock_sock(sk)                           ^._ called via udpv6_prot
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
  - inet6_destroy_sock()
  - release_sock(sk)                        - ip6_make_skb(sk, ...)
                                              ^._ lockless fast path for
                                                  the non-corking case

                                              - __ip6_append_data(sk, ...)
                                                - ipv6_local_rxpmtu(sk, ...)
                                                  - xchg(&np->rxpmtu, skb)
                                                    ^._ rxpmtu is never freed.

                                            - lock_sock(sk)

For now, rxpmtu is only the case, but not to miss the future change
and a similar bug fixed in commit e27326009a3d ("net: ping6: Fix
memleak in ipv6_renew_options()."), let's set a new function to IPv6
sk->sk_destruct() and call inet6_cleanup_sock() there.  Since the
conversion does not change sk->sk_destruct(), we can guarantee that
we can clean up IPv6 resources finally.

We can now remove all inet6_destroy_sock() calls from IPv6 protocol
specific ->destroy() functions, but such changes are invasive to
backport.  So they can be posted as a follow-up later for net-next.

Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Cc: Vladislav Yasevich <vyasevic@redhat.com>
---
 include/net/ipv6.h    |  1 +
 include/net/udp.h     |  2 +-
 include/net/udplite.h |  8 --------
 net/ipv4/udp.c        |  9 ++++++---
 net/ipv4/udplite.c    |  8 ++++++++
 net/ipv6/af_inet6.c   |  9 ++++++++-
 net/ipv6/udp.c        | 15 ++++++++++++++-
 net/ipv6/udp_impl.h   |  1 +
 net/ipv6/udplite.c    |  9 ++++++++-
 9 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index dfa70789b771..e7ec3e8cd52e 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1179,6 +1179,7 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
 void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
 
 void inet6_cleanup_sock(struct sock *sk);
+void inet6_sock_destruct(struct sock *sk);
 int inet6_release(struct socket *sock);
 int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
 int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
diff --git a/include/net/udp.h b/include/net/udp.h
index 5ee88ddf79c3..fee053bcd17c 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
 }
 
 /* net/ipv4/udp.c */
-void udp_destruct_sock(struct sock *sk);
+void udp_destruct_common(struct sock *sk);
 void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
 int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
 void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
diff --git a/include/net/udplite.h b/include/net/udplite.h
index 0143b373602e..299c14ce2bb9 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -25,14 +25,6 @@ static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
 	return copy_from_iter_full(to, len, &msg->msg_iter) ? 0 : -EFAULT;
 }
 
-/* Designate sk as UDP-Lite socket */
-static inline int udplite_sk_init(struct sock *sk)
-{
-	udp_init_sock(sk);
-	udp_sk(sk)->pcflag = UDPLITE_BIT;
-	return 0;
-}
-
 /*
  * 	Checksumming routines
  */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 560d9eadeaa5..48adb418e404 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
 
-void udp_destruct_sock(struct sock *sk)
+void udp_destruct_common(struct sock *sk)
 {
 	/* reclaim completely the forward allocated memory */
 	struct udp_sock *up = udp_sk(sk);
@@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
 		kfree_skb(skb);
 	}
 	udp_rmem_release(sk, total, 0, true);
+}
+EXPORT_SYMBOL_GPL(udp_destruct_common);
 
+static void udp_destruct_sock(struct sock *sk)
+{
+	udp_destruct_common(sk);
 	inet_sock_destruct(sk);
 }
-EXPORT_SYMBOL_GPL(udp_destruct_sock);
 
 int udp_init_sock(struct sock *sk)
 {
@@ -1622,7 +1626,6 @@ int udp_init_sock(struct sock *sk)
 	sk->sk_destruct = udp_destruct_sock;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(udp_init_sock);
 
 void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
 {
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index 6e08a76ae1e7..4785ac4a8719 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -17,6 +17,14 @@
 struct udp_table 	udplite_table __read_mostly;
 EXPORT_SYMBOL(udplite_table);
 
+/* Designate sk as UDP-Lite socket */
+static inline int udplite_sk_init(struct sock *sk)
+{
+	udp_init_sock(sk);
+	udp_sk(sk)->pcflag = UDPLITE_BIT;
+	return 0;
+}
+
 static int udplite_rcv(struct sk_buff *skb)
 {
 	return __udp4_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 83b9e432f3df..ce5378b78ec9 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
 	return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
 }
 
+void inet6_sock_destruct(struct sock *sk)
+{
+	inet6_cleanup_sock(sk);
+	inet_sock_destruct(sk);
+}
+EXPORT_SYMBOL_GPL(inet6_sock_destruct);
+
 static int inet6_create(struct net *net, struct socket *sock, int protocol,
 			int kern)
 {
@@ -201,7 +208,7 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 			inet->hdrincl = 1;
 	}
 
-	sk->sk_destruct		= inet_sock_destruct;
+	sk->sk_destruct		= inet6_sock_destruct;
 	sk->sk_family		= PF_INET6;
 	sk->sk_protocol		= protocol;
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3366d6a77ff2..4bdeb66eea34 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -56,6 +56,19 @@
 #include <trace/events/skb.h>
 #include "udp_impl.h"
 
+static void udpv6_destruct_sock(struct sock *sk)
+{
+	udp_destruct_common(sk);
+	inet6_sock_destruct(sk);
+}
+
+int udpv6_init_sock(struct sock *sk)
+{
+	skb_queue_head_init(&udp_sk(sk)->reader_queue);
+	sk->sk_destruct = udpv6_destruct_sock;
+	return 0;
+}
+
 static u32 udp6_ehashfn(const struct net *net,
 			const struct in6_addr *laddr,
 			const u16 lport,
@@ -1723,7 +1736,7 @@ struct proto udpv6_prot = {
 	.connect		= ip6_datagram_connect,
 	.disconnect		= udp_disconnect,
 	.ioctl			= udp_ioctl,
-	.init			= udp_init_sock,
+	.init			= udpv6_init_sock,
 	.destroy		= udpv6_destroy_sock,
 	.setsockopt		= udpv6_setsockopt,
 	.getsockopt		= udpv6_getsockopt,
diff --git a/net/ipv6/udp_impl.h b/net/ipv6/udp_impl.h
index 4251e49d32a0..0590f566379d 100644
--- a/net/ipv6/udp_impl.h
+++ b/net/ipv6/udp_impl.h
@@ -12,6 +12,7 @@ int __udp6_lib_rcv(struct sk_buff *, struct udp_table *, int);
 int __udp6_lib_err(struct sk_buff *, struct inet6_skb_parm *, u8, u8, int,
 		   __be32, struct udp_table *);
 
+int udpv6_init_sock(struct sock *sk);
 int udp_v6_get_port(struct sock *sk, unsigned short snum);
 void udp_v6_rehash(struct sock *sk);
 
diff --git a/net/ipv6/udplite.c b/net/ipv6/udplite.c
index b70725856259..67eaf3ca14ce 100644
--- a/net/ipv6/udplite.c
+++ b/net/ipv6/udplite.c
@@ -12,6 +12,13 @@
 #include <linux/proc_fs.h>
 #include "udp_impl.h"
 
+static int udplitev6_sk_init(struct sock *sk)
+{
+	udpv6_init_sock(sk);
+	udp_sk(sk)->pcflag = UDPLITE_BIT;
+	return 0;
+}
+
 static int udplitev6_rcv(struct sk_buff *skb)
 {
 	return __udp6_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE);
@@ -38,7 +45,7 @@ struct proto udplitev6_prot = {
 	.connect	   = ip6_datagram_connect,
 	.disconnect	   = udp_disconnect,
 	.ioctl		   = udp_ioctl,
-	.init		   = udplite_sk_init,
+	.init		   = udplitev6_sk_init,
 	.destroy	   = udpv6_destroy_sock,
 	.setsockopt	   = udpv6_setsockopt,
 	.getsockopt	   = udpv6_getsockopt,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 net 4/5] ipv6: Fix data races around sk->sk_prot.
  2022-10-04 17:17 [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2022-10-04 17:18 ` [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
@ 2022-10-04 17:18 ` Kuniyuki Iwashima
  2022-10-04 17:18 ` [PATCH v4 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
  4 siblings, 0 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-04 17:18 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

Commit 086d49058cd8 ("ipv6: annotate some data-races around sk->sk_prot")
fixed some data-races around sk->sk_prot but it was not enough.

Some functions in inet6_(stream|dgram)_ops still access sk->sk_prot
without lock_sock() or rtnl_lock(), so they need READ_ONCE() to avoid
load tearing.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/sock.c          |  6 ++++--
 net/ipv4/af_inet.c       | 23 ++++++++++++++++-------
 net/ipv6/ipv6_sockglue.c |  4 ++--
 3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 788c1372663c..9c05637663bf 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3556,7 +3556,8 @@ int sock_common_getsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->getsockopt(sk, level, optname, optval, optlen);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	return READ_ONCE(sk->sk_prot)->getsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(sock_common_getsockopt);
 
@@ -3582,7 +3583,8 @@ int sock_common_setsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->setsockopt(sk, level, optname, optval, optlen);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	return READ_ONCE(sk->sk_prot)->setsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(sock_common_setsockopt);
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 3ca0cc467886..405fbad998df 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -558,22 +558,27 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
 		       int addr_len, int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 	int err;
 
 	if (addr_len < sizeof(uaddr->sa_family))
 		return -EINVAL;
+
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	prot = READ_ONCE(sk->sk_prot);
+
 	if (uaddr->sa_family == AF_UNSPEC)
-		return sk->sk_prot->disconnect(sk, flags);
+		return prot->disconnect(sk, flags);
 
 	if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) {
-		err = sk->sk_prot->pre_connect(sk, uaddr, addr_len);
+		err = prot->pre_connect(sk, uaddr, addr_len);
 		if (err)
 			return err;
 	}
 
 	if (data_race(!inet_sk(sk)->inet_num) && inet_autobind(sk))
 		return -EAGAIN;
-	return sk->sk_prot->connect(sk, uaddr, addr_len);
+	return prot->connect(sk, uaddr, addr_len);
 }
 EXPORT_SYMBOL(inet_dgram_connect);
 
@@ -734,10 +739,11 @@ EXPORT_SYMBOL(inet_stream_connect);
 int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern)
 {
-	struct sock *sk1 = sock->sk;
+	struct sock *sk1 = sock->sk, *sk2;
 	int err = -EINVAL;
-	struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err, kern);
 
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	sk2 = READ_ONCE(sk1->sk_prot)->accept(sk1, flags, &err, kern);
 	if (!sk2)
 		goto do_err;
 
@@ -825,12 +831,15 @@ ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 		      size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 
 	if (unlikely(inet_send_prepare(sk)))
 		return -EAGAIN;
 
-	if (sk->sk_prot->sendpage)
-		return sk->sk_prot->sendpage(sk, page, offset, size, flags);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	prot = READ_ONCE(sk->sk_prot);
+	if (prot->sendpage)
+		return prot->sendpage(sk, page, offset, size, flags);
 	return sock_no_sendpage(sock, page, offset, size, flags);
 }
 EXPORT_SYMBOL(inet_sendpage);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index d258c94745ca..2fb9ee413c53 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -477,7 +477,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sock_prot_inuse_add(net, sk->sk_prot, -1);
 				sock_prot_inuse_add(net, &tcp_prot, 1);
 
-				/* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */
+				/* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
 				WRITE_ONCE(sk->sk_prot, &tcp_prot);
 				icsk->icsk_af_ops = &ipv4_specific;
 				sk->sk_socket->ops = &inet_stream_ops;
@@ -492,7 +492,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sock_prot_inuse_add(net, sk->sk_prot, -1);
 				sock_prot_inuse_add(net, prot, 1);
 
-				/* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */
+				/* Paired with READ_ONCE(sk->sk_prot) in inet6_dgram_ops */
 				WRITE_ONCE(sk->sk_prot, prot);
 				sk->sk_socket->ops = &inet_dgram_ops;
 				sk->sk_family = PF_INET;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-10-04 17:17 [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2022-10-04 17:18 ` [PATCH v4 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
@ 2022-10-04 17:18 ` Kuniyuki Iwashima
  4 siblings, 0 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-04 17:18 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, syzbot

setsockopt(IPV6_ADDRFORM) and tcp_v6_connect() change icsk->icsk_af_ops
under lock_sock(), but tcp_(get|set)sockopt() read it locklessly.  To
avoid load/store tearing, we need to add READ_ONCE() and WRITE_ONCE()
for the reads and writes.

Thanks to Eric Dumazet for providing the syzbot report:

BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect

write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0:
tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240
__inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660
inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724
__sys_connect_file net/socket.c:1976 [inline]
__sys_connect+0x197/0x1b0 net/socket.c:1993
__do_sys_connect net/socket.c:2003 [inline]
__se_sys_connect net/socket.c:2000 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:2000
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1:
tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789
sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585
__sys_setsockopt+0x212/0x2b0 net/socket.c:2252
__do_sys_setsockopt net/socket.c:2263 [inline]
__se_sys_setsockopt net/socket.c:2260 [inline]
__x64_sys_setsockopt+0x62/0x70 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted
6.0.0-rc4-syzkaller-00331-g4ed9c1e971b1-dirty #0

Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 08/26/2022

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/tcp.c           | 10 ++++++----
 net/ipv6/ipv6_sockglue.c |  3 ++-
 net/ipv6/tcp_ipv6.c      |  6 ++++--
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 997a80ce1e13..08db82c05a4a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3797,8 +3797,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 
 	if (level != SOL_TCP)
-		return icsk->icsk_af_ops->setsockopt(sk, level, optname,
-						     optval, optlen);
+		/* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */
+		return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
+								optval, optlen);
 	return do_tcp_setsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(tcp_setsockopt);
@@ -4396,8 +4397,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
 	if (level != SOL_TCP)
-		return icsk->icsk_af_ops->getsockopt(sk, level, optname,
-						     optval, optlen);
+		/* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */
+		return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
+								optval, optlen);
 	return do_tcp_getsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(tcp_getsockopt);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 2fb9ee413c53..19ac75c2cd54 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 
 				/* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
 				WRITE_ONCE(sk->sk_prot, &tcp_prot);
-				icsk->icsk_af_ops = &ipv4_specific;
+				/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+				WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
 				sk->sk_socket->ops = &inet_stream_ops;
 				sk->sk_family = PF_INET;
 				tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e54eee80ce5f..8680aa83f0b9 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -237,7 +237,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 		sin.sin_port = usin->sin6_port;
 		sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
 
-		icsk->icsk_af_ops = &ipv6_mapped;
+		/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+		WRITE_ONCE(icsk->icsk_af_ops, &ipv6_mapped);
 		if (sk_is_mptcp(sk))
 			mptcpv6_handle_mapped(sk, true);
 		sk->sk_backlog_rcv = tcp_v4_do_rcv;
@@ -249,7 +250,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 
 		if (err) {
 			icsk->icsk_ext_hdr_len = exthdrlen;
-			icsk->icsk_af_ops = &ipv6_specific;
+			/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+			WRITE_ONCE(icsk->icsk_af_ops, &ipv6_specific);
 			if (sk_is_mptcp(sk))
 				mptcpv6_handle_mapped(sk, false);
 			sk->sk_backlog_rcv = tcp_v6_do_rcv;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-10-04 17:18 ` [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
@ 2022-10-06  9:19   ` Paolo Abeni
  2022-10-06 17:10     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2022-10-06  9:19 UTC (permalink / raw)
  To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
	David Ahern, Hideaki YOSHIFUJI
  Cc: Kuniyuki Iwashima, netdev, syzkaller-bugs, linux-kernel,
	Vladislav Yasevich

On Tue, 2022-10-04 at 10:18 -0700, Kuniyuki Iwashima wrote:
> Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> Add lockless sendmsg() support") added a lockless memory allocation path,
> which could cause a memory leak:
> 
> setsockopt(IPV6_ADDRFORM)                 sendmsg()
> +-----------------------+                 +-------+
> - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
>   - lock_sock(sk)                           ^._ called via udpv6_prot
>   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
>   - inet6_destroy_sock()
>   - release_sock(sk)                        - ip6_make_skb(sk, ...)
>                                               ^._ lockless fast path for
>                                                   the non-corking case
> 
>                                               - __ip6_append_data(sk, ...)
>                                                 - ipv6_local_rxpmtu(sk, ...)
>                                                   - xchg(&np->rxpmtu, skb)
>                                                     ^._ rxpmtu is never freed.
> 
>                                             - lock_sock(sk)
> 
> For now, rxpmtu is only the case, but not to miss the future change
> and a similar bug fixed in commit e27326009a3d ("net: ping6: Fix
> memleak in ipv6_renew_options()."), let's set a new function to IPv6
> sk->sk_destruct() and call inet6_cleanup_sock() there.  Since the
> conversion does not change sk->sk_destruct(), we can guarantee that
> we can clean up IPv6 resources finally.
> 
> We can now remove all inet6_destroy_sock() calls from IPv6 protocol
> specific ->destroy() functions, but such changes are invasive to
> backport.  So they can be posted as a follow-up later for net-next.
> 
> Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
> Cc: Vladislav Yasevich <vyasevic@redhat.com>
> ---
>  include/net/ipv6.h    |  1 +
>  include/net/udp.h     |  2 +-
>  include/net/udplite.h |  8 --------
>  net/ipv4/udp.c        |  9 ++++++---
>  net/ipv4/udplite.c    |  8 ++++++++
>  net/ipv6/af_inet6.c   |  9 ++++++++-
>  net/ipv6/udp.c        | 15 ++++++++++++++-
>  net/ipv6/udp_impl.h   |  1 +
>  net/ipv6/udplite.c    |  9 ++++++++-
>  9 files changed, 47 insertions(+), 15 deletions(-)
> 
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index dfa70789b771..e7ec3e8cd52e 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -1179,6 +1179,7 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
>  void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
>  
>  void inet6_cleanup_sock(struct sock *sk);
> +void inet6_sock_destruct(struct sock *sk);
>  int inet6_release(struct socket *sock);
>  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
>  int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 5ee88ddf79c3..fee053bcd17c 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
>  }
>  
>  /* net/ipv4/udp.c */
> -void udp_destruct_sock(struct sock *sk);
> +void udp_destruct_common(struct sock *sk);
>  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
>  int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
>  void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
> diff --git a/include/net/udplite.h b/include/net/udplite.h
> index 0143b373602e..299c14ce2bb9 100644
> --- a/include/net/udplite.h
> +++ b/include/net/udplite.h
> @@ -25,14 +25,6 @@ static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
>  	return copy_from_iter_full(to, len, &msg->msg_iter) ? 0 : -EFAULT;
>  }
>  
> -/* Designate sk as UDP-Lite socket */
> -static inline int udplite_sk_init(struct sock *sk)
> -{
> -	udp_init_sock(sk);
> -	udp_sk(sk)->pcflag = UDPLITE_BIT;
> -	return 0;
> -}
> -
>  /*
>   * 	Checksumming routines
>   */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 560d9eadeaa5..48adb418e404 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
>  
> -void udp_destruct_sock(struct sock *sk)
> +void udp_destruct_common(struct sock *sk)
>  {
>  	/* reclaim completely the forward allocated memory */
>  	struct udp_sock *up = udp_sk(sk);
> @@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
>  		kfree_skb(skb);
>  	}
>  	udp_rmem_release(sk, total, 0, true);
> +}
> +EXPORT_SYMBOL_GPL(udp_destruct_common);
>  
> +static void udp_destruct_sock(struct sock *sk)
> +{
> +	udp_destruct_common(sk);
>  	inet_sock_destruct(sk);
>  }
> -EXPORT_SYMBOL_GPL(udp_destruct_sock);
>  
>  int udp_init_sock(struct sock *sk)
>  {
> @@ -1622,7 +1626,6 @@ int udp_init_sock(struct sock *sk)
>  	sk->sk_destruct = udp_destruct_sock;
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(udp_init_sock);
>  
>  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
>  {
> diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
> index 6e08a76ae1e7..4785ac4a8719 100644
> --- a/net/ipv4/udplite.c
> +++ b/net/ipv4/udplite.c
> @@ -17,6 +17,14 @@
>  struct udp_table 	udplite_table __read_mostly;
>  EXPORT_SYMBOL(udplite_table);
>  
> +/* Designate sk as UDP-Lite socket */
> +static inline int udplite_sk_init(struct sock *sk)

You should avoid the 'inline' specifier in c files.

> +{
> +	udp_init_sock(sk);
> +	udp_sk(sk)->pcflag = UDPLITE_BIT;
> +	return 0;
> +}
> +
>  static int udplite_rcv(struct sk_buff *skb)
>  {
>  	return __udp4_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE);
> diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> index 83b9e432f3df..ce5378b78ec9 100644
> --- a/net/ipv6/af_inet6.c
> +++ b/net/ipv6/af_inet6.c
> @@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
>  	return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
>  }
>  
> +void inet6_sock_destruct(struct sock *sk)
> +{
> +	inet6_cleanup_sock(sk);
> +	inet_sock_destruct(sk);
> +}
> +EXPORT_SYMBOL_GPL(inet6_sock_destruct);

I'm sorry for not noticing this before, but it looks like the above
export is not needed? only used by udp, which is in the same binary
(either kernel of ipv6 module) as af_inet6


Cheers,

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  2022-10-06  9:19   ` Paolo Abeni
@ 2022-10-06 17:10     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2022-10-06 17:10 UTC (permalink / raw)
  To: pabeni
  Cc: davem, dsahern, edumazet, kuba, kuni1840, kuniyu, linux-kernel,
	netdev, syzkaller-bugs, vyasevic, yoshfuji

From:   Paolo Abeni <pabeni@redhat.com>
Date:   Thu, 06 Oct 2022 11:19:53 +0200
> On Tue, 2022-10-04 at 10:18 -0700, Kuniyuki Iwashima wrote:
> > Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> > able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> > IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> > Add lockless sendmsg() support") added a lockless memory allocation path,
> > which could cause a memory leak:
> > 
> > setsockopt(IPV6_ADDRFORM)                 sendmsg()
> > +-----------------------+                 +-------+
> > - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
> >   - lock_sock(sk)                           ^._ called via udpv6_prot
> >   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
> >   - inet6_destroy_sock()
> >   - release_sock(sk)                        - ip6_make_skb(sk, ...)
> >                                               ^._ lockless fast path for
> >                                                   the non-corking case
> > 
> >                                               - __ip6_append_data(sk, ...)
> >                                                 - ipv6_local_rxpmtu(sk, ...)
> >                                                   - xchg(&np->rxpmtu, skb)
> >                                                     ^._ rxpmtu is never freed.
> > 
> >                                             - lock_sock(sk)
> > 
> > For now, rxpmtu is only the case, but not to miss the future change
> > and a similar bug fixed in commit e27326009a3d ("net: ping6: Fix
> > memleak in ipv6_renew_options()."), let's set a new function to IPv6
> > sk->sk_destruct() and call inet6_cleanup_sock() there.  Since the
> > conversion does not change sk->sk_destruct(), we can guarantee that
> > we can clean up IPv6 resources finally.
> > 
> > We can now remove all inet6_destroy_sock() calls from IPv6 protocol
> > specific ->destroy() functions, but such changes are invasive to
> > backport.  So they can be posted as a follow-up later for net-next.
> > 
> > Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> > Cc: Vladislav Yasevich <vyasevic@redhat.com>
> > ---
> >  include/net/ipv6.h    |  1 +
> >  include/net/udp.h     |  2 +-
> >  include/net/udplite.h |  8 --------
> >  net/ipv4/udp.c        |  9 ++++++---
> >  net/ipv4/udplite.c    |  8 ++++++++
> >  net/ipv6/af_inet6.c   |  9 ++++++++-
> >  net/ipv6/udp.c        | 15 ++++++++++++++-
> >  net/ipv6/udp_impl.h   |  1 +
> >  net/ipv6/udplite.c    |  9 ++++++++-
> >  9 files changed, 47 insertions(+), 15 deletions(-)
> > 
> > diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> > index dfa70789b771..e7ec3e8cd52e 100644
> > --- a/include/net/ipv6.h
> > +++ b/include/net/ipv6.h
> > @@ -1179,6 +1179,7 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info);
> >  void ipv6_local_rxpmtu(struct sock *sk, struct flowi6 *fl6, u32 mtu);
> >  
> >  void inet6_cleanup_sock(struct sock *sk);
> > +void inet6_sock_destruct(struct sock *sk);
> >  int inet6_release(struct socket *sock);
> >  int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len);
> >  int inet6_getname(struct socket *sock, struct sockaddr *uaddr,
> > diff --git a/include/net/udp.h b/include/net/udp.h
> > index 5ee88ddf79c3..fee053bcd17c 100644
> > --- a/include/net/udp.h
> > +++ b/include/net/udp.h
> > @@ -247,7 +247,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
> >  }
> >  
> >  /* net/ipv4/udp.c */
> > -void udp_destruct_sock(struct sock *sk);
> > +void udp_destruct_common(struct sock *sk);
> >  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
> >  int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
> >  void udp_skb_destructor(struct sock *sk, struct sk_buff *skb);
> > diff --git a/include/net/udplite.h b/include/net/udplite.h
> > index 0143b373602e..299c14ce2bb9 100644
> > --- a/include/net/udplite.h
> > +++ b/include/net/udplite.h
> > @@ -25,14 +25,6 @@ static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
> >  	return copy_from_iter_full(to, len, &msg->msg_iter) ? 0 : -EFAULT;
> >  }
> >  
> > -/* Designate sk as UDP-Lite socket */
> > -static inline int udplite_sk_init(struct sock *sk)
> > -{
> > -	udp_init_sock(sk);
> > -	udp_sk(sk)->pcflag = UDPLITE_BIT;
> > -	return 0;
> > -}
> > -
> >  /*
> >   * 	Checksumming routines
> >   */
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 560d9eadeaa5..48adb418e404 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1598,7 +1598,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
> >  }
> >  EXPORT_SYMBOL_GPL(__udp_enqueue_schedule_skb);
> >  
> > -void udp_destruct_sock(struct sock *sk)
> > +void udp_destruct_common(struct sock *sk)
> >  {
> >  	/* reclaim completely the forward allocated memory */
> >  	struct udp_sock *up = udp_sk(sk);
> > @@ -1611,10 +1611,14 @@ void udp_destruct_sock(struct sock *sk)
> >  		kfree_skb(skb);
> >  	}
> >  	udp_rmem_release(sk, total, 0, true);
> > +}
> > +EXPORT_SYMBOL_GPL(udp_destruct_common);
> >  
> > +static void udp_destruct_sock(struct sock *sk)
> > +{
> > +	udp_destruct_common(sk);
> >  	inet_sock_destruct(sk);
> >  }
> > -EXPORT_SYMBOL_GPL(udp_destruct_sock);
> >  
> >  int udp_init_sock(struct sock *sk)
> >  {
> > @@ -1622,7 +1626,6 @@ int udp_init_sock(struct sock *sk)
> >  	sk->sk_destruct = udp_destruct_sock;
> >  	return 0;
> >  }
> > -EXPORT_SYMBOL_GPL(udp_init_sock);
> >  
> >  void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
> >  {
> > diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
> > index 6e08a76ae1e7..4785ac4a8719 100644
> > --- a/net/ipv4/udplite.c
> > +++ b/net/ipv4/udplite.c
> > @@ -17,6 +17,14 @@
> >  struct udp_table 	udplite_table __read_mostly;
> >  EXPORT_SYMBOL(udplite_table);
> >  
> > +/* Designate sk as UDP-Lite socket */
> > +static inline int udplite_sk_init(struct sock *sk)
> 
> You should avoid the 'inline' specifier in c files.

I'll fix it.


> > +{
> > +	udp_init_sock(sk);
> > +	udp_sk(sk)->pcflag = UDPLITE_BIT;
> > +	return 0;
> > +}
> > +
> >  static int udplite_rcv(struct sk_buff *skb)
> >  {
> >  	return __udp4_lib_rcv(skb, &udplite_table, IPPROTO_UDPLITE);
> > diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> > index 83b9e432f3df..ce5378b78ec9 100644
> > --- a/net/ipv6/af_inet6.c
> > +++ b/net/ipv6/af_inet6.c
> > @@ -109,6 +109,13 @@ static __inline__ struct ipv6_pinfo *inet6_sk_generic(struct sock *sk)
> >  	return (struct ipv6_pinfo *)(((u8 *)sk) + offset);
> >  }
> >  
> > +void inet6_sock_destruct(struct sock *sk)
> > +{
> > +	inet6_cleanup_sock(sk);
> > +	inet_sock_destruct(sk);
> > +}
> > +EXPORT_SYMBOL_GPL(inet6_sock_destruct);
> 
> I'm sorry for not noticing this before, but it looks like the above
> export is not needed? only used by udp, which is in the same binary
> (either kernel of ipv6 module) as af_inet6

Ah, please don't be sorry, I appreciate your review!
Exactly, it compiled without exporting the symbol, I'll remove it in v5.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-10-06 17:11 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-04 17:17 [PATCH v4 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
2022-10-04 17:17 ` [PATCH v4 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
2022-10-04 17:17 ` [PATCH v4 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
2022-10-04 17:18 ` [PATCH v4 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Kuniyuki Iwashima
2022-10-06  9:19   ` Paolo Abeni
2022-10-06 17:10     ` Kuniyuki Iwashima
2022-10-04 17:18 ` [PATCH v4 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
2022-10-04 17:18 ` [PATCH v4 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).