All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM.
@ 2022-09-27 16:12 Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:12 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

This series fixes some memory leaks and data races caused in the
same scenario where one thread converts an IPv6 socket into IPv4
with IPV6_ADDRFORM and another accesses the socket concurrently.

Note patch 1 and 5 conflict with these commits in net-next, respectively.

  * 24426654ed3a ("bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf")
  * 34704ef024ae ("bpf: net: Change do_tcp_getsockopt() to take the sockptr_t argument")


Kuniyuki Iwashima (5):
  tcp/udp: Fix memory leak in ipv6_renew_options().
  udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy().
  ipv6: Fix data races around sk->sk_prot.
  tcp: Fix data races around icsk->icsk_af_ops.

 net/core/sock.c          |  6 ++++--
 net/ipv4/af_inet.c       | 23 ++++++++++++++++-------
 net/ipv4/tcp.c           | 10 ++++++----
 net/ipv4/tcp_ipv4.c      |  5 +++++
 net/ipv4/udp.c           |  6 ++++++
 net/ipv6/ipv6_sockglue.c | 29 ++++++++++++++---------------
 net/ipv6/tcp_ipv6.c      |  1 -
 7 files changed, 51 insertions(+), 29 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v1 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options().
  2022-09-27 16:12 [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
@ 2022-09-27 16:12 ` Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:12 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, syzbot

syzbot reported a memory leak [0] related to IPV6_ADDRFORM.

The scenario is that while one thread is converting an IPv6 socket into
IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and
allocates memory to inet6_sk(sk)->XXX after conversion.

Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources,
which inet6_destroy_sock() should have cleaned up.

setsockopt(IPV6_ADDRFORM)                 setsockopt(IPV6_DSTOPTS)
+-----------------------+                 +----------------------+
- do_ipv6_setsockopt(sk, ...)
  - lock_sock(sk)                         - do_ipv6_setsockopt(sk, ...)
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)      ^._ called via tcpv6_prot
  - xchg(&np->opt, NULL)                        before WRITE_ONCE()
  - txopt_put(opt)
  - release_sock(sk)
                                            - lock_sock(sk)
                                            - ipv6_set_opt_hdr(sk, ...)
                                              - ipv6_update_options(sk, opt)
                                                - xchg(&inet6_sk(sk)->opt, opt)
                                                  ^._ opt is never freed.

                                            - release_sock(sk)

Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this
memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after
acquiring the lock.

This issue exists from the initial commit between IPV6_ADDRFORM and
IPV6_PKTOPTIONS.

[0]:
BUG: memory leak
unreferenced object 0xffff888009ab9f80 (size 96):
  comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s)
  hex dump (first 32 bytes):
    01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00  ....H...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline]
    [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566
    [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318
    [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline]
    [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668
    [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021
    [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789
    [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252
    [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline]
    [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline]
    [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260
    [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
    [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Note the syzbot is running on our EC2, so we don't have a URL or hash.
Also, there seems to be no similar report on the public syzkaller dashboard.
Thus the Reported-by address might be inappropriate.

Please let me know if we should keep it empty or use another email address.
---
 net/ipv6/ipv6_sockglue.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index e0dcc7a193df..b61066ac8648 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -419,6 +419,12 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		rtnl_lock();
 	lock_sock(sk);
 
+	/* Another thread has converted the socket into IPv4 with
+	 * IPV6_ADDRFORM concurrently.
+	 */
+	if (unlikely(sk->sk_family != AF_INET6))
+		goto unlock;
+
 	switch (optname) {
 
 	case IPV6_ADDRFORM:
@@ -994,6 +1000,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 	}
 
+unlock:
 	release_sock(sk);
 	if (needs_rtnl)
 		rtnl_unlock();
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  2022-09-27 16:12 [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
@ 2022-09-27 16:12 ` Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy() Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:12 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, Brian Haley

Commit 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support") forgot
to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
changed to udp_prot and ->destroy() never cleans it up, resulting in
a memory leak.

This is due to the discrepancy between inet6_destroy_sock() and
IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
to remove the difference.

However, this is not enough for now because rxpmtu can be changed
without lock_sock() after commit 03485f2adcde ("udpv6: Add lockless
sendmsg() support").  We will fix this case in the following patch.

Fixes: 4b340ae20d0e ("IPv6: Complete IPV6_DONTFRAG support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Cc: Brian Haley <brian.haley@hp.com>
---
 net/ipv6/ipv6_sockglue.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index b61066ac8648..030a4cf23ceb 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -431,9 +431,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		if (optlen < sizeof(int))
 			goto e_inval;
 		if (val == PF_INET) {
-			struct ipv6_txoptions *opt;
-			struct sk_buff *pktopt;
-
 			if (sk->sk_type == SOCK_RAW)
 				break;
 
@@ -464,7 +461,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				break;
 			}
 
-			fl6_free_socklist(sk);
 			__ipv6_sock_mc_close(sk);
 			__ipv6_sock_ac_close(sk);
 
@@ -501,14 +497,9 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sk->sk_socket->ops = &inet_dgram_ops;
 				sk->sk_family = PF_INET;
 			}
-			opt = xchg((__force struct ipv6_txoptions **)&np->opt,
-				   NULL);
-			if (opt) {
-				atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
-				txopt_put(opt);
-			}
-			pktopt = xchg(&np->pktoptions, NULL);
-			kfree_skb(pktopt);
+
+			np->rxopt.all = 0;
+			inet6_destroy_sock(sk);
 
 			/*
 			 * ... and add it to the refcnt debug socks count
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy().
  2022-09-27 16:12 [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
@ 2022-09-27 16:12 ` Kuniyuki Iwashima
  2022-09-27 16:50   ` Eric Dumazet
  2022-09-27 16:12 ` [PATCH v1 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
  4 siblings, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:12 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel, Vladislav Yasevich

Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
Add lockless sendmsg() support") added a lockless memory allocation path,
which could cause a memory leak:

setsockopt(IPV6_ADDRFORM)                 sendmsg()
+-----------------------+                 +-------+
- do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
  - lock_sock(sk)                           ^._ called via udpv6_prot
  - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
  - inet6_destroy_sock()
  - release_sock(sk)                        - ip6_make_skb(sk, ...)
                                              ^._ lockless fast path for
                                                  the non-corking case

                                              - __ip6_append_data(sk, ...)
                                                - ipv6_local_rxpmtu(sk, ...)
                                                  - xchg(&np->rxpmtu, skb)
                                                    ^._ rxpmtu is never freed.

                                            - lock_sock(sk)

For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
in both TCP/UDP v4 destroy functions not to miss the future change.

We can consolidate TCP/UDP v4/v6 destroy functions, but such changes
are too invasive to backport to stable.  So, they can be posted as a
follow-up later for net-next.

Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Cc: Vladislav Yasevich <vyasevic@redhat.com>
---
 net/ipv4/tcp_ipv4.c | 5 +++++
 net/ipv4/udp.c      | 6 ++++++
 net/ipv6/tcp_ipv6.c | 1 -
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5b019ba2b9d2..035b6c52a243 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2263,6 +2263,11 @@ void tcp_v4_destroy_sock(struct sock *sk)
 	tcp_saved_syn_free(tp);
 
 	sk_sockets_allocated_dec(sk);
+
+#if IS_ENABLED(CONFIG_IPV6)
+	if (sk->sk_prot_creator == &tcpv6_prot)
+		inet6_destroy_sock(sk);
+#endif
 }
 EXPORT_SYMBOL(tcp_v4_destroy_sock);
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 560d9eadeaa5..cdf131c0a819 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -115,6 +115,7 @@
 #include <net/udp_tunnel.h>
 #if IS_ENABLED(CONFIG_IPV6)
 #include <net/ipv6_stubs.h>
+#include <net/transp_v6.h>
 #endif
 
 struct udp_table udp_table __read_mostly;
@@ -2666,6 +2667,11 @@ void udp_destroy_sock(struct sock *sk)
 		if (up->encap_enabled)
 			static_branch_dec(&udp_encap_needed_key);
 	}
+
+#if IS_ENABLED(CONFIG_IPV6)
+	if (sk->sk_prot_creator == &udpv6_prot)
+		inet6_destroy_sock(sk);
+#endif
 }
 
 /*
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e54eee80ce5f..1ff6a92f7774 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1945,7 +1945,6 @@ static int tcp_v6_init_sock(struct sock *sk)
 static void tcp_v6_destroy_sock(struct sock *sk)
 {
 	tcp_v4_destroy_sock(sk);
-	inet6_destroy_sock(sk);
 }
 
 #ifdef CONFIG_PROC_FS
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 net 4/5] ipv6: Fix data races around sk->sk_prot.
  2022-09-27 16:12 [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2022-09-27 16:12 ` [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy() Kuniyuki Iwashima
@ 2022-09-27 16:12 ` Kuniyuki Iwashima
  2022-09-27 16:12 ` [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
  4 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:12 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

Commit 086d49058cd8 ("ipv6: annotate some data-races around sk->sk_prot")
fixed some data-races around sk->sk_prot but it was not enough.

Some functions in inet6_(stream|dgram)_ops still access sk->sk_prot
without lock_sock() or rtnl_lock(), so they need READ_ONCE() to avoid
load tearing.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/sock.c          |  6 ++++--
 net/ipv4/af_inet.c       | 23 ++++++++++++++++-------
 net/ipv6/ipv6_sockglue.c |  4 ++--
 3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 788c1372663c..9c05637663bf 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3556,7 +3556,8 @@ int sock_common_getsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->getsockopt(sk, level, optname, optval, optlen);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	return READ_ONCE(sk->sk_prot)->getsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(sock_common_getsockopt);
 
@@ -3582,7 +3583,8 @@ int sock_common_setsockopt(struct socket *sock, int level, int optname,
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->setsockopt(sk, level, optname, optval, optlen);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	return READ_ONCE(sk->sk_prot)->setsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(sock_common_setsockopt);
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 3ca0cc467886..405fbad998df 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -558,22 +558,27 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
 		       int addr_len, int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 	int err;
 
 	if (addr_len < sizeof(uaddr->sa_family))
 		return -EINVAL;
+
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	prot = READ_ONCE(sk->sk_prot);
+
 	if (uaddr->sa_family == AF_UNSPEC)
-		return sk->sk_prot->disconnect(sk, flags);
+		return prot->disconnect(sk, flags);
 
 	if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) {
-		err = sk->sk_prot->pre_connect(sk, uaddr, addr_len);
+		err = prot->pre_connect(sk, uaddr, addr_len);
 		if (err)
 			return err;
 	}
 
 	if (data_race(!inet_sk(sk)->inet_num) && inet_autobind(sk))
 		return -EAGAIN;
-	return sk->sk_prot->connect(sk, uaddr, addr_len);
+	return prot->connect(sk, uaddr, addr_len);
 }
 EXPORT_SYMBOL(inet_dgram_connect);
 
@@ -734,10 +739,11 @@ EXPORT_SYMBOL(inet_stream_connect);
 int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern)
 {
-	struct sock *sk1 = sock->sk;
+	struct sock *sk1 = sock->sk, *sk2;
 	int err = -EINVAL;
-	struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err, kern);
 
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	sk2 = READ_ONCE(sk1->sk_prot)->accept(sk1, flags, &err, kern);
 	if (!sk2)
 		goto do_err;
 
@@ -825,12 +831,15 @@ ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 		      size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
+	const struct proto *prot;
 
 	if (unlikely(inet_send_prepare(sk)))
 		return -EAGAIN;
 
-	if (sk->sk_prot->sendpage)
-		return sk->sk_prot->sendpage(sk, page, offset, size, flags);
+	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
+	prot = READ_ONCE(sk->sk_prot);
+	if (prot->sendpage)
+		return prot->sendpage(sk, page, offset, size, flags);
 	return sock_no_sendpage(sock, page, offset, size, flags);
 }
 EXPORT_SYMBOL(inet_sendpage);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 030a4cf23ceb..a89db5872dc3 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -477,7 +477,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sock_prot_inuse_add(net, sk->sk_prot, -1);
 				sock_prot_inuse_add(net, &tcp_prot, 1);
 
-				/* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */
+				/* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
 				WRITE_ONCE(sk->sk_prot, &tcp_prot);
 				icsk->icsk_af_ops = &ipv4_specific;
 				sk->sk_socket->ops = &inet_stream_ops;
@@ -492,7 +492,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				sock_prot_inuse_add(net, sk->sk_prot, -1);
 				sock_prot_inuse_add(net, prot, 1);
 
-				/* Paired with READ_ONCE(sk->sk_prot) in net/ipv6/af_inet6.c */
+				/* Paired with READ_ONCE(sk->sk_prot) in inet6_dgram_ops */
 				WRITE_ONCE(sk->sk_prot, prot);
 				sk->sk_socket->ops = &inet_dgram_ops;
 				sk->sk_family = PF_INET;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:12 [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2022-09-27 16:12 ` [PATCH v1 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
@ 2022-09-27 16:12 ` Kuniyuki Iwashima
  2022-09-27 16:39   ` Eric Dumazet
  4 siblings, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:12 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, syzkaller-bugs,
	linux-kernel

IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
and write.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/ipv4/tcp.c           | 10 ++++++----
 net/ipv6/ipv6_sockglue.c |  3 ++-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e373dde1f46f..c86dd0ccef5b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 
 	if (level != SOL_TCP)
-		return icsk->icsk_af_ops->setsockopt(sk, level, optname,
-						     optval, optlen);
+		/* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
+		return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
+								optval, optlen);
 	return do_tcp_setsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(tcp_setsockopt);
@@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
 	if (level != SOL_TCP)
-		return icsk->icsk_af_ops->getsockopt(sk, level, optname,
-						     optval, optlen);
+		/* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
+		return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
+								optval, optlen);
 	return do_tcp_getsockopt(sk, level, optname, optval, optlen);
 }
 EXPORT_SYMBOL(tcp_getsockopt);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index a89db5872dc3..726d95859898 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 
 				/* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
 				WRITE_ONCE(sk->sk_prot, &tcp_prot);
-				icsk->icsk_af_ops = &ipv4_specific;
+				/* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+				WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
 				sk->sk_socket->ops = &inet_stream_ops;
 				sk->sk_family = PF_INET;
 				tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:12 ` [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
@ 2022-09-27 16:39   ` Eric Dumazet
  2022-09-27 16:48     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2022-09-27 16:39 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	Kuniyuki Iwashima, netdev, syzkaller-bugs, LKML

On Tue, Sep 27, 2022 at 9:33 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
> tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
> tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
> and write.

I am pretty sure I have released a syzkaller bug recently with this issue.
Have you seen this?
If yes, please include the appropriate syzbot tag.


>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  net/ipv4/tcp.c           | 10 ++++++----
>  net/ipv6/ipv6_sockglue.c |  3 ++-
>  2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e373dde1f46f..c86dd0ccef5b 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
>         const struct inet_connection_sock *icsk = inet_csk(sk);
>
>         if (level != SOL_TCP)
> -               return icsk->icsk_af_ops->setsockopt(sk, level, optname,
> -                                                    optval, optlen);
> +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> +               return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
> +                                                               optval, optlen);
>         return do_tcp_setsockopt(sk, level, optname, optval, optlen);
>  }
>  EXPORT_SYMBOL(tcp_setsockopt);
> @@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
>         struct inet_connection_sock *icsk = inet_csk(sk);
>
>         if (level != SOL_TCP)
> -               return icsk->icsk_af_ops->getsockopt(sk, level, optname,
> -                                                    optval, optlen);
> +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> +               return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
> +                                                               optval, optlen);
>         return do_tcp_getsockopt(sk, level, optname, optval, optlen);
>  }
>  EXPORT_SYMBOL(tcp_getsockopt);
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index a89db5872dc3..726d95859898 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
>
>                                 /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
>                                 WRITE_ONCE(sk->sk_prot, &tcp_prot);
> -                               icsk->icsk_af_ops = &ipv4_specific;
> +                               /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
> +                               WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
>                                 sk->sk_socket->ops = &inet_stream_ops;
>                                 sk->sk_family = PF_INET;
>                                 tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:39   ` Eric Dumazet
@ 2022-09-27 16:48     ` Kuniyuki Iwashima
  2022-09-27 16:53       ` Kuniyuki Iwashima
  2022-09-27 16:55       ` Eric Dumazet
  0 siblings, 2 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:48 UTC (permalink / raw)
  To: edumazet
  Cc: davem, dsahern, kuba, kuni1840, kuniyu, linux-kernel, netdev,
	pabeni, syzkaller-bugs

From:   Eric Dumazet <edumazet@google.com>
Date:   Tue, 27 Sep 2022 09:39:37 -0700
> On Tue, Sep 27, 2022 at 9:33 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
> > tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
> > tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
> > and write.
> 
> I am pretty sure I have released a syzkaller bug recently with this issue.
> Have you seen this?
> If yes, please include the appropriate syzbot tag.

No, I haven't.
Could you provide the URL?
I'm happy to include the syzbot tag and KCSAN report in the changelog.


> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  net/ipv4/tcp.c           | 10 ++++++----
> >  net/ipv6/ipv6_sockglue.c |  3 ++-
> >  2 files changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index e373dde1f46f..c86dd0ccef5b 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
> >         const struct inet_connection_sock *icsk = inet_csk(sk);
> >
> >         if (level != SOL_TCP)
> > -               return icsk->icsk_af_ops->setsockopt(sk, level, optname,
> > -                                                    optval, optlen);
> > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > +               return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
> > +                                                               optval, optlen);
> >         return do_tcp_setsockopt(sk, level, optname, optval, optlen);
> >  }
> >  EXPORT_SYMBOL(tcp_setsockopt);
> > @@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
> >         struct inet_connection_sock *icsk = inet_csk(sk);
> >
> >         if (level != SOL_TCP)
> > -               return icsk->icsk_af_ops->getsockopt(sk, level, optname,
> > -                                                    optval, optlen);
> > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > +               return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
> > +                                                               optval, optlen);
> >         return do_tcp_getsockopt(sk, level, optname, optval, optlen);
> >  }
> >  EXPORT_SYMBOL(tcp_getsockopt);
> > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > index a89db5872dc3..726d95859898 100644
> > --- a/net/ipv6/ipv6_sockglue.c
> > +++ b/net/ipv6/ipv6_sockglue.c
> > @@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> >
> >                                 /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
> >                                 WRITE_ONCE(sk->sk_prot, &tcp_prot);
> > -                               icsk->icsk_af_ops = &ipv4_specific;
> > +                               /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
> > +                               WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
> >                                 sk->sk_socket->ops = &inet_stream_ops;
> >                                 sk->sk_family = PF_INET;
> >                                 tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> > --
> > 2.30.2
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy().
  2022-09-27 16:12 ` [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy() Kuniyuki Iwashima
@ 2022-09-27 16:50   ` Eric Dumazet
  2022-09-27 17:02     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2022-09-27 16:50 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, David Ahern,
	Kuniyuki Iwashima, netdev, syzkaller-bugs, LKML,
	Vladislav Yasevich

On Tue, Sep 27, 2022 at 9:13 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> Add lockless sendmsg() support") added a lockless memory allocation path,
> which could cause a memory leak:
>
> setsockopt(IPV6_ADDRFORM)                 sendmsg()
> +-----------------------+                 +-------+
> - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
>   - lock_sock(sk)                           ^._ called via udpv6_prot
>   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
>   - inet6_destroy_sock()
>   - release_sock(sk)                        - ip6_make_skb(sk, ...)
>                                               ^._ lockless fast path for
>                                                   the non-corking case
>
>                                               - __ip6_append_data(sk, ...)
>                                                 - ipv6_local_rxpmtu(sk, ...)
>                                                   - xchg(&np->rxpmtu, skb)
>                                                     ^._ rxpmtu is never freed.
>
>                                             - lock_sock(sk)
>
> For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
> in both TCP/UDP v4 destroy functions not to miss the future change.
>
> We can consolidate TCP/UDP v4/v6 destroy functions, but such changes
> are too invasive to backport to stable.  So, they can be posted as a
> follow-up later for net-next.
>
> Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
> Cc: Vladislav Yasevich <vyasevic@redhat.com>
> ---
>  net/ipv4/tcp_ipv4.c | 5 +++++
>  net/ipv4/udp.c      | 6 ++++++
>  net/ipv6/tcp_ipv6.c | 1 -
>  3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 5b019ba2b9d2..035b6c52a243 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2263,6 +2263,11 @@ void tcp_v4_destroy_sock(struct sock *sk)
>         tcp_saved_syn_free(tp);
>
>         sk_sockets_allocated_dec(sk);
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +       if (sk->sk_prot_creator == &tcpv6_prot)
> +               inet6_destroy_sock(sk);
> +#endif
>  }

This is ugly, and will not compile with CONFIG_IPV6=m, right ?


>  EXPORT_SYMBOL(tcp_v4_destroy_sock);
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 560d9eadeaa5..cdf131c0a819 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -115,6 +115,7 @@
>  #include <net/udp_tunnel.h>
>  #if IS_ENABLED(CONFIG_IPV6)
>  #include <net/ipv6_stubs.h>
> +#include <net/transp_v6.h>
>  #endif
>
>  struct udp_table udp_table __read_mostly;
> @@ -2666,6 +2667,11 @@ void udp_destroy_sock(struct sock *sk)
>                 if (up->encap_enabled)
>                         static_branch_dec(&udp_encap_needed_key);
>         }
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +       if (sk->sk_prot_creator == &udpv6_prot)
> +               inet6_destroy_sock(sk);
> +#endif
>  }
>
>  /*
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index e54eee80ce5f..1ff6a92f7774 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -1945,7 +1945,6 @@ static int tcp_v6_init_sock(struct sock *sk)
>  static void tcp_v6_destroy_sock(struct sock *sk)
>  {
>         tcp_v4_destroy_sock(sk);
> -       inet6_destroy_sock(sk);
>  }
>
>  #ifdef CONFIG_PROC_FS
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:48     ` Kuniyuki Iwashima
@ 2022-09-27 16:53       ` Kuniyuki Iwashima
  2022-09-27 17:56         ` Eric Dumazet
  2022-09-27 16:55       ` Eric Dumazet
  1 sibling, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 16:53 UTC (permalink / raw)
  To: kuniyu
  Cc: davem, dsahern, edumazet, kuba, kuni1840, linux-kernel, netdev,
	pabeni, syzkaller-bugs

From:   Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Tue, 27 Sep 2022 09:48:24 -0700
> From:   Eric Dumazet <edumazet@google.com>
> Date:   Tue, 27 Sep 2022 09:39:37 -0700
> > On Tue, Sep 27, 2022 at 9:33 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
> > > tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
> > > tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
> > > and write.
> > 
> > I am pretty sure I have released a syzkaller bug recently with this issue.
> > Have you seen this?
> > If yes, please include the appropriate syzbot tag.

Are you mentioning this commit ?

086d49058cd8 ("ipv6: annotate some data-races around sk->sk_prot")

Then, yes, I'll add syzbot tags to patch 4 and 5.


> 
> No, I haven't.
> Could you provide the URL?
> I'm happy to include the syzbot tag and KCSAN report in the changelog.
> 
> 
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > ---
> > >  net/ipv4/tcp.c           | 10 ++++++----
> > >  net/ipv6/ipv6_sockglue.c |  3 ++-
> > >  2 files changed, 8 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > index e373dde1f46f..c86dd0ccef5b 100644
> > > --- a/net/ipv4/tcp.c
> > > +++ b/net/ipv4/tcp.c
> > > @@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
> > >         const struct inet_connection_sock *icsk = inet_csk(sk);
> > >
> > >         if (level != SOL_TCP)
> > > -               return icsk->icsk_af_ops->setsockopt(sk, level, optname,
> > > -                                                    optval, optlen);
> > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > +               return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
> > > +                                                               optval, optlen);
> > >         return do_tcp_setsockopt(sk, level, optname, optval, optlen);
> > >  }
> > >  EXPORT_SYMBOL(tcp_setsockopt);
> > > @@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
> > >         struct inet_connection_sock *icsk = inet_csk(sk);
> > >
> > >         if (level != SOL_TCP)
> > > -               return icsk->icsk_af_ops->getsockopt(sk, level, optname,
> > > -                                                    optval, optlen);
> > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > +               return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
> > > +                                                               optval, optlen);
> > >         return do_tcp_getsockopt(sk, level, optname, optval, optlen);
> > >  }
> > >  EXPORT_SYMBOL(tcp_getsockopt);
> > > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > > index a89db5872dc3..726d95859898 100644
> > > --- a/net/ipv6/ipv6_sockglue.c
> > > +++ b/net/ipv6/ipv6_sockglue.c
> > > @@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> > >
> > >                                 /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
> > >                                 WRITE_ONCE(sk->sk_prot, &tcp_prot);
> > > -                               icsk->icsk_af_ops = &ipv4_specific;
> > > +                               /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
> > > +                               WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
> > >                                 sk->sk_socket->ops = &inet_stream_ops;
> > >                                 sk->sk_family = PF_INET;
> > >                                 tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> > > --
> > > 2.30.2
> > >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:48     ` Kuniyuki Iwashima
  2022-09-27 16:53       ` Kuniyuki Iwashima
@ 2022-09-27 16:55       ` Eric Dumazet
  2022-09-27 17:07         ` Kuniyuki Iwashima
  1 sibling, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2022-09-27 16:55 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David Miller, David Ahern, Jakub Kicinski, Kuniyuki Iwashima,
	LKML, netdev, Paolo Abeni, syzkaller-bugs

On Tue, Sep 27, 2022 at 9:48 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From:   Eric Dumazet <edumazet@google.com>
> Date:   Tue, 27 Sep 2022 09:39:37 -0700
> > On Tue, Sep 27, 2022 at 9:33 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
> > > tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
> > > tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
> > > and write.
> >
> > I am pretty sure I have released a syzkaller bug recently with this issue.
> > Have you seen this?
> > If yes, please include the appropriate syzbot tag.
>
> No, I haven't.
> Could you provide the URL?
> I'm happy to include the syzbot tag and KCSAN report in the changelog.
>
>

Report has been released 10 days ago, but apparently the syzbot queue
is so full these days that the report is still throttled.

==================================================================
BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect

write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0:
tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240
__inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660
inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724
__sys_connect_file net/socket.c:1976 [inline]
__sys_connect+0x197/0x1b0 net/socket.c:1993
__do_sys_connect net/socket.c:2003 [inline]
__se_sys_connect net/socket.c:2000 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:2000
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1:
tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789
sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585
__sys_setsockopt+0x212/0x2b0 net/socket.c:2252
__do_sys_setsockopt net/socket.c:2263 [inline]
__se_sys_setsockopt net/socket.c:2260 [inline]
__x64_sys_setsockopt+0x62/0x70 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted
6.0.0-rc4-syzkaller-00331-g4ed9c1e971b1-dirty #0

Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 08/26/2022
==================================================================

> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > ---
> > >  net/ipv4/tcp.c           | 10 ++++++----
> > >  net/ipv6/ipv6_sockglue.c |  3 ++-
> > >  2 files changed, 8 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > index e373dde1f46f..c86dd0ccef5b 100644
> > > --- a/net/ipv4/tcp.c
> > > +++ b/net/ipv4/tcp.c
> > > @@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
> > >         const struct inet_connection_sock *icsk = inet_csk(sk);
> > >
> > >         if (level != SOL_TCP)
> > > -               return icsk->icsk_af_ops->setsockopt(sk, level, optname,
> > > -                                                    optval, optlen);
> > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > +               return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
> > > +                                                               optval, optlen);
> > >         return do_tcp_setsockopt(sk, level, optname, optval, optlen);
> > >  }
> > >  EXPORT_SYMBOL(tcp_setsockopt);
> > > @@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
> > >         struct inet_connection_sock *icsk = inet_csk(sk);
> > >
> > >         if (level != SOL_TCP)
> > > -               return icsk->icsk_af_ops->getsockopt(sk, level, optname,
> > > -                                                    optval, optlen);
> > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > +               return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
> > > +                                                               optval, optlen);
> > >         return do_tcp_getsockopt(sk, level, optname, optval, optlen);
> > >  }
> > >  EXPORT_SYMBOL(tcp_getsockopt);
> > > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > > index a89db5872dc3..726d95859898 100644
> > > --- a/net/ipv6/ipv6_sockglue.c
> > > +++ b/net/ipv6/ipv6_sockglue.c
> > > @@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> > >
> > >                                 /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
> > >                                 WRITE_ONCE(sk->sk_prot, &tcp_prot);
> > > -                               icsk->icsk_af_ops = &ipv4_specific;
> > > +                               /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
> > > +                               WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
> > >                                 sk->sk_socket->ops = &inet_stream_ops;
> > >                                 sk->sk_family = PF_INET;
> > >                                 tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> > > --
> > > 2.30.2
> > >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy().
  2022-09-27 16:50   ` Eric Dumazet
@ 2022-09-27 17:02     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 17:02 UTC (permalink / raw)
  To: edumazet
  Cc: davem, dsahern, kuba, kuni1840, kuniyu, linux-kernel, netdev,
	pabeni, syzkaller-bugs, vyasevic

From:   Eric Dumazet <edumazet@google.com>
Date:   Tue, 27 Sep 2022 09:50:04 -0700
> On Tue, Sep 27, 2022 at 9:13 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
> > able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
> > IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2adcde ("udpv6:
> > Add lockless sendmsg() support") added a lockless memory allocation path,
> > which could cause a memory leak:
> >
> > setsockopt(IPV6_ADDRFORM)                 sendmsg()
> > +-----------------------+                 +-------+
> > - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
> >   - lock_sock(sk)                           ^._ called via udpv6_prot
> >   - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
> >   - inet6_destroy_sock()
> >   - release_sock(sk)                        - ip6_make_skb(sk, ...)
> >                                               ^._ lockless fast path for
> >                                                   the non-corking case
> >
> >                                               - __ip6_append_data(sk, ...)
> >                                                 - ipv6_local_rxpmtu(sk, ...)
> >                                                   - xchg(&np->rxpmtu, skb)
> >                                                     ^._ rxpmtu is never freed.
> >
> >                                             - lock_sock(sk)
> >
> > For now, rxpmtu is only the case, but let's call inet6_destroy_sock()
> > in both TCP/UDP v4 destroy functions not to miss the future change.
> >
> > We can consolidate TCP/UDP v4/v6 destroy functions, but such changes
> > are too invasive to backport to stable.  So, they can be posted as a
> > follow-up later for net-next.
> >
> > Fixes: 03485f2adcde ("udpv6: Add lockless sendmsg() support")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> > Cc: Vladislav Yasevich <vyasevic@redhat.com>
> > ---
> >  net/ipv4/tcp_ipv4.c | 5 +++++
> >  net/ipv4/udp.c      | 6 ++++++
> >  net/ipv6/tcp_ipv6.c | 1 -
> >  3 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 5b019ba2b9d2..035b6c52a243 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -2263,6 +2263,11 @@ void tcp_v4_destroy_sock(struct sock *sk)
> >         tcp_saved_syn_free(tp);
> >
> >         sk_sockets_allocated_dec(sk);
> > +
> > +#if IS_ENABLED(CONFIG_IPV6)
> > +       if (sk->sk_prot_creator == &tcpv6_prot)
> > +               inet6_destroy_sock(sk);
> > +#endif
> >  }
> 
> This is ugly, and will not compile with CONFIG_IPV6=m, right ?

Ah, exactly...

ld: net/ipv4/tcp_ipv4.o: in function `tcp_v4_destroy_sock':
/mnt/ec2-user/kernel/214_tcp_ipv6_renew_options_memleak/net/ipv4/tcp_ipv4.c:2290: undefined reference to `tcpv6_prot'
ld: /mnt/ec2-user/kernel/214_tcp_ipv6_renew_options_memleak/net/ipv4/tcp_ipv4.c:2291: undefined reference to `inet6_destroy_sock'
ld: net/ipv4/udp.o: in function `udp_destroy_sock':
/mnt/ec2-user/kernel/214_tcp_ipv6_renew_options_memleak/net/ipv4/udp.c:2660: undefined reference to `udpv6_prot'
ld: /mnt/ec2-user/kernel/214_tcp_ipv6_renew_options_memleak/net/ipv4/udp.c:2661: undefined reference to `inet6_destroy_sock'

So, do we have to move these 4 under net/ipv4/ with #ifdef CONFIG_IPv6 ?


> >  EXPORT_SYMBOL(tcp_v4_destroy_sock);
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 560d9eadeaa5..cdf131c0a819 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -115,6 +115,7 @@
> >  #include <net/udp_tunnel.h>
> >  #if IS_ENABLED(CONFIG_IPV6)
> >  #include <net/ipv6_stubs.h>
> > +#include <net/transp_v6.h>
> >  #endif
> >
> >  struct udp_table udp_table __read_mostly;
> > @@ -2666,6 +2667,11 @@ void udp_destroy_sock(struct sock *sk)
> >                 if (up->encap_enabled)
> >                         static_branch_dec(&udp_encap_needed_key);
> >         }
> > +
> > +#if IS_ENABLED(CONFIG_IPV6)
> > +       if (sk->sk_prot_creator == &udpv6_prot)
> > +               inet6_destroy_sock(sk);
> > +#endif
> >  }
> >
> >  /*
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index e54eee80ce5f..1ff6a92f7774 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -1945,7 +1945,6 @@ static int tcp_v6_init_sock(struct sock *sk)
> >  static void tcp_v6_destroy_sock(struct sock *sk)
> >  {
> >         tcp_v4_destroy_sock(sk);
> > -       inet6_destroy_sock(sk);
> >  }
> >
> >  #ifdef CONFIG_PROC_FS
> > --
> > 2.30.2
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:55       ` Eric Dumazet
@ 2022-09-27 17:07         ` Kuniyuki Iwashima
  0 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2022-09-27 17:07 UTC (permalink / raw)
  To: edumazet
  Cc: davem, dsahern, kuba, kuni1840, kuniyu, linux-kernel, netdev,
	pabeni, syzkaller-bugs

From:   Eric Dumazet <edumazet@google.com>
Date:   Tue, 27 Sep 2022 09:55:03 -0700
> On Tue, Sep 27, 2022 at 9:48 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From:   Eric Dumazet <edumazet@google.com>
> > Date:   Tue, 27 Sep 2022 09:39:37 -0700
> > > On Tue, Sep 27, 2022 at 9:33 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > >
> > > > IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
> > > > tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
> > > > tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
> > > > and write.
> > >
> > > I am pretty sure I have released a syzkaller bug recently with this issue.
> > > Have you seen this?
> > > If yes, please include the appropriate syzbot tag.
> >
> > No, I haven't.
> > Could you provide the URL?
> > I'm happy to include the syzbot tag and KCSAN report in the changelog.
> >
> >
> 
> Report has been released 10 days ago, but apparently the syzbot queue
> is so full these days that the report is still throttled.

Thank you!
I'll add this in v2.


> 
> ==================================================================
> BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect
> 
> write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0:
> tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240
> __inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660
> inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724
> __sys_connect_file net/socket.c:1976 [inline]
> __sys_connect+0x197/0x1b0 net/socket.c:1993
> __do_sys_connect net/socket.c:2003 [inline]
> __se_sys_connect net/socket.c:2000 [inline]
> __x64_sys_connect+0x3d/0x50 net/socket.c:2000
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1:
> tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789
> sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585
> __sys_setsockopt+0x212/0x2b0 net/socket.c:2252
> __do_sys_setsockopt net/socket.c:2263 [inline]
> __se_sys_setsockopt net/socket.c:2260 [inline]
> __x64_sys_setsockopt+0x62/0x70 net/socket.c:2260
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted
> 6.0.0-rc4-syzkaller-00331-g4ed9c1e971b1-dirty #0
> 
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 08/26/2022
> ==================================================================
> 
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > > ---
> > > >  net/ipv4/tcp.c           | 10 ++++++----
> > > >  net/ipv6/ipv6_sockglue.c |  3 ++-
> > > >  2 files changed, 8 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > > index e373dde1f46f..c86dd0ccef5b 100644
> > > > --- a/net/ipv4/tcp.c
> > > > +++ b/net/ipv4/tcp.c
> > > > @@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
> > > >         const struct inet_connection_sock *icsk = inet_csk(sk);
> > > >
> > > >         if (level != SOL_TCP)
> > > > -               return icsk->icsk_af_ops->setsockopt(sk, level, optname,
> > > > -                                                    optval, optlen);
> > > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > > +               return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
> > > > +                                                               optval, optlen);
> > > >         return do_tcp_setsockopt(sk, level, optname, optval, optlen);
> > > >  }
> > > >  EXPORT_SYMBOL(tcp_setsockopt);
> > > > @@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
> > > >         struct inet_connection_sock *icsk = inet_csk(sk);
> > > >
> > > >         if (level != SOL_TCP)
> > > > -               return icsk->icsk_af_ops->getsockopt(sk, level, optname,
> > > > -                                                    optval, optlen);
> > > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > > +               return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
> > > > +                                                               optval, optlen);
> > > >         return do_tcp_getsockopt(sk, level, optname, optval, optlen);
> > > >  }
> > > >  EXPORT_SYMBOL(tcp_getsockopt);
> > > > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > > > index a89db5872dc3..726d95859898 100644
> > > > --- a/net/ipv6/ipv6_sockglue.c
> > > > +++ b/net/ipv6/ipv6_sockglue.c
> > > > @@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> > > >
> > > >                                 /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
> > > >                                 WRITE_ONCE(sk->sk_prot, &tcp_prot);
> > > > -                               icsk->icsk_af_ops = &ipv4_specific;
> > > > +                               /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
> > > > +                               WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
> > > >                                 sk->sk_socket->ops = &inet_stream_ops;
> > > >                                 sk->sk_family = PF_INET;
> > > >                                 tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> > > > --
> > > > 2.30.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops.
  2022-09-27 16:53       ` Kuniyuki Iwashima
@ 2022-09-27 17:56         ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2022-09-27 17:56 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David Miller, David Ahern, Jakub Kicinski, Kuniyuki Iwashima,
	LKML, netdev, Paolo Abeni, syzkaller-bugs

On Tue, Sep 27, 2022 at 10:49 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From:   Kuniyuki Iwashima <kuniyu@amazon.com>
> Date:   Tue, 27 Sep 2022 09:48:24 -0700
> > From:   Eric Dumazet <edumazet@google.com>
> > Date:   Tue, 27 Sep 2022 09:39:37 -0700
> > > On Tue, Sep 27, 2022 at 9:33 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > >
> > > > IPV6_ADDRFORM changes icsk->icsk_af_ops under lock_sock(), but
> > > > tcp_(get|set)sockopt() read it locklessly.  To avoid load/store
> > > > tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads
> > > > and write.
> > >
> > > I am pretty sure I have released a syzkaller bug recently with this issue.
> > > Have you seen this?
> > > If yes, please include the appropriate syzbot tag.
>
> Are you mentioning this commit ?
>

No, this is a new syzbot report, with a different stack trace.

> 086d49058cd8 ("ipv6: annotate some data-races around sk->sk_prot")
>
> Then, yes, I'll add syzbot tags to patch 4 and 5.
>
>
> >
> > No, I haven't.
> > Could you provide the URL?
> > I'm happy to include the syzbot tag and KCSAN report in the changelog.
> >
> >
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > > ---
> > > >  net/ipv4/tcp.c           | 10 ++++++----
> > > >  net/ipv6/ipv6_sockglue.c |  3 ++-
> > > >  2 files changed, 8 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > > index e373dde1f46f..c86dd0ccef5b 100644
> > > > --- a/net/ipv4/tcp.c
> > > > +++ b/net/ipv4/tcp.c
> > > > @@ -3795,8 +3795,9 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
> > > >         const struct inet_connection_sock *icsk = inet_csk(sk);
> > > >
> > > >         if (level != SOL_TCP)
> > > > -               return icsk->icsk_af_ops->setsockopt(sk, level, optname,
> > > > -                                                    optval, optlen);
> > > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > > +               return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
> > > > +                                                               optval, optlen);
> > > >         return do_tcp_setsockopt(sk, level, optname, optval, optlen);
> > > >  }
> > > >  EXPORT_SYMBOL(tcp_setsockopt);
> > > > @@ -4394,8 +4395,9 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
> > > >         struct inet_connection_sock *icsk = inet_csk(sk);
> > > >
> > > >         if (level != SOL_TCP)
> > > > -               return icsk->icsk_af_ops->getsockopt(sk, level, optname,
> > > > -                                                    optval, optlen);
> > > > +               /* IPV6_ADDRFORM can change icsk->icsk_af_ops under us. */
> > > > +               return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
> > > > +                                                               optval, optlen);
> > > >         return do_tcp_getsockopt(sk, level, optname, optval, optlen);
> > > >  }
> > > >  EXPORT_SYMBOL(tcp_getsockopt);
> > > > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > > > index a89db5872dc3..726d95859898 100644
> > > > --- a/net/ipv6/ipv6_sockglue.c
> > > > +++ b/net/ipv6/ipv6_sockglue.c
> > > > @@ -479,7 +479,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> > > >
> > > >                                 /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
> > > >                                 WRITE_ONCE(sk->sk_prot, &tcp_prot);
> > > > -                               icsk->icsk_af_ops = &ipv4_specific;
> > > > +                               /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
> > > > +                               WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
> > > >                                 sk->sk_socket->ops = &inet_stream_ops;
> > > >                                 sk->sk_family = PF_INET;
> > > >                                 tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> > > > --
> > > > 2.30.2
> > > >

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-09-27 18:16 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-27 16:12 [PATCH v1 net 0/5] tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM Kuniyuki Iwashima
2022-09-27 16:12 ` [PATCH v1 net 1/5] tcp/udp: Fix memory leak in ipv6_renew_options() Kuniyuki Iwashima
2022-09-27 16:12 ` [PATCH v1 net 2/5] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Kuniyuki Iwashima
2022-09-27 16:12 ` [PATCH v1 net 3/5] tcp/udp: Call inet6_destroy_sock() in IPv4 sk_prot->destroy() Kuniyuki Iwashima
2022-09-27 16:50   ` Eric Dumazet
2022-09-27 17:02     ` Kuniyuki Iwashima
2022-09-27 16:12 ` [PATCH v1 net 4/5] ipv6: Fix data races around sk->sk_prot Kuniyuki Iwashima
2022-09-27 16:12 ` [PATCH v1 net 5/5] tcp: Fix data races around icsk->icsk_af_ops Kuniyuki Iwashima
2022-09-27 16:39   ` Eric Dumazet
2022-09-27 16:48     ` Kuniyuki Iwashima
2022-09-27 16:53       ` Kuniyuki Iwashima
2022-09-27 17:56         ` Eric Dumazet
2022-09-27 16:55       ` Eric Dumazet
2022-09-27 17:07         ` Kuniyuki Iwashima

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.