netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join()
@ 2021-07-23  3:57 Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 02/19] net/802/garp: fix memleak in garp_request_join() Sasha Levin
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Sasha Levin @ 2021-07-23  3:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Yang Yingliang, Hulk Robot, David S . Miller, Sasha Levin, netdev

From: Yang Yingliang <yangyingliang@huawei.com>

[ Upstream commit 996af62167d0e0ec69b938a3561e96f84ffff1aa ]

I got kmemleak report when doing fuzz test:

BUG: memory leak
unreferenced object 0xffff88810c239500 (size 64):
comm "syz-executor940", pid 882, jiffies 4294712870 (age 14.631s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 01 00 00 00 01 02 00 04 ................
backtrace:
[<00000000a323afa4>] slab_alloc_node mm/slub.c:2972 [inline]
[<00000000a323afa4>] slab_alloc mm/slub.c:2980 [inline]
[<00000000a323afa4>] __kmalloc+0x167/0x340 mm/slub.c:4130
[<000000005034ca11>] kmalloc include/linux/slab.h:595 [inline]
[<000000005034ca11>] mrp_attr_create net/802/mrp.c:276 [inline]
[<000000005034ca11>] mrp_request_join+0x265/0x550 net/802/mrp.c:530
[<00000000fcfd81f3>] vlan_mvrp_request_join+0x145/0x170 net/8021q/vlan_mvrp.c:40
[<000000009258546e>] vlan_dev_open+0x477/0x890 net/8021q/vlan_dev.c:292
[<0000000059acd82b>] __dev_open+0x281/0x410 net/core/dev.c:1609
[<000000004e6dc695>] __dev_change_flags+0x424/0x560 net/core/dev.c:8767
[<00000000471a09af>] rtnl_configure_link+0xd9/0x210 net/core/rtnetlink.c:3122
[<0000000037a4672b>] __rtnl_newlink+0xe08/0x13e0 net/core/rtnetlink.c:3448
[<000000008d5d0fda>] rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3488
[<000000004882fe39>] rtnetlink_rcv_msg+0x369/0xa10 net/core/rtnetlink.c:5552
[<00000000907e6c54>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504
[<00000000e7d7a8c4>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<00000000e7d7a8c4>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340
[<00000000e0645d50>] netlink_sendmsg+0x78e/0xc90 net/netlink/af_netlink.c:1929
[<00000000c24559b7>] sock_sendmsg_nosec net/socket.c:654 [inline]
[<00000000c24559b7>] sock_sendmsg+0x139/0x170 net/socket.c:674
[<00000000fc210bc2>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350
[<00000000be4577b5>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404

Calling mrp_request_leave() after mrp_request_join(), the attr->state
is set to MRP_APPLICANT_VO, mrp_attr_destroy() won't be called in last
TX event in mrp_uninit_applicant(), the attr of applicant will be leaked.
To fix this leak, iterate and free each attr of applicant before rerturning
from mrp_uninit_applicant().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/802/mrp.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/802/mrp.c b/net/802/mrp.c
index bea6e43d45a0..35e04cc5390c 100644
--- a/net/802/mrp.c
+++ b/net/802/mrp.c
@@ -292,6 +292,19 @@ static void mrp_attr_destroy(struct mrp_applicant *app, struct mrp_attr *attr)
 	kfree(attr);
 }
 
+static void mrp_attr_destroy_all(struct mrp_applicant *app)
+{
+	struct rb_node *node, *next;
+	struct mrp_attr *attr;
+
+	for (node = rb_first(&app->mad);
+	     next = node ? rb_next(node) : NULL, node != NULL;
+	     node = next) {
+		attr = rb_entry(node, struct mrp_attr, node);
+		mrp_attr_destroy(app, attr);
+	}
+}
+
 static int mrp_pdu_init(struct mrp_applicant *app)
 {
 	struct sk_buff *skb;
@@ -895,6 +908,7 @@ void mrp_uninit_applicant(struct net_device *dev, struct mrp_application *appl)
 
 	spin_lock_bh(&app->lock);
 	mrp_mad_event(app, MRP_EVENT_TX);
+	mrp_attr_destroy_all(app);
 	mrp_pdu_queue(app);
 	spin_unlock_bh(&app->lock);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.13 02/19] net/802/garp: fix memleak in garp_request_join()
  2021-07-23  3:57 [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join() Sasha Levin
@ 2021-07-23  3:57 ` Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 03/19] net: annotate data race around sk_ll_usec Sasha Levin
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-07-23  3:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Yang Yingliang, Hulk Robot, David S . Miller, Sasha Levin, netdev

From: Yang Yingliang <yangyingliang@huawei.com>

[ Upstream commit 42ca63f980842918560b25f0244307fd83b4777c ]

I got kmemleak report when doing fuzz test:

BUG: memory leak
unreferenced object 0xffff88810c909b80 (size 64):
  comm "syz", pid 957, jiffies 4295220394 (age 399.090s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 08 00 00 00 01 02 00 04  ................
  backtrace:
    [<00000000ca1f2e2e>] garp_request_join+0x285/0x3d0
    [<00000000bf153351>] vlan_gvrp_request_join+0x15b/0x190
    [<0000000024005e72>] vlan_dev_open+0x706/0x980
    [<00000000dc20c4d4>] __dev_open+0x2bb/0x460
    [<0000000066573004>] __dev_change_flags+0x501/0x650
    [<0000000035b42f83>] rtnl_configure_link+0xee/0x280
    [<00000000a5e69de0>] __rtnl_newlink+0xed5/0x1550
    [<00000000a5258f4a>] rtnl_newlink+0x66/0x90
    [<00000000506568ee>] rtnetlink_rcv_msg+0x439/0xbd0
    [<00000000b7eaeae1>] netlink_rcv_skb+0x14d/0x420
    [<00000000c373ce66>] netlink_unicast+0x550/0x750
    [<00000000ec74ce74>] netlink_sendmsg+0x88b/0xda0
    [<00000000381ff246>] sock_sendmsg+0xc9/0x120
    [<000000008f6a2db3>] ____sys_sendmsg+0x6e8/0x820
    [<000000008d9c1735>] ___sys_sendmsg+0x145/0x1c0
    [<00000000aa39dd8b>] __sys_sendmsg+0xfe/0x1d0

Calling garp_request_leave() after garp_request_join(), the attr->state
is set to GARP_APPLICANT_VO, garp_attr_destroy() won't be called in last
transmit event in garp_uninit_applicant(), the attr of applicant will be
leaked. To fix this leak, iterate and free each attr of applicant before
rerturning from garp_uninit_applicant().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/802/garp.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/802/garp.c b/net/802/garp.c
index 400bd857e5f5..f6012f8e59f0 100644
--- a/net/802/garp.c
+++ b/net/802/garp.c
@@ -203,6 +203,19 @@ static void garp_attr_destroy(struct garp_applicant *app, struct garp_attr *attr
 	kfree(attr);
 }
 
+static void garp_attr_destroy_all(struct garp_applicant *app)
+{
+	struct rb_node *node, *next;
+	struct garp_attr *attr;
+
+	for (node = rb_first(&app->gid);
+	     next = node ? rb_next(node) : NULL, node != NULL;
+	     node = next) {
+		attr = rb_entry(node, struct garp_attr, node);
+		garp_attr_destroy(app, attr);
+	}
+}
+
 static int garp_pdu_init(struct garp_applicant *app)
 {
 	struct sk_buff *skb;
@@ -609,6 +622,7 @@ void garp_uninit_applicant(struct net_device *dev, struct garp_application *appl
 
 	spin_lock_bh(&app->lock);
 	garp_gid_event(app, GARP_EVENT_TRANSMIT_PDU);
+	garp_attr_destroy_all(app);
 	garp_pdu_queue(app);
 	spin_unlock_bh(&app->lock);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.13 03/19] net: annotate data race around sk_ll_usec
  2021-07-23  3:57 [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join() Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 02/19] net/802/garp: fix memleak in garp_request_join() Sasha Levin
@ 2021-07-23  3:57 ` Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 04/19] sctp: move 198 addresses from unusable to private scope Sasha Levin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-07-23  3:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, syzbot, David S . Miller, Sasha Levin, netdev

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 0dbffbb5335a1e3aa6855e4ee317e25e669dd302 ]

sk_ll_usec is read locklessly from sk_can_busy_loop()
while another thread can change its value in sock_setsockopt()

This is correct but needs annotations.

BUG: KCSAN: data-race in __skb_try_recv_datagram / sock_setsockopt

write to 0xffff88814eb5f904 of 4 bytes by task 14011 on cpu 0:
 sock_setsockopt+0x1287/0x2090 net/core/sock.c:1175
 __sys_setsockopt+0x14f/0x200 net/socket.c:2100
 __do_sys_setsockopt net/socket.c:2115 [inline]
 __se_sys_setsockopt net/socket.c:2112 [inline]
 __x64_sys_setsockopt+0x62/0x70 net/socket.c:2112
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88814eb5f904 of 4 bytes by task 14001 on cpu 1:
 sk_can_busy_loop include/net/busy_poll.h:41 [inline]
 __skb_try_recv_datagram+0x14f/0x320 net/core/datagram.c:273
 unix_dgram_recvmsg+0x14c/0x870 net/unix/af_unix.c:2101
 unix_seqpacket_recvmsg+0x5a/0x70 net/unix/af_unix.c:2067
 ____sys_recvmsg+0x15d/0x310 include/linux/uio.h:244
 ___sys_recvmsg net/socket.c:2598 [inline]
 do_recvmmsg+0x35c/0x9f0 net/socket.c:2692
 __sys_recvmmsg net/socket.c:2771 [inline]
 __do_sys_recvmmsg net/socket.c:2794 [inline]
 __se_sys_recvmmsg net/socket.c:2787 [inline]
 __x64_sys_recvmmsg+0xcf/0x150 net/socket.c:2787
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000000 -> 0x00000101

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 14001 Comm: syz-executor.3 Not tainted 5.13.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/busy_poll.h | 2 +-
 net/core/sock.c         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index 73af4a64a599..40296ed976a9 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -38,7 +38,7 @@ static inline bool net_busy_loop_on(void)
 
 static inline bool sk_can_busy_loop(const struct sock *sk)
 {
-	return sk->sk_ll_usec && !signal_pending(current);
+	return READ_ONCE(sk->sk_ll_usec) && !signal_pending(current);
 }
 
 bool sk_busy_loop_end(void *p, unsigned long start_time);
diff --git a/net/core/sock.c b/net/core/sock.c
index 946888afef88..d27fdc2adf9c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1172,7 +1172,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 			if (val < 0)
 				ret = -EINVAL;
 			else
-				sk->sk_ll_usec = val;
+				WRITE_ONCE(sk->sk_ll_usec, val);
 		}
 		break;
 	case SO_PREFER_BUSY_POLL:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.13 04/19] sctp: move 198 addresses from unusable to private scope
  2021-07-23  3:57 [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join() Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 02/19] net/802/garp: fix memleak in garp_request_join() Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 03/19] net: annotate data race around sk_ll_usec Sasha Levin
@ 2021-07-23  3:57 ` Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2() Sasha Levin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues Sasha Levin
  4 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-07-23  3:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Xin Long, Sérgio, Marcelo Ricardo Leitner, David S . Miller,
	Sasha Levin, linux-sctp, netdev

From: Xin Long <lucien.xin@gmail.com>

[ Upstream commit 1d11fa231cabeae09a95cb3e4cf1d9dd34e00f08 ]

The doc draft-stewart-tsvwg-sctp-ipv4-00 that restricts 198 addresses
was never published. These addresses as private addresses should be
allowed to use in SCTP.

As Michael Tuexen suggested, this patch is to move 198 addresses from
unusable to private scope.

Reported-by: Sérgio <surkamp@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/sctp/constants.h | 4 +---
 net/sctp/protocol.c          | 3 ++-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
index 14a0d22c9113..bf23a2ed92da 100644
--- a/include/net/sctp/constants.h
+++ b/include/net/sctp/constants.h
@@ -342,8 +342,7 @@ enum {
 #define SCTP_SCOPE_POLICY_MAX	SCTP_SCOPE_POLICY_LINK
 
 /* Based on IPv4 scoping <draft-stewart-tsvwg-sctp-ipv4-00.txt>,
- * SCTP IPv4 unusable addresses: 0.0.0.0/8, 224.0.0.0/4, 198.18.0.0/24,
- * 192.88.99.0/24.
+ * SCTP IPv4 unusable addresses: 0.0.0.0/8, 224.0.0.0/4, 192.88.99.0/24.
  * Also, RFC 8.4, non-unicast addresses are not considered valid SCTP
  * addresses.
  */
@@ -351,7 +350,6 @@ enum {
 	((htonl(INADDR_BROADCAST) == a) ||  \
 	 ipv4_is_multicast(a) ||	    \
 	 ipv4_is_zeronet(a) ||		    \
-	 ipv4_is_test_198(a) ||		    \
 	 ipv4_is_anycast_6to4(a))
 
 /* Flags used for the bind address copy functions.  */
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 6f2bbfeec3a4..603049aa4950 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -393,7 +393,8 @@ static enum sctp_scope sctp_v4_scope(union sctp_addr *addr)
 		retval = SCTP_SCOPE_LINK;
 	} else if (ipv4_is_private_10(addr->v4.sin_addr.s_addr) ||
 		   ipv4_is_private_172(addr->v4.sin_addr.s_addr) ||
-		   ipv4_is_private_192(addr->v4.sin_addr.s_addr)) {
+		   ipv4_is_private_192(addr->v4.sin_addr.s_addr) ||
+		   ipv4_is_test_198(addr->v4.sin_addr.s_addr)) {
 		retval = SCTP_SCOPE_PRIVATE;
 	} else {
 		retval = SCTP_SCOPE_GLOBAL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2()
  2021-07-23  3:57 [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join() Sasha Levin
                   ` (2 preceding siblings ...)
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 04/19] sctp: move 198 addresses from unusable to private scope Sasha Levin
@ 2021-07-23  3:57 ` Sasha Levin
  2021-07-23  6:03   ` Vasily Averin
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues Sasha Levin
  4 siblings, 1 reply; 10+ messages in thread
From: Sasha Levin @ 2021-07-23  3:57 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Vasily Averin, David S . Miller, Sasha Levin, netdev

From: Vasily Averin <vvs@virtuozzo.com>

[ Upstream commit 5796015fa968a3349027a27dcd04c71d95c53ba5 ]

When TEE target mirrors traffic to another interface, sk_buff may
not have enough headroom to be processed correctly.
ip_finish_output2() detect this situation for ipv4 and allocates
new skb with enogh headroom. However ipv6 lacks this logic in
ip_finish_output2 and it leads to skb_under_panic:

 skbuff: skb_under_panic: text:ffffffffc0866ad4 len:96 put:24
 head:ffff97be85e31800 data:ffff97be85e317f8 tail:0x58 end:0xc0 dev:gre0
 ------------[ cut here ]------------
 kernel BUG at net/core/skbuff.c:110!
 invalid opcode: 0000 [#1] SMP PTI
 CPU: 2 PID: 393 Comm: kworker/2:2 Tainted: G           OE     5.13.0 #13
 Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.4 04/01/2014
 Workqueue: ipv6_addrconf addrconf_dad_work
 RIP: 0010:skb_panic+0x48/0x4a
 Call Trace:
  skb_push.cold.111+0x10/0x10
  ipgre_header+0x24/0xf0 [ip_gre]
  neigh_connected_output+0xae/0xf0
  ip6_finish_output2+0x1a8/0x5a0
  ip6_output+0x5c/0x110
  nf_dup_ipv6+0x158/0x1000 [nf_dup_ipv6]
  tee_tg6+0x2e/0x40 [xt_TEE]
  ip6t_do_table+0x294/0x470 [ip6_tables]
  nf_hook_slow+0x44/0xc0
  nf_hook.constprop.34+0x72/0xe0
  ndisc_send_skb+0x20d/0x2e0
  ndisc_send_ns+0xd1/0x210
  addrconf_dad_work+0x3c8/0x540
  process_one_work+0x1d1/0x370
  worker_thread+0x30/0x390
  kthread+0x116/0x130
  ret_from_fork+0x22/0x30

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv6/ip6_output.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ff4f9ebcf7f6..0efcb9b04151 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -60,10 +60,38 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
 {
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *dev = dst->dev;
+	unsigned int hh_len = LL_RESERVED_SPACE(dev);
+	int delta = hh_len - skb_headroom(skb);
 	const struct in6_addr *nexthop;
 	struct neighbour *neigh;
 	int ret;
 
+	/* Be paranoid, rather than too clever. */
+	if (unlikely(delta > 0) && dev->header_ops) {
+		/* pskb_expand_head() might crash, if skb is shared */
+		if (skb_shared(skb)) {
+			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+			if (likely(nskb)) {
+				if (skb->sk)
+					skb_set_owner_w(skb, skb->sk);
+				consume_skb(skb);
+			} else {
+				kfree_skb(skb);
+			}
+			skb = nskb;
+		}
+		if (skb &&
+		    pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC)) {
+			kfree_skb(skb);
+			skb = NULL;
+		}
+		if (!skb) {
+			IP6_INC_STATS(net, ip6_dst_idev(dst), IPSTATS_MIB_OUTDISCARDS);
+			return -ENOMEM;
+		}
+	}
+
 	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
 		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues
  2021-07-23  3:57 [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join() Sasha Levin
                   ` (3 preceding siblings ...)
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2() Sasha Levin
@ 2021-07-23  3:57 ` Sasha Levin
  2021-07-23 10:12   ` Íñigo Huguet
  4 siblings, 1 reply; 10+ messages in thread
From: Sasha Levin @ 2021-07-23  3:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Íñigo Huguet, David S . Miller, Sasha Levin, netdev, bpf

From: Íñigo Huguet <ihuguet@redhat.com>

[ Upstream commit 788bc000d4c2f25232db19ab3a0add0ba4e27671 ]

Commit 99ba0ea616aa ("sfc: adjust efx->xdp_tx_queue_count with the real
number of initialized queues") intended to fix a problem caused by a
round up when calculating the number of XDP channels and queues.
However, this was not the real problem. The real problem was that the
number of XDP TX queues had been reduced to half in
commit e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues"),
but the variable xdp_tx_queue_count had remained the same.

Once the correct number of XDP TX queues is created again in the
previous patch of this series, this also can be reverted since the error
doesn't actually exist.

Only in the case that there is a bug in the code we can have different
values in xdp_queue_number and efx->xdp_tx_queue_count. Because of this,
and per Edward Cree's suggestion, I add instead a WARN_ON to catch if it
happens again in the future.

Note that the number of allocated queues can be higher than the number
of used ones due to the round up, as explained in the existing comment
in the code. That's why we also have to stop increasing xdp_queue_number
beyond efx->xdp_tx_queue_count.

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/sfc/efx_channels.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c
index a3ca406a3561..bea0b27baf4b 100644
--- a/drivers/net/ethernet/sfc/efx_channels.c
+++ b/drivers/net/ethernet/sfc/efx_channels.c
@@ -891,18 +891,20 @@ int efx_set_channels(struct efx_nic *efx)
 			if (efx_channel_is_xdp_tx(channel)) {
 				efx_for_each_channel_tx_queue(tx_queue, channel) {
 					tx_queue->queue = next_queue++;
-					netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
-						  channel->channel, tx_queue->label,
-						  xdp_queue_number, tx_queue->queue);
+
 					/* We may have a few left-over XDP TX
 					 * queues owing to xdp_tx_queue_count
 					 * not dividing evenly by EFX_MAX_TXQ_PER_CHANNEL.
 					 * We still allocate and probe those
 					 * TXQs, but never use them.
 					 */
-					if (xdp_queue_number < efx->xdp_tx_queue_count)
+					if (xdp_queue_number < efx->xdp_tx_queue_count) {
+						netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
+							  channel->channel, tx_queue->label,
+							  xdp_queue_number, tx_queue->queue);
 						efx->xdp_tx_queues[xdp_queue_number] = tx_queue;
-					xdp_queue_number++;
+						xdp_queue_number++;
+					}
 				}
 			} else {
 				efx_for_each_channel_tx_queue(tx_queue, channel) {
@@ -914,8 +916,7 @@ int efx_set_channels(struct efx_nic *efx)
 			}
 		}
 	}
-	if (xdp_queue_number)
-		efx->xdp_tx_queue_count = xdp_queue_number;
+	WARN_ON(xdp_queue_number != efx->xdp_tx_queue_count);
 
 	rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
 	if (rc)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2()
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2() Sasha Levin
@ 2021-07-23  6:03   ` Vasily Averin
  2021-07-28 14:07     ` Sasha Levin
  0 siblings, 1 reply; 10+ messages in thread
From: Vasily Averin @ 2021-07-23  6:03 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, stable; +Cc: David S . Miller, netdev

this patch is incomplete, and requires following fixup
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d85a1b31dde84038ea07ad825c3d8d3e71f4344

Please backport it too.

Thank you,
	Vasily Averin

On 7/23/21 6:57 AM, Sasha Levin wrote:
> From: Vasily Averin <vvs@virtuozzo.com>
> 
> [ Upstream commit 5796015fa968a3349027a27dcd04c71d95c53ba5 ]
> 
> When TEE target mirrors traffic to another interface, sk_buff may
> not have enough headroom to be processed correctly.
> ip_finish_output2() detect this situation for ipv4 and allocates
> new skb with enogh headroom. However ipv6 lacks this logic in
> ip_finish_output2 and it leads to skb_under_panic:
> 
>  skbuff: skb_under_panic: text:ffffffffc0866ad4 len:96 put:24
>  head:ffff97be85e31800 data:ffff97be85e317f8 tail:0x58 end:0xc0 dev:gre0
>  ------------[ cut here ]------------
>  kernel BUG at net/core/skbuff.c:110!
>  invalid opcode: 0000 [#1] SMP PTI
>  CPU: 2 PID: 393 Comm: kworker/2:2 Tainted: G           OE     5.13.0 #13
>  Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.4 04/01/2014
>  Workqueue: ipv6_addrconf addrconf_dad_work
>  RIP: 0010:skb_panic+0x48/0x4a
>  Call Trace:
>   skb_push.cold.111+0x10/0x10
>   ipgre_header+0x24/0xf0 [ip_gre]
>   neigh_connected_output+0xae/0xf0
>   ip6_finish_output2+0x1a8/0x5a0
>   ip6_output+0x5c/0x110
>   nf_dup_ipv6+0x158/0x1000 [nf_dup_ipv6]
>   tee_tg6+0x2e/0x40 [xt_TEE]
>   ip6t_do_table+0x294/0x470 [ip6_tables]
>   nf_hook_slow+0x44/0xc0
>   nf_hook.constprop.34+0x72/0xe0
>   ndisc_send_skb+0x20d/0x2e0
>   ndisc_send_ns+0xd1/0x210
>   addrconf_dad_work+0x3c8/0x540
>   process_one_work+0x1d1/0x370
>   worker_thread+0x30/0x390
>   kthread+0x116/0x130
>   ret_from_fork+0x22/0x30
> 
> Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  net/ipv6/ip6_output.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index ff4f9ebcf7f6..0efcb9b04151 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -60,10 +60,38 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
>  {
>  	struct dst_entry *dst = skb_dst(skb);
>  	struct net_device *dev = dst->dev;
> +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
> +	int delta = hh_len - skb_headroom(skb);
>  	const struct in6_addr *nexthop;
>  	struct neighbour *neigh;
>  	int ret;
>  
> +	/* Be paranoid, rather than too clever. */
> +	if (unlikely(delta > 0) && dev->header_ops) {
> +		/* pskb_expand_head() might crash, if skb is shared */
> +		if (skb_shared(skb)) {
> +			struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
> +
> +			if (likely(nskb)) {
> +				if (skb->sk)
> +					skb_set_owner_w(skb, skb->sk);
> +				consume_skb(skb);
> +			} else {
> +				kfree_skb(skb);
> +			}
> +			skb = nskb;
> +		}
> +		if (skb &&
> +		    pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC)) {
> +			kfree_skb(skb);
> +			skb = NULL;
> +		}
> +		if (!skb) {
> +			IP6_INC_STATS(net, ip6_dst_idev(dst), IPSTATS_MIB_OUTDISCARDS);
> +			return -ENOMEM;
> +		}
> +	}
> +
>  	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
>  		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
>  
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues
  2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues Sasha Levin
@ 2021-07-23 10:12   ` Íñigo Huguet
  2021-07-23 10:18     ` Íñigo Huguet
  0 siblings, 1 reply; 10+ messages in thread
From: Íñigo Huguet @ 2021-07-23 10:12 UTC (permalink / raw)
  To: Sasha Levin; +Cc: linux-kernel, stable, David S . Miller, netdev, bpf

Hi,

On Fri, Jul 23, 2021 at 5:57 AM Sasha Levin <sashal@kernel.org> wrote:
>
> From: Íñigo Huguet <ihuguet@redhat.com>
>
> [ Upstream commit 788bc000d4c2f25232db19ab3a0add0ba4e27671 ]

Applying this commit alone could be problematic, leading even to
kernel crash in certain situations.

The real fix is the previous one of the series:
f43a24f446da sfc: fix lack of XDP TX queues - error XDP TX failed (-22)

This one can be applied too, but not really a must-have.

> Commit 99ba0ea616aa ("sfc: adjust efx->xdp_tx_queue_count with the real
> number of initialized queues") intended to fix a problem caused by a
> round up when calculating the number of XDP channels and queues.
> However, this was not the real problem. The real problem was that the
> number of XDP TX queues had been reduced to half in
> commit e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues"),
> but the variable xdp_tx_queue_count had remained the same.
>
> Once the correct number of XDP TX queues is created again in the
> previous patch of this series, this also can be reverted since the error
> doesn't actually exist.
>
> Only in the case that there is a bug in the code we can have different
> values in xdp_queue_number and efx->xdp_tx_queue_count. Because of this,
> and per Edward Cree's suggestion, I add instead a WARN_ON to catch if it
> happens again in the future.
>
> Note that the number of allocated queues can be higher than the number
> of used ones due to the round up, as explained in the existing comment
> in the code. That's why we also have to stop increasing xdp_queue_number
> beyond efx->xdp_tx_queue_count.
>
> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  drivers/net/ethernet/sfc/efx_channels.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c
> index a3ca406a3561..bea0b27baf4b 100644
> --- a/drivers/net/ethernet/sfc/efx_channels.c
> +++ b/drivers/net/ethernet/sfc/efx_channels.c
> @@ -891,18 +891,20 @@ int efx_set_channels(struct efx_nic *efx)
>                         if (efx_channel_is_xdp_tx(channel)) {
>                                 efx_for_each_channel_tx_queue(tx_queue, channel) {
>                                         tx_queue->queue = next_queue++;
> -                                       netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
> -                                                 channel->channel, tx_queue->label,
> -                                                 xdp_queue_number, tx_queue->queue);
> +
>                                         /* We may have a few left-over XDP TX
>                                          * queues owing to xdp_tx_queue_count
>                                          * not dividing evenly by EFX_MAX_TXQ_PER_CHANNEL.
>                                          * We still allocate and probe those
>                                          * TXQs, but never use them.
>                                          */
> -                                       if (xdp_queue_number < efx->xdp_tx_queue_count)
> +                                       if (xdp_queue_number < efx->xdp_tx_queue_count) {
> +                                               netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
> +                                                         channel->channel, tx_queue->label,
> +                                                         xdp_queue_number, tx_queue->queue);
>                                                 efx->xdp_tx_queues[xdp_queue_number] = tx_queue;
> -                                       xdp_queue_number++;
> +                                               xdp_queue_number++;
> +                                       }
>                                 }
>                         } else {
>                                 efx_for_each_channel_tx_queue(tx_queue, channel) {
> @@ -914,8 +916,7 @@ int efx_set_channels(struct efx_nic *efx)
>                         }
>                 }
>         }
> -       if (xdp_queue_number)
> -               efx->xdp_tx_queue_count = xdp_queue_number;
> +       WARN_ON(xdp_queue_number != efx->xdp_tx_queue_count);
>
>         rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
>         if (rc)
> --
> 2.30.2
>

-- 
Íñigo Huguet


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues
  2021-07-23 10:12   ` Íñigo Huguet
@ 2021-07-23 10:18     ` Íñigo Huguet
  0 siblings, 0 replies; 10+ messages in thread
From: Íñigo Huguet @ 2021-07-23 10:18 UTC (permalink / raw)
  To: Sasha Levin; +Cc: linux-kernel, stable, David S . Miller, netdev, bpf

On Fri, Jul 23, 2021 at 12:12 PM Íñigo Huguet <ihuguet@redhat.com> wrote:
> This one can be applied too, but not really a must-have.

Sorry, I have to correct myself. Both must be applied:
58e3bb77bf1b sfc: ensure correct number of XDP queues
f43a24f446da sfc: fix lack of XDP TX queues - error XDP TX failed (-22)

Otherwise, if there are some left-over TXQs because of round up,
xdp_tx_queue_count coud be set to a wrong value, higher than it should
be.

Regards
-- 
Íñigo Huguet


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2()
  2021-07-23  6:03   ` Vasily Averin
@ 2021-07-28 14:07     ` Sasha Levin
  0 siblings, 0 replies; 10+ messages in thread
From: Sasha Levin @ 2021-07-28 14:07 UTC (permalink / raw)
  To: Vasily Averin; +Cc: linux-kernel, stable, David S . Miller, netdev

On Fri, Jul 23, 2021 at 09:03:33AM +0300, Vasily Averin wrote:
>this patch is incomplete, and requires following fixup
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d85a1b31dde84038ea07ad825c3d8d3e71f4344

I've grabbed it too, thanks!

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-07-28 14:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-23  3:57 [PATCH AUTOSEL 5.13 01/19] net/802/mrp: fix memleak in mrp_request_join() Sasha Levin
2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 02/19] net/802/garp: fix memleak in garp_request_join() Sasha Levin
2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 03/19] net: annotate data race around sk_ll_usec Sasha Levin
2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 04/19] sctp: move 198 addresses from unusable to private scope Sasha Levin
2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 07/19] ipv6: allocate enough headroom in ip6_finish_output2() Sasha Levin
2021-07-23  6:03   ` Vasily Averin
2021-07-28 14:07     ` Sasha Levin
2021-07-23  3:57 ` [PATCH AUTOSEL 5.13 09/19] sfc: ensure correct number of XDP queues Sasha Levin
2021-07-23 10:12   ` Íñigo Huguet
2021-07-23 10:18     ` Íñigo Huguet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).