* [PATCH net] tipc: fix a race in tipc_sk_mcast_rcv
@ 2021-04-28 19:30 Xin Long
2021-04-29 2:18 ` [tipc-discussion] " Tung Quang Nguyen
2021-05-13 21:15 ` Jon Maloy
0 siblings, 2 replies; 3+ messages in thread
From: Xin Long @ 2021-04-28 19:30 UTC (permalink / raw)
To: network dev, tipc-discussion; +Cc: davem, kuba, Jon Maloy, Ying Xue, lyl2019
After commit cb1b728096f5 ("tipc: eliminate race condition at multicast
reception"), when processing multicast reception, the packets are first
moved from be->inputq1 to be->arrvq in tipc_node_broadcast(), and
be->arrvq is then processed in tipc_sk_mcast_rcv().
In tipc_sk_mcast_rcv(), the first skb is fetched with skb_peek() and then
processed without holding any lock. This means another thread could
concurrently call tipc_sk_mcast_rcv(), process be->arrvq, pick up the
same skb, and free it, causing a double free, as Li Shuang reported:
[] kernel BUG at mm/slub.c:305!
[] kfree+0x3a7/0x3d0
[] kfree_skb+0x32/0xa0
[] skb_release_data+0xb4/0x170
[] kfree_skb+0x32/0xa0
[] skb_release_data+0xb4/0x170
[] kfree_skb+0x32/0xa0
[] tipc_sk_mcast_rcv+0x1fa/0x380 [tipc]
[] tipc_rcv+0x411/0x1120 [tipc]
[] tipc_udp_recv+0xc6/0x1e0 [tipc]
[] udp_queue_rcv_one_skb+0x1a9/0x500
[] udp_unicast_rcv_skb.isra.66+0x75/0x90
[] __udp4_lib_rcv+0x537/0xc40
[] ip_protocol_deliver_rcu+0xdf/0x1d0
[] ip_local_deliver_finish+0x4a/0x50
[] ip_local_deliver+0x6b/0xe0
[] ip_rcv+0x27b/0x36a
[] __netif_receive_skb_core+0xb47/0xc40
[] process_backlog+0xae/0x160
Commit 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
tried to fix this double free by not releasing the skbs in be->arrvq,
but that simply leaks the skbs instead.
The real problem is that the skbs in be->arrvq must not be processed
without a lock held all the way from peeking them to dequeuing them.
Fix this by using a temporary skb list instead of be->arrvq, making the
processing "per thread safe". While at it, remove the now-unused
be->arrvq.
Fixes: cb1b728096f5 ("tipc: eliminate race condition at multicast reception")
Fixes: 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/tipc/node.c | 9 ++++-----
net/tipc/socket.c | 16 +++-------------
2 files changed, 7 insertions(+), 18 deletions(-)
diff --git a/net/tipc/node.c b/net/tipc/node.c
index e0ee832..0c636fb 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -72,7 +72,6 @@ struct tipc_link_entry {
struct tipc_bclink_entry {
struct tipc_link *link;
struct sk_buff_head inputq1;
- struct sk_buff_head arrvq;
struct sk_buff_head inputq2;
struct sk_buff_head namedq;
u16 named_rcv_nxt;
@@ -552,7 +551,6 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
INIT_LIST_HEAD(&n->conn_sks);
skb_queue_head_init(&n->bc_entry.namedq);
skb_queue_head_init(&n->bc_entry.inputq1);
- __skb_queue_head_init(&n->bc_entry.arrvq);
skb_queue_head_init(&n->bc_entry.inputq2);
for (i = 0; i < MAX_BEARERS; i++)
spin_lock_init(&n->links[i].lock);
@@ -1803,14 +1801,15 @@ void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int rc_dests)
static void tipc_node_mcast_rcv(struct tipc_node *n)
{
struct tipc_bclink_entry *be = &n->bc_entry;
+ struct sk_buff_head tmpq;
- /* 'arrvq' is under inputq2's lock protection */
+ __skb_queue_head_init(&tmpq);
spin_lock_bh(&be->inputq2.lock);
spin_lock_bh(&be->inputq1.lock);
- skb_queue_splice_tail_init(&be->inputq1, &be->arrvq);
+ skb_queue_splice_tail_init(&be->inputq1, &tmpq);
spin_unlock_bh(&be->inputq1.lock);
spin_unlock_bh(&be->inputq2.lock);
- tipc_sk_mcast_rcv(n->net, &be->arrvq, &be->inputq2);
+ tipc_sk_mcast_rcv(n->net, &tmpq, &be->inputq2);
}
static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 022999e..2870798 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1210,8 +1210,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
__skb_queue_head_init(&tmpq);
INIT_LIST_HEAD(&dports);
- skb = tipc_skb_peek(arrvq, &inputq->lock);
- for (; skb; skb = tipc_skb_peek(arrvq, &inputq->lock)) {
+ while ((skb = __skb_dequeue(arrvq)) != NULL) {
hdr = buf_msg(skb);
user = msg_user(hdr);
mtyp = msg_type(hdr);
@@ -1220,13 +1219,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
type = msg_nametype(hdr);
if (mtyp == TIPC_GRP_UCAST_MSG || user == GROUP_PROTOCOL) {
- spin_lock_bh(&inputq->lock);
- if (skb_peek(arrvq) == skb) {
- __skb_dequeue(arrvq);
- __skb_queue_tail(inputq, skb);
- }
- kfree_skb(skb);
- spin_unlock_bh(&inputq->lock);
+ skb_queue_tail(inputq, skb);
continue;
}
@@ -1263,10 +1256,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
}
/* Append to inputq if not already done by other thread */
spin_lock_bh(&inputq->lock);
- if (skb_peek(arrvq) == skb) {
- skb_queue_splice_tail_init(&tmpq, inputq);
- __skb_dequeue(arrvq);
- }
+ skb_queue_splice_tail_init(&tmpq, inputq);
spin_unlock_bh(&inputq->lock);
__skb_queue_purge(&tmpq);
kfree_skb(skb);
--
2.1.0
* RE: [tipc-discussion] [PATCH net] tipc: fix a race in tipc_sk_mcast_rcv
2021-04-28 19:30 [PATCH net] tipc: fix a race in tipc_sk_mcast_rcv Xin Long
@ 2021-04-29 2:18 ` Tung Quang Nguyen
2021-05-13 21:15 ` Jon Maloy
1 sibling, 0 replies; 3+ messages in thread
From: Tung Quang Nguyen @ 2021-04-29 2:18 UTC (permalink / raw)
To: Xin Long, network dev, tipc-discussion; +Cc: kuba, lyl2019, davem
Hi Xin,
See my comments inline.
Best regards,
Tung Nguyen
-----Original Message-----
From: Xin Long <lucien.xin@gmail.com>
Sent: Thursday, April 29, 2021 2:31 AM
To: network dev <netdev@vger.kernel.org>; tipc-discussion@lists.sourceforge.net
Cc: kuba@kernel.org; lyl2019@mail.ustc.edu.cn; davem@davemloft.net
Subject: [tipc-discussion] [PATCH net] tipc: fix a race in tipc_sk_mcast_rcv
After commit cb1b728096f5 ("tipc: eliminate race condition at multicast
reception"), when processing multicast reception, the packets are first
moved from be->inputq1 to be->arrvq in tipc_node_broadcast(), and
be->arrvq is then processed in tipc_sk_mcast_rcv().
In tipc_sk_mcast_rcv(), the first skb is fetched with skb_peek() and then
processed without holding any lock. This means another thread could
concurrently call tipc_sk_mcast_rcv(), process be->arrvq, pick up the
same skb, and free it, causing a double free, as Li Shuang reported:
[] kernel BUG at mm/slub.c:305!
[] kfree+0x3a7/0x3d0
[] kfree_skb+0x32/0xa0
[] skb_release_data+0xb4/0x170
[] kfree_skb+0x32/0xa0
[] skb_release_data+0xb4/0x170
[] kfree_skb+0x32/0xa0
[] tipc_sk_mcast_rcv+0x1fa/0x380 [tipc]
[] tipc_rcv+0x411/0x1120 [tipc]
[] tipc_udp_recv+0xc6/0x1e0 [tipc]
[] udp_queue_rcv_one_skb+0x1a9/0x500
[] udp_unicast_rcv_skb.isra.66+0x75/0x90
[] __udp4_lib_rcv+0x537/0xc40
[] ip_protocol_deliver_rcu+0xdf/0x1d0
[] ip_local_deliver_finish+0x4a/0x50
[] ip_local_deliver+0x6b/0xe0
[] ip_rcv+0x27b/0x36a
[] __netif_receive_skb_core+0xb47/0xc40
[] process_backlog+0xae/0x160
Commit 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
tried to fix this double free by not releasing the skbs in be->arrvq,
but that simply leaks the skbs instead.
The real problem is that the skbs in be->arrvq must not be processed
without a lock held all the way from peeking them to dequeuing them.
Fix this by using a temporary skb list instead of be->arrvq, making the
processing "per thread safe". While at it, remove the now-unused
be->arrvq.
Fixes: cb1b728096f5 ("tipc: eliminate race condition at multicast reception")
Fixes: 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/tipc/node.c | 9 ++++-----
net/tipc/socket.c | 16 +++-------------
2 files changed, 7 insertions(+), 18 deletions(-)
diff --git a/net/tipc/node.c b/net/tipc/node.c
index e0ee832..0c636fb 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -72,7 +72,6 @@ struct tipc_link_entry {
struct tipc_bclink_entry {
struct tipc_link *link;
struct sk_buff_head inputq1;
- struct sk_buff_head arrvq;
struct sk_buff_head inputq2;
struct sk_buff_head namedq;
u16 named_rcv_nxt;
@@ -552,7 +551,6 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
INIT_LIST_HEAD(&n->conn_sks);
skb_queue_head_init(&n->bc_entry.namedq);
skb_queue_head_init(&n->bc_entry.inputq1);
- __skb_queue_head_init(&n->bc_entry.arrvq);
skb_queue_head_init(&n->bc_entry.inputq2);
for (i = 0; i < MAX_BEARERS; i++)
spin_lock_init(&n->links[i].lock);
@@ -1803,14 +1801,15 @@ void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int rc_dests)
static void tipc_node_mcast_rcv(struct tipc_node *n)
{
struct tipc_bclink_entry *be = &n->bc_entry;
+ struct sk_buff_head tmpq;
- /* 'arrvq' is under inputq2's lock protection */
+ __skb_queue_head_init(&tmpq);
spin_lock_bh(&be->inputq2.lock);
spin_lock_bh(&be->inputq1.lock);
- skb_queue_splice_tail_init(&be->inputq1, &be->arrvq);
+ skb_queue_splice_tail_init(&be->inputq1, &tmpq);
spin_unlock_bh(&be->inputq1.lock);
spin_unlock_bh(&be->inputq2.lock);
- tipc_sk_mcast_rcv(n->net, &be->arrvq, &be->inputq2);
+ tipc_sk_mcast_rcv(n->net, &tmpq, &be->inputq2);
}
static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 022999e..2870798 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1210,8 +1210,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
__skb_queue_head_init(&tmpq);
INIT_LIST_HEAD(&dports);
- skb = tipc_skb_peek(arrvq, &inputq->lock);
- for (; skb; skb = tipc_skb_peek(arrvq, &inputq->lock)) {
Please remove function tipc_skb_peek() because it is no longer used.
+ while ((skb = __skb_dequeue(arrvq)) != NULL) {
hdr = buf_msg(skb);
user = msg_user(hdr);
mtyp = msg_type(hdr);
@@ -1220,13 +1219,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
type = msg_nametype(hdr);
if (mtyp == TIPC_GRP_UCAST_MSG || user == GROUP_PROTOCOL) {
- spin_lock_bh(&inputq->lock);
- if (skb_peek(arrvq) == skb) {
- __skb_dequeue(arrvq);
- __skb_queue_tail(inputq, skb);
- }
- kfree_skb(skb);
- spin_unlock_bh(&inputq->lock);
+ skb_queue_tail(inputq, skb);
continue;
}
@@ -1263,10 +1256,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
}
/* Append to inputq if not already done by other thread */
Please remove the above comment because the temporary queue is thread-safe.
spin_lock_bh(&inputq->lock);
- if (skb_peek(arrvq) == skb) {
- skb_queue_splice_tail_init(&tmpq, inputq);
- __skb_dequeue(arrvq);
- }
+ skb_queue_splice_tail_init(&tmpq, inputq);
spin_unlock_bh(&inputq->lock);
__skb_queue_purge(&tmpq);
kfree_skb(skb);
--
2.1.0
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
* Re: [PATCH net] tipc: fix a race in tipc_sk_mcast_rcv
2021-04-28 19:30 [PATCH net] tipc: fix a race in tipc_sk_mcast_rcv Xin Long
2021-04-29 2:18 ` [tipc-discussion] " Tung Quang Nguyen
@ 2021-05-13 21:15 ` Jon Maloy
1 sibling, 0 replies; 3+ messages in thread
From: Jon Maloy @ 2021-05-13 21:15 UTC (permalink / raw)
To: Xin Long, network dev, tipc-discussion; +Cc: davem, kuba, Ying Xue, lyl2019
On 4/28/21 3:30 PM, Xin Long wrote:
> After commit cb1b728096f5 ("tipc: eliminate race condition at multicast
> reception"), when processing multicast reception, the packets are first
> moved from be->inputq1 to be->arrvq in tipc_node_broadcast(), and
> be->arrvq is then processed in tipc_sk_mcast_rcv().
>
> In tipc_sk_mcast_rcv(), the first skb is fetched with skb_peek() and then
> processed without holding any lock. This means another thread could
> concurrently call tipc_sk_mcast_rcv(), process be->arrvq, pick up the
> same skb, and free it, causing a double free, as Li Shuang reported:
>
> [] kernel BUG at mm/slub.c:305!
> [] kfree+0x3a7/0x3d0
> [] kfree_skb+0x32/0xa0
> [] skb_release_data+0xb4/0x170
> [] kfree_skb+0x32/0xa0
> [] skb_release_data+0xb4/0x170
> [] kfree_skb+0x32/0xa0
> [] tipc_sk_mcast_rcv+0x1fa/0x380 [tipc]
> [] tipc_rcv+0x411/0x1120 [tipc]
> [] tipc_udp_recv+0xc6/0x1e0 [tipc]
> [] udp_queue_rcv_one_skb+0x1a9/0x500
> [] udp_unicast_rcv_skb.isra.66+0x75/0x90
> [] __udp4_lib_rcv+0x537/0xc40
> [] ip_protocol_deliver_rcu+0xdf/0x1d0
> [] ip_local_deliver_finish+0x4a/0x50
> [] ip_local_deliver+0x6b/0xe0
> [] ip_rcv+0x27b/0x36a
> [] __netif_receive_skb_core+0xb47/0xc40
> [] process_backlog+0xae/0x160
>
> Commit 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
> tried to fix this double free by not releasing the skbs in be->arrvq,
> but that simply leaks the skbs instead.
>
> The real problem is that the skbs in be->arrvq must not be processed
> without a lock held all the way from peeking them to dequeuing them.
> Fix this by using a temporary skb list instead of be->arrvq, making the
> processing "per thread safe". While at it, remove the now-unused
> be->arrvq.
>
> Fixes: cb1b728096f5 ("tipc: eliminate race condition at multicast reception")
> Fixes: 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
> Reported-by: Li Shuang <shuali@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
> net/tipc/node.c | 9 ++++-----
> net/tipc/socket.c | 16 +++-------------
> 2 files changed, 7 insertions(+), 18 deletions(-)
>
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index e0ee832..0c636fb 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -72,7 +72,6 @@ struct tipc_link_entry {
> struct tipc_bclink_entry {
> struct tipc_link *link;
> struct sk_buff_head inputq1;
> - struct sk_buff_head arrvq;
> struct sk_buff_head inputq2;
> struct sk_buff_head namedq;
> u16 named_rcv_nxt;
> @@ -552,7 +551,6 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
> INIT_LIST_HEAD(&n->conn_sks);
> skb_queue_head_init(&n->bc_entry.namedq);
> skb_queue_head_init(&n->bc_entry.inputq1);
> - __skb_queue_head_init(&n->bc_entry.arrvq);
> skb_queue_head_init(&n->bc_entry.inputq2);
> for (i = 0; i < MAX_BEARERS; i++)
> spin_lock_init(&n->links[i].lock);
> @@ -1803,14 +1801,15 @@ void tipc_node_broadcast(struct net *net, struct sk_buff *skb, int rc_dests)
> static void tipc_node_mcast_rcv(struct tipc_node *n)
> {
> struct tipc_bclink_entry *be = &n->bc_entry;
> + struct sk_buff_head tmpq;
>
> - /* 'arrvq' is under inputq2's lock protection */
> + __skb_queue_head_init(&tmpq);
> spin_lock_bh(&be->inputq2.lock);
> spin_lock_bh(&be->inputq1.lock);
> - skb_queue_splice_tail_init(&be->inputq1, &be->arrvq);
> + skb_queue_splice_tail_init(&be->inputq1, &tmpq);
> spin_unlock_bh(&be->inputq1.lock);
> spin_unlock_bh(&be->inputq2.lock);
> - tipc_sk_mcast_rcv(n->net, &be->arrvq, &be->inputq2);
> + tipc_sk_mcast_rcv(n->net, &tmpq, &be->inputq2);
> }
>
> static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> index 022999e..2870798 100644
> --- a/net/tipc/socket.c
> +++ b/net/tipc/socket.c
> @@ -1210,8 +1210,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
> __skb_queue_head_init(&tmpq);
> INIT_LIST_HEAD(&dports);
>
> - skb = tipc_skb_peek(arrvq, &inputq->lock);
> - for (; skb; skb = tipc_skb_peek(arrvq, &inputq->lock)) {
> + while ((skb = __skb_dequeue(arrvq)) != NULL) {
> hdr = buf_msg(skb);
> user = msg_user(hdr);
> mtyp = msg_type(hdr);
> @@ -1220,13 +1219,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
> type = msg_nametype(hdr);
>
> if (mtyp == TIPC_GRP_UCAST_MSG || user == GROUP_PROTOCOL) {
> - spin_lock_bh(&inputq->lock);
> - if (skb_peek(arrvq) == skb) {
> - __skb_dequeue(arrvq);
> - __skb_queue_tail(inputq, skb);
> - }
> - kfree_skb(skb);
> - spin_unlock_bh(&inputq->lock);
> + skb_queue_tail(inputq, skb);
> continue;
> }
>
> @@ -1263,10 +1256,7 @@ void tipc_sk_mcast_rcv(struct net *net, struct sk_buff_head *arrvq,
> }
> /* Append to inputq if not already done by other thread */
> spin_lock_bh(&inputq->lock);
> - if (skb_peek(arrvq) == skb) {
> - skb_queue_splice_tail_init(&tmpq, inputq);
> - __skb_dequeue(arrvq);
> - }
> + skb_queue_splice_tail_init(&tmpq, inputq);
> spin_unlock_bh(&inputq->lock);
> __skb_queue_purge(&tmpq);
> kfree_skb(skb);
Nack.
This would invalidate the sequence guarantee of messages between two
specific sockets.
The whole point of having a lock protected arrival queue is to preserve
the order when messages are moved from inputq1 to inputq2.
Let's take the discussion to our mailing list.
///jon