* [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
@ 2020-10-27  3:24 Tung Nguyen
  2020-10-27 20:50 ` Cong Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Tung Nguyen @ 2020-10-27  3:24 UTC (permalink / raw)
  To: davem, netdev, tipc-discussion

Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
replaced skb_unshare() with skb_copy() so that the data reference
counter of the original skb is intentionally not reduced. This is not
the correct way to handle the cloned skb because it causes a memory
leak in the following 2 cases:
 1/ Sending multicast messages via broadcast link
  The original skb list is cloned to the local skb list for local
  destination. After that, the data reference counter of each skb
  in the original list has the value of 2. This causes each skb not
  to be freed after receiving ACK:
  tipc_link_advance_transmq()
  {
   ...
   /* release skb */
   __skb_unlink(skb, &l->transmq);
   kfree_skb(skb); <-- memory exists after being freed
  }

 2/ Sending multicast messages via replicast link
  Similar to the above case, each skb cannot be freed after purging
  the skb list:
  tipc_mcast_xmit()
  {
   ...
   __skb_queue_purge(pkts); <-- memory exists after being freed
  }

This commit fixes the issue by using skb_unshare() instead. In addition,
to avoid a use-after-free error reported by KASAN, the pointer to the
fragment is set to NULL before calling skb_unshare(), so that, in case
skb_unshare() returns NULL, the fragment is not freed a second time
(which would in turn free the original skb).

Fixes: ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Reported-by: Thang Hoang Ngo <thang.h.ngo@dektech.com.au>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
---
 net/tipc/msg.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index 2a78aa701572..32c79c59052b 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -150,12 +150,11 @@ int tipc_buf_append(struct sk_buff **headbuf, struct sk_buff **buf)
 	if (fragid == FIRST_FRAGMENT) {
 		if (unlikely(head))
 			goto err;
-		if (skb_cloned(frag))
-			frag = skb_copy(frag, GFP_ATOMIC);
+		*buf = NULL;
+		frag = skb_unshare(frag, GFP_ATOMIC);
 		if (unlikely(!frag))
 			goto err;
 		head = *headbuf = frag;
-		*buf = NULL;
 		TIPC_SKB_CB(head)->tail = NULL;
 		if (skb_is_nonlinear(head)) {
 			skb_walk_frags(head, tail) {
-- 
2.17.1
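
The reordering in the hunk above only matters on the failure path. Below is a minimal userspace sketch of that path. It assumes, as the changelog describes, that skb_unshare() has already freed the fragment by the time it returns NULL, and that the error path then frees whatever *buf still points to. The names toy_buf, toy_kfree(), toy_unshare_fail() and frees_on_failure() are hypothetical, and frees are only counted, never performed:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an skb: it only counts how often it was "freed". */
struct toy_buf {
	int free_count;
};

/* Models kfree_skb(): a NULL pointer is ignored, otherwise count the free. */
static void toy_kfree(struct toy_buf *b)
{
	if (b)
		b->free_count++;
}

/* Models a failing skb_unshare() on a cloned skb: the input is freed
 * and NULL is returned (assumption taken from the changelog).
 */
static struct toy_buf *toy_unshare_fail(struct toy_buf *b)
{
	toy_kfree(b);
	return NULL;
}

/* Returns how many times the fragment is freed when unsharing fails.
 * clear_first != 0 models the patched order (*buf = NULL first).
 */
static int frees_on_failure(int clear_first)
{
	struct toy_buf storage = { 0 };
	struct toy_buf *buf = &storage;   /* plays the role of *buf */
	struct toy_buf *frag = buf;

	if (clear_first)
		buf = NULL;

	frag = toy_unshare_fail(frag);
	if (!frag)
		toy_kfree(buf);           /* error path frees *buf */

	return storage.free_count;
}
```

Calling frees_on_failure(0) models the old order, where the stale *buf is freed again on the error path. frees_on_failure(1) models the patched order, where the error path sees NULL and the fragment is freed exactly once.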



* Re: [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
  2020-10-27  3:24 [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append() Tung Nguyen
@ 2020-10-27 20:50 ` Cong Wang
  2020-10-28  5:23   ` Tung Quang Nguyen
  2020-10-28  5:21 ` Xin Long
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2020-10-27 20:50 UTC (permalink / raw)
  To: Tung Nguyen
  Cc: David Miller, Linux Kernel Network Developers, tipc-discussion

On Tue, Oct 27, 2020 at 1:09 PM Tung Nguyen
<tung.q.nguyen@dektech.com.au> wrote:
>
> Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> replaced skb_unshare() with skb_copy() to not reduce the data reference
> counter of the original skb intentionally. This is not the correct
> way to handle the cloned skb because it causes memory leak in 2
> following cases:
>  1/ Sending multicast messages via broadcast link
>   The original skb list is cloned to the local skb list for local
>   destination. After that, the data reference counter of each skb
>   in the original list has the value of 2. This causes each skb not
>   to be freed after receiving ACK:

This does not make sense at all.

skb_unclone() expects refcnt == 1, as stated in the comments
above pskb_expand_head(). skb_unclone() was used prior to
Xin Long's commit.

So either the above is wrong, or something important is still missing
in your changelog. Neither is addressed in your V3.

I also asked you two questions before you sent V3, and you seem to
have intentionally ignored them. This is not how we collaborate.

Thanks.


* Re: [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
  2020-10-27  3:24 [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append() Tung Nguyen
  2020-10-27 20:50 ` Cong Wang
@ 2020-10-28  5:21 ` Xin Long
  2020-10-28 19:31 ` Cong Wang
  2020-10-29 16:55 ` Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: Xin Long @ 2020-10-28  5:21 UTC (permalink / raw)
  To: Tung Nguyen; +Cc: davem, network dev, tipc-discussion

On Tue, Oct 27, 2020 at 11:25 AM Tung Nguyen
<tung.q.nguyen@dektech.com.au> wrote:
>
> Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> replaced skb_unshare() with skb_copy() to not reduce the data reference
> counter of the original skb intentionally. This is not the correct
> way to handle the cloned skb because it causes memory leak in 2
> following cases:
>  1/ Sending multicast messages via broadcast link
>   The original skb list is cloned to the local skb list for local
>   destination. After that, the data reference counter of each skb
>   in the original list has the value of 2. This causes each skb not
>   to be freed after receiving ACK:
>   tipc_link_advance_transmq()
>   {
>    ...
>    /* release skb */
>    __skb_unlink(skb, &l->transmq);
>    kfree_skb(skb); <-- memory exists after being freed
>   }
>
>  2/ Sending multicast messages via replicast link
>   Similar to the above case, each skb cannot be freed after purging
>   the skb list:
>   tipc_mcast_xmit()
>   {
>    ...
>    __skb_queue_purge(pkts); <-- memory exists after being freed
>   }
>
> This commit fixes this issue by using skb_unshare() instead. Besides,
> to avoid use-after-free error reported by KASAN, the pointer to the
> fragment is set to NULL before calling skb_unshare() to make sure that
> the original skb is not freed after freeing the fragment 2 times in
> case skb_unshare() returns NULL.
>
> Fixes: ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> Acked-by: Jon Maloy <jmaloy@redhat.com>
> Reported-by: Thang Hoang Ngo <thang.h.ngo@dektech.com.au>
> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Reviewed-by: Xin Long <lucien.xin@gmail.com>

> ---
>  net/tipc/msg.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/tipc/msg.c b/net/tipc/msg.c
> index 2a78aa701572..32c79c59052b 100644
> --- a/net/tipc/msg.c
> +++ b/net/tipc/msg.c
> @@ -150,12 +150,11 @@ int tipc_buf_append(struct sk_buff **headbuf, struct sk_buff **buf)
>         if (fragid == FIRST_FRAGMENT) {
>                 if (unlikely(head))
>                         goto err;
> -               if (skb_cloned(frag))
> -                       frag = skb_copy(frag, GFP_ATOMIC);
> +               *buf = NULL;
> +               frag = skb_unshare(frag, GFP_ATOMIC);
>                 if (unlikely(!frag))
>                         goto err;
>                 head = *headbuf = frag;
> -               *buf = NULL;
>                 TIPC_SKB_CB(head)->tail = NULL;
>                 if (skb_is_nonlinear(head)) {
>                         skb_walk_frags(head, tail) {
> --
> 2.17.1
>
>
>


* RE: [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
  2020-10-27 20:50 ` Cong Wang
@ 2020-10-28  5:23   ` Tung Quang Nguyen
  2020-10-28 19:29     ` Cong Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Tung Quang Nguyen @ 2020-10-28  5:23 UTC (permalink / raw)
  To: Cong Wang, tipc-discussion, Xin Long, David Miller,
	Linux Kernel Network Developers

Hi Cong,

No, I have never ignored any comment from reviewers. I sent v2 on Oct 26 after discussing with Xin Long, and v3 on Oct 27 after receiving a comment from Jakub.
I received your 3 emails at nearly the same time on Oct 28, which is odd. Your emails did not appear in this email archive either: https://sourceforge.net/p/tipc/mailman/tipc-discussion/

Anyway, here are my answers to your questions:
1/ Why it is not correct to leave the data reference counter undecremented in tipc_buf_append()
In my changelog, I only pinpointed the place where the leak would happen. Here are the details:
tipc_msg_reassemble(list,-)
{
 ...
 frag = skb_clone(skb, GFP_ATOMIC); // the data reference counter of each original skb now has the value of 2.
 ...
 if (tipc_buf_append(&head, &frag)) // the data reference counter of each original skb STILL has the value of 2 because skb_copy() is used instead of skb_unshare()
 ...
}
The original skb list then is passed to tipc_bcast_xmit() which in turn calls tipc_link_xmit():
tipc_link_xmit(-, list, -)
{
 ...
 _skb = skb_clone(skb, GFP_ATOMIC); // the data reference counter of each original skb now has the value of 3.
 ...
}

When each cloned skb is sent out by the driver, it is freed by the driver. That brings the data reference counter of each original skb back down to 2.
After receiving the ACK from the peer, the original skb needs to be freed:
tipc_link_advance_transmq()
{
 ...
 kfree_skb(skb);  // the data is not released here because the data reference counter still has the value of 2.
}

This indeed causes a memory leak.

2/ Why the previously used skb_unclone() works.
The purpose of skb_unclone() is to unclone the cloned skb. So it does not make sense to say that "skb_unclone() expects refcnt == 1" if, as I understand it,
you meant the data reference counter.
pskb_expand_head() inside skb_unclone() requires that the user reference counter has the value of 1, as the implementation shows:
pskb_expand_head()
{
 ...
 BUG_ON(skb_shared(skb)); // User reference counter must be 1
 ...
 atomic_set(&skb_shinfo(skb)->dataref, 1); // The data reference counter of the original skb has the value of 1
 ...
}
That explains why, after being passed to tipc_link_xmit(), the data reference counter of each original skb has the value of 2, and why each skb can be freed in tipc_link_advance_transmq().
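
The counter walk-through above can be reproduced with a small userspace model. This is a minimal sketch, not kernel code: toy_skb, toy_data and the toy_*() helpers are hypothetical stand-ins. toy_unshare() assumes skb_unshare() copies a cloned skb and then frees the clone, while toy_copy() leaves its source untouched. dataref_at_ack() returns the counter value seen at the point where tipc_link_advance_transmq() would call kfree_skb():

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins: "data" plays the role of the shared data area and
 * "dataref" the role of skb_shinfo(skb)->dataref.
 */
struct toy_data { int dataref; };
struct toy_skb  { struct toy_data *data; };

static struct toy_skb *toy_alloc(void)
{
	struct toy_skb *skb = malloc(sizeof(*skb));
	struct toy_data *data = malloc(sizeof(*data));

	if (!skb || !data)
		abort();
	data->dataref = 1;
	skb->data = data;
	return skb;
}

/* Clone: new header, same data, dataref incremented. */
static struct toy_skb *toy_clone(struct toy_skb *skb)
{
	struct toy_skb *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->data = skb->data;
	n->data->dataref++;
	return n;
}

/* Copy: private header and data; the source's dataref is untouched. */
static struct toy_skb *toy_copy(const struct toy_skb *skb)
{
	(void)skb;
	return toy_alloc();
}

/* Free: drop one data reference, release the data only at zero. */
static void toy_free(struct toy_skb *skb)
{
	if (--skb->data->dataref == 0)
		free(skb->data);
	free(skb);
}

/* Unshare: if the data is shared, copy it and free the clone. */
static struct toy_skb *toy_unshare(struct toy_skb *skb)
{
	struct toy_skb *n;

	if (skb->data->dataref == 1)
		return skb;
	n = toy_copy(skb);
	toy_free(skb);
	return n;
}

static int dataref_at_ack(int use_unshare)
{
	struct toy_skb *orig = toy_alloc();           /* dataref 1 */
	struct toy_skb *frag = toy_clone(orig);       /* reassemble: 2 */
	struct toy_skb *head = use_unshare ? toy_unshare(frag)
					   : toy_copy(frag);
	struct toy_skb *xmit_clone = toy_clone(orig); /* link xmit */
	int at_ack;

	toy_free(xmit_clone);                         /* driver frees it */
	at_ack = orig->data->dataref;

	/* Cleanup so the demo itself does not leak. */
	toy_free(orig);
	if (!use_unshare)
		toy_free(frag);                       /* the leaked clone */
	toy_free(head);
	return at_ack;
}
```

With the skb_copy() behaviour, dataref_at_ack(0) yields 2, so the final free leaves the data allocated. With the skb_unshare() behaviour, dataref_at_ack(1) yields 1, so the final free releases it.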

Best regards,
Tung Nguyen



* Re: [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
  2020-10-28  5:23   ` Tung Quang Nguyen
@ 2020-10-28 19:29     ` Cong Wang
  0 siblings, 0 replies; 7+ messages in thread
From: Cong Wang @ 2020-10-28 19:29 UTC (permalink / raw)
  To: Tung Quang Nguyen
  Cc: tipc-discussion, Xin Long, David Miller, Linux Kernel Network Developers

On Tue, Oct 27, 2020 at 10:23 PM Tung Quang Nguyen
<tung.q.nguyen@dektech.com.au> wrote:
>
> Hi Cong,
>
> No, I have never ignored any comment from reviewers. I sent v2 on Oct 26 after discussing with Xin Long, and v3 on Oct 27 after receiving comment from Jakub.
> I received your 3 emails nearly at the same time on Oct 28. It's weird. Your emails did not appear in this email archive either: https://sourceforge.net/p/tipc/mailman/tipc-discussion/
>
> Anyway, I answer your questions:

Oh, I just realized you meant shinfo->dataref, not skb->users...
Then it makes sense now.
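
The two counters being distinguished here can be pictured with a small model. This is a hedged sketch only: the field names mirror skb->users and shinfo->dataref, but toy_skb, toy_shinfo and the toy_*() helpers are hypothetical and do not reflect the real sk_buff layout. It assumes that "shared" refers to the header's user count and "cloned" to the data area's reference count:

```c
#include <assert.h>
#include <stdbool.h>

/* Data area shared by all clones of one buffer. */
struct toy_shinfo {
	int dataref;
};

/* One buffer header. */
struct toy_skb {
	int users;                  /* holders of this header */
	bool cloned;                /* data may be shared with another header */
	struct toy_shinfo *shinfo;  /* shared data area */
};

/* Idea behind skb_shared(): more than one holder of this header. */
static bool toy_skb_shared(const struct toy_skb *skb)
{
	return skb->users != 1;
}

/* Idea behind skb_cloned(): more than one header on the same data. */
static bool toy_skb_cloned(const struct toy_skb *skb)
{
	return skb->cloned && skb->shinfo->dataref != 1;
}

/* Clone: a new header with its own users count, same data, dataref bumped. */
static struct toy_skb toy_skb_clone(struct toy_skb *skb)
{
	struct toy_skb n = { .users = 1, .cloned = true, .shinfo = skb->shinfo };

	skb->cloned = true;
	skb->shinfo->dataref++;
	return n;
}
```

In this model, cloning a buffer raises dataref to 2 while each header's users stays at 1, so a buffer can be cloned without being shared.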

Thanks.


* Re: [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
  2020-10-27  3:24 [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append() Tung Nguyen
  2020-10-27 20:50 ` Cong Wang
  2020-10-28  5:21 ` Xin Long
@ 2020-10-28 19:31 ` Cong Wang
  2020-10-29 16:55 ` Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: Cong Wang @ 2020-10-28 19:31 UTC (permalink / raw)
  To: Tung Nguyen
  Cc: David Miller, Linux Kernel Network Developers, tipc-discussion

On Tue, Oct 27, 2020 at 1:09 PM Tung Nguyen
<tung.q.nguyen@dektech.com.au> wrote:
>
> Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> replaced skb_unshare() with skb_copy() to not reduce the data reference
> counter of the original skb intentionally. This is not the correct

More precisely, it is shinfo->dataref.


> way to handle the cloned skb because it causes memory leak in 2
> following cases:
>  1/ Sending multicast messages via broadcast link
>   The original skb list is cloned to the local skb list for local
>   destination. After that, the data reference counter of each skb
>   in the original list has the value of 2. This causes each skb not
>   to be freed after receiving ACK:
>   tipc_link_advance_transmq()
>   {
>    ...
>    /* release skb */
>    __skb_unlink(skb, &l->transmq);
>    kfree_skb(skb); <-- memory exists after being freed
>   }
>
>  2/ Sending multicast messages via replicast link
>   Similar to the above case, each skb cannot be freed after purging
>   the skb list:
>   tipc_mcast_xmit()
>   {
>    ...
>    __skb_queue_purge(pkts); <-- memory exists after being freed
>   }
>
> This commit fixes this issue by using skb_unshare() instead. Besides,
> to avoid use-after-free error reported by KASAN, the pointer to the
> fragment is set to NULL before calling skb_unshare() to make sure that
> the original skb is not freed after freeing the fragment 2 times in
> case skb_unshare() returns NULL.
>
> Fixes: ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> Acked-by: Jon Maloy <jmaloy@redhat.com>
> Reported-by: Thang Hoang Ngo <thang.h.ngo@dektech.com.au>
> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>


* Re: [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append()
  2020-10-27  3:24 [tipc-discussion] [net v3 1/1] tipc: fix memory leak caused by tipc_buf_append() Tung Nguyen
                   ` (2 preceding siblings ...)
  2020-10-28 19:31 ` Cong Wang
@ 2020-10-29 16:55 ` Jakub Kicinski
  3 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-10-29 16:55 UTC (permalink / raw)
  To: Tung Nguyen; +Cc: davem, netdev, tipc-discussion

On Tue, 27 Oct 2020 10:24:03 +0700 Tung Nguyen wrote:
> Commit ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> replaced skb_unshare() with skb_copy() to not reduce the data reference
> counter of the original skb intentionally. This is not the correct
> way to handle the cloned skb because it causes memory leak in 2
> following cases:
>  1/ Sending multicast messages via broadcast link
>   The original skb list is cloned to the local skb list for local
>   destination. After that, the data reference counter of each skb
>   in the original list has the value of 2. This causes each skb not
>   to be freed after receiving ACK:
>   tipc_link_advance_transmq()
>   {
>    ...
>    /* release skb */
>    __skb_unlink(skb, &l->transmq);
>    kfree_skb(skb); <-- memory exists after being freed
>   }
> 
>  2/ Sending multicast messages via replicast link
>   Similar to the above case, each skb cannot be freed after purging
>   the skb list:
>   tipc_mcast_xmit()
>   {
>    ...
>    __skb_queue_purge(pkts); <-- memory exists after being freed
>   }
> 
> This commit fixes this issue by using skb_unshare() instead. Besides,
> to avoid use-after-free error reported by KASAN, the pointer to the
> fragment is set to NULL before calling skb_unshare() to make sure that
> the original skb is not freed after freeing the fragment 2 times in
> case skb_unshare() returns NULL.
> 
> Fixes: ed42989eab57 ("tipc: fix the skb_unshare() in tipc_buf_append()")
> Acked-by: Jon Maloy <jmaloy@redhat.com>
> Reported-by: Thang Hoang Ngo <thang.h.ngo@dektech.com.au>
> Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>

Applied, queued for all the stables.

Thanks everyone!


