* [PATCH bpf-next v2 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge
@ 2022-03-02  2:27 Wang Yufen
  2022-03-02  2:27 ` [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Wang Yufen @ 2022-03-02  2:27 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb
  Cc: davem, edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, bpf, Wang Yufen

This patchset fixes memory leaks and incorrect memory charge/uncharge
accounting. These issues cause the following warnings:

WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
Call Trace:
 <IRQ>
 inet_csk_destroy_sock+0x55/0x110
 tcp_rcv_state_process+0xe5f/0xe90
 ? sk_filter_trim_cap+0x10d/0x230
 ? tcp_v4_do_rcv+0x161/0x250
 tcp_v4_do_rcv+0x161/0x250
 tcp_v4_rcv+0xc3a/0xce0
 ip_protocol_deliver_rcu+0x3d/0x230
 ip_local_deliver_finish+0x54/0x60
 ip_local_deliver+0xfd/0x110
 ? ip_protocol_deliver_rcu+0x230/0x230
 ip_rcv+0xd6/0x100
 ? ip_local_deliver+0x110/0x110
 __netif_receive_skb_one_core+0x85/0xa0
 process_backlog+0xa4/0x160
 __napi_poll+0x29/0x1b0
 net_rx_action+0x287/0x300
 __do_softirq+0xff/0x2fc
 do_softirq+0x79/0x90
 </IRQ>

WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Changes since v1:
-Update the commit message of patch #2: the error path comes from
ENOMEM, not ENOSPC.
-Simply return an error code when psock is NULL, as John Fastabend
suggested.

Wang Yufen (4):
  bpf, sockmap: Fix memleak in sk_psock_queue_msg
  bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  bpf, sockmap: Fix more uncharged while msg has more_data
  bpf, sockmap: Fix double uncharge the mem of sk_msg

 include/linux/skmsg.h | 13 ++++---------
 net/ipv4/tcp_bpf.c    | 18 +++++++++++-------
 2 files changed, 15 insertions(+), 16 deletions(-)

-- 
2.25.1



* [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg
  2022-03-02  2:27 [PATCH bpf-next v2 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
@ 2022-03-02  2:27 ` Wang Yufen
  2022-03-03  0:41   ` Cong Wang
  2022-03-02  2:27 ` [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Wang Yufen @ 2022-03-02  2:27 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb
  Cc: davem, edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, bpf, Wang Yufen

If tcp_bpf_sendmsg() is running during a teardown operation, we may
enqueue data on the ingress msg queue while teardown is trying to free
it.

 sk1 (redirect sk2)                         sk2
 -------------------                      ---------------
tcp_bpf_sendmsg()
 tcp_bpf_send_verdict()
  tcp_bpf_sendmsg_redir()
   bpf_tcp_ingress()
                                          sock_map_close()
                                           lock_sock()
    lock_sock() ... blocking
                                           sk_psock_stop
                                            sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
                                           release_sock(sk);
    lock_sock()
    sk_mem_charge()
    get_page()
    sk_psock_queue_msg()
     sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED);
      drop_sk_msg()
    release_sock()

By the time drop_sk_msg() runs, the msg has memory charged from the sk
by sk_mem_charge() and holds sg pages that need to be put. To fix this,
use sk_msg_free() and then kfree() the msg.

This issue can cause the following info:
WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
Call Trace:
 <IRQ>
 inet_csk_destroy_sock+0x55/0x110
 tcp_rcv_state_process+0xe5f/0xe90
 ? sk_filter_trim_cap+0x10d/0x230
 ? tcp_v4_do_rcv+0x161/0x250
 tcp_v4_do_rcv+0x161/0x250
 tcp_v4_rcv+0xc3a/0xce0
 ip_protocol_deliver_rcu+0x3d/0x230
 ip_local_deliver_finish+0x54/0x60
 ip_local_deliver+0xfd/0x110
 ? ip_protocol_deliver_rcu+0x230/0x230
 ip_rcv+0xd6/0x100
 ? ip_local_deliver+0x110/0x110
 __netif_receive_skb_one_core+0x85/0xa0
 process_backlog+0xa4/0x160
 __napi_poll+0x29/0x1b0
 net_rx_action+0x287/0x300
 __do_softirq+0xff/0x2fc
 do_softirq+0x79/0x90
 </IRQ>

WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 9635720b7c88 ("bpf, sockmap: Fix memleak on ingress msg enqueue")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/skmsg.h | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index fdb5375f0562..c5a2d6f50f25 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -304,21 +304,16 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
 	kfree_skb(skb);
 }
 
-static inline void drop_sk_msg(struct sk_psock *psock, struct sk_msg *msg)
-{
-	if (msg->skb)
-		sock_drop(psock->sk, msg->skb);
-	kfree(msg);
-}
-
 static inline void sk_psock_queue_msg(struct sk_psock *psock,
 				      struct sk_msg *msg)
 {
 	spin_lock_bh(&psock->ingress_lock);
 	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
 		list_add_tail(&msg->list, &psock->ingress_msg);
-	else
-		drop_sk_msg(psock, msg);
+	else {
+		sk_msg_free(psock->sk, msg);
+		kfree(msg);
+	}
 	spin_unlock_bh(&psock->ingress_lock);
 }
 
-- 
2.25.1



* [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  2022-03-02  2:27 [PATCH bpf-next v2 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
  2022-03-02  2:27 ` [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
@ 2022-03-02  2:27 ` Wang Yufen
  2022-03-03  0:48   ` Cong Wang
  2022-03-02  2:27 ` [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
  2022-03-02  2:27 ` [PATCH bpf-next v2 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
  3 siblings, 1 reply; 11+ messages in thread
From: Wang Yufen @ 2022-03-02  2:27 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb
  Cc: davem, edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, bpf, Wang Yufen

If tcp_bpf_sendmsg() is running while the sk msg is full and
sk_msg_alloc() returns an -ENOMEM error, tcp_bpf_sendmsg() goes to
wait_for_memory. If partial memory has been allocated by sk_msg_alloc(),
that is, msg_tx->sg.size is greater than osize after sk_msg_alloc(), a
memleak occurs. To fix this, use sk_msg_trim() to release the allocated
memory, then goto wait_for_memory.

This issue can cause the following info:
WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 kthread+0xe6/0x110
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 net/ipv4/tcp_bpf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 9b9b02052fd3..ac9f491cc139 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -421,8 +421,10 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 		osize = msg_tx->sg.size;
 		err = sk_msg_alloc(sk, msg_tx, msg_tx->sg.size + copy, msg_tx->sg.end - 1);
 		if (err) {
-			if (err != -ENOSPC)
+			if (err != -ENOSPC) {
+				sk_msg_trim(sk, msg_tx, osize);
 				goto wait_for_memory;
+			}
 			enospc = true;
 			copy = msg_tx->sg.size - osize;
 		}
-- 
2.25.1



* [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data
  2022-03-02  2:27 [PATCH bpf-next v2 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
  2022-03-02  2:27 ` [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
  2022-03-02  2:27 ` [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
@ 2022-03-02  2:27 ` Wang Yufen
  2022-03-03  6:14   ` Cong Wang
  2022-03-02  2:27 ` [PATCH bpf-next v2 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
  3 siblings, 1 reply; 11+ messages in thread
From: Wang Yufen @ 2022-03-02  2:27 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb
  Cc: davem, edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, bpf, Wang Yufen

In tcp_bpf_send_verdict(), if msg has more data after
tcp_bpf_sendmsg_redir():

tcp_bpf_send_verdict()
 tosend = msg->sg.size  //msg->sg.size = 22220
 case __SK_REDIRECT:
  sk_msg_return()  //uncharge msg->sg.size(22220) to sk->sk_forward_alloc
  tcp_bpf_sendmsg_redir() //after tcp_bpf_sendmsg_redir, msg->sg.size = 11000
 goto more_data;
 tosend = msg->sg.size  //msg->sg.size = 11000
 case __SK_REDIRECT:
  sk_msg_return()  //uncharge msg->sg.size(11000) to sk->sk_forward_alloc

msg->sg.size (11000) has thus been uncharged twice. To fix this, charge
the remaining msg->sg.size back before goto more_data.

This issue can cause the following info:
WARNING: CPU: 0 PID: 9860 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 ? vfs_write+0x237/0x290
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 net/ipv4/tcp_bpf.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ac9f491cc139..1f0364e06619 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -335,7 +335,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 			cork = true;
 			psock->cork = NULL;
 		}
-		sk_msg_return(sk, msg, tosend);
+		sk_msg_return(sk, msg, msg->sg.size);
 		release_sock(sk);
 
 		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
@@ -375,8 +375,11 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 		}
 		if (msg &&
 		    msg->sg.data[msg->sg.start].page_link &&
-		    msg->sg.data[msg->sg.start].length)
+		    msg->sg.data[msg->sg.start].length) {
+			if (eval == __SK_REDIRECT)
+				sk_mem_charge(sk, msg->sg.size);
 			goto more_data;
+		}
 	}
 	return ret;
 }
-- 
2.25.1



* [PATCH bpf-next v2 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg
  2022-03-02  2:27 [PATCH bpf-next v2 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
                   ` (2 preceding siblings ...)
  2022-03-02  2:27 ` [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
@ 2022-03-02  2:27 ` Wang Yufen
  3 siblings, 0 replies; 11+ messages in thread
From: Wang Yufen @ 2022-03-02  2:27 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb
  Cc: davem, edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, bpf, Wang Yufen

If tcp_bpf_sendmsg() is running during a teardown operation, psock may
be freed.

tcp_bpf_sendmsg()
 tcp_bpf_send_verdict()
  sk_msg_return()
  tcp_bpf_sendmsg_redir()
   unlikely(!psock))
     sk_msg_free()

The memory of the msg has already been uncharged in
tcp_bpf_send_verdict() by sk_msg_return(), and would be uncharged again
by sk_msg_free(). When psock is NULL, we can simply return an error
code; this triggers sk_msg_free_nocharge() in the error path of
__SK_REDIRECT and has the side effect of throwing an error up to user
space. This is a slight change in behavior from the user side, but it
looks the same as an error from a failed redirect on the socket.

This issue can cause the following info:
WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
---
 net/ipv4/tcp_bpf.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 1f0364e06619..7ed1e6a8e30a 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -138,10 +138,9 @@ int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg,
 	struct sk_psock *psock = sk_psock_get(sk);
 	int ret;
 
-	if (unlikely(!psock)) {
-		sk_msg_free(sk, msg);
-		return 0;
-	}
+	if (unlikely(!psock))
+		return -EPIPE;
+
 	ret = ingress ? bpf_tcp_ingress(sk, psock, msg, bytes, flags) :
 			tcp_bpf_push_locked(sk, msg, bytes, flags, false);
 	sk_psock_put(sk, psock);
-- 
2.25.1



* Re: [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg
  2022-03-02  2:27 ` [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
@ 2022-03-03  0:41   ` Cong Wang
  2022-03-04  6:45     ` wangyufen
  0 siblings, 1 reply; 11+ messages in thread
From: Cong Wang @ 2022-03-03  0:41 UTC (permalink / raw)
  To: Wang Yufen
  Cc: john.fastabend, daniel, jakub, lmb, davem, edumazet, yoshfuji,
	dsahern, kuba, ast, andrii, kafai, songliubraving, yhs, kpsingh,
	netdev, bpf

On Wed, Mar 02, 2022 at 10:27:52AM +0800, Wang Yufen wrote:
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index fdb5375f0562..c5a2d6f50f25 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -304,21 +304,16 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
>  	kfree_skb(skb);
>  }
>  
> -static inline void drop_sk_msg(struct sk_psock *psock, struct sk_msg *msg)
> -{
> -	if (msg->skb)
> -		sock_drop(psock->sk, msg->skb);
> -	kfree(msg);
> -}
> -
>  static inline void sk_psock_queue_msg(struct sk_psock *psock,
>  				      struct sk_msg *msg)
>  {
>  	spin_lock_bh(&psock->ingress_lock);
>  	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
>  		list_add_tail(&msg->list, &psock->ingress_msg);
> -	else
> -		drop_sk_msg(psock, msg);
> +	else {
> +		sk_msg_free(psock->sk, msg);

__sk_msg_free() calls sk_msg_init() at the end.

> +		kfree(msg);

Now you free it, hence the above sk_msg_init() is completely
unnecessary.

Thanks.


* Re: [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  2022-03-02  2:27 ` [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
@ 2022-03-03  0:48   ` Cong Wang
  2022-03-04  6:51     ` wangyufen
  0 siblings, 1 reply; 11+ messages in thread
From: Cong Wang @ 2022-03-03  0:48 UTC (permalink / raw)
  To: Wang Yufen
  Cc: john.fastabend, daniel, jakub, lmb, davem, edumazet, yoshfuji,
	dsahern, kuba, ast, andrii, kafai, songliubraving, yhs, kpsingh,
	netdev, bpf

On Wed, Mar 02, 2022 at 10:27:53AM +0800, Wang Yufen wrote:
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 9b9b02052fd3..ac9f491cc139 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -421,8 +421,10 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>  		osize = msg_tx->sg.size;
>  		err = sk_msg_alloc(sk, msg_tx, msg_tx->sg.size + copy, msg_tx->sg.end - 1);
>  		if (err) {
> -			if (err != -ENOSPC)
> +			if (err != -ENOSPC) {
> +				sk_msg_trim(sk, msg_tx, osize);
>  				goto wait_for_memory;

Is it a good idea to handle this logic inside sk_msg_alloc()?



* Re: [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data
  2022-03-02  2:27 ` [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
@ 2022-03-03  6:14   ` Cong Wang
  2022-03-04  7:08     ` wangyufen
  0 siblings, 1 reply; 11+ messages in thread
From: Cong Wang @ 2022-03-03  6:14 UTC (permalink / raw)
  To: Wang Yufen
  Cc: john.fastabend, daniel, jakub, lmb, davem, edumazet, yoshfuji,
	dsahern, kuba, ast, andrii, kafai, songliubraving, yhs, kpsingh,
	netdev, bpf

On Wed, Mar 02, 2022 at 10:27:54AM +0800, Wang Yufen wrote:
> In tcp_bpf_send_verdict(), if msg has more data after
> tcp_bpf_sendmsg_redir():
> 
> tcp_bpf_send_verdict()
>  tosend = msg->sg.size  //msg->sg.size = 22220
>  case __SK_REDIRECT:
>   sk_msg_return()  //uncharged msg->sg.size(22220) sk->sk_forward_alloc
>   tcp_bpf_sendmsg_redir() //after tcp_bpf_sendmsg_redir, msg->sg.size=11000
>  goto more_data;
>  tosend = msg->sg.size  //msg->sg.size = 11000
>  case __SK_REDIRECT:
>   sk_msg_return()  //uncharged msg->sg.size(11000) to sk->sk_forward_alloc
> 
> The msg->sg.size(11000) has been uncharged twice, to fix we can charge the
> remaining msg->sg.size before goto more data.

It looks like bpf_exec_tx_verdict() has the same issue.



* Re: [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg
  2022-03-03  0:41   ` Cong Wang
@ 2022-03-04  6:45     ` wangyufen
  0 siblings, 0 replies; 11+ messages in thread
From: wangyufen @ 2022-03-04  6:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: john.fastabend, daniel, jakub, lmb, davem, edumazet, yoshfuji,
	dsahern, kuba, ast, andrii, kafai, songliubraving, yhs, kpsingh,
	netdev, bpf


在 2022/3/3 8:41, Cong Wang 写道:
> On Wed, Mar 02, 2022 at 10:27:52AM +0800, Wang Yufen wrote:
>> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
>> index fdb5375f0562..c5a2d6f50f25 100644
>> --- a/include/linux/skmsg.h
>> +++ b/include/linux/skmsg.h
>> @@ -304,21 +304,16 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
>>   	kfree_skb(skb);
>>   }
>>   
>> -static inline void drop_sk_msg(struct sk_psock *psock, struct sk_msg *msg)
>> -{
>> -	if (msg->skb)
>> -		sock_drop(psock->sk, msg->skb);
>> -	kfree(msg);
>> -}
>> -
>>   static inline void sk_psock_queue_msg(struct sk_psock *psock,
>>   				      struct sk_msg *msg)
>>   {
>>   	spin_lock_bh(&psock->ingress_lock);
>>   	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
>>   		list_add_tail(&msg->list, &psock->ingress_msg);
>> -	else
>> -		drop_sk_msg(psock, msg);
>> +	else {
>> +		sk_msg_free(psock->sk, msg);
> __sk_msg_free() calls sk_msg_init() at the end.
>
>> +		kfree(msg);
> Now you free it, hence the above sk_msg_init() is completely
> unnecessary.

kfree() does not always follow an invocation of sk_msg_free(); that is,
the sk_msg needs to be reused in some cases. We could implement an
sk_msg_free_xx() variant without sk_msg_init(), but I don't think that
is necessary.


Thanks

>
> Thanks.
> .


* Re: [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  2022-03-03  0:48   ` Cong Wang
@ 2022-03-04  6:51     ` wangyufen
  0 siblings, 0 replies; 11+ messages in thread
From: wangyufen @ 2022-03-04  6:51 UTC (permalink / raw)
  To: Cong Wang
  Cc: john.fastabend, daniel, jakub, lmb, davem, edumazet, yoshfuji,
	dsahern, kuba, ast, andrii, kafai, songliubraving, yhs, kpsingh,
	netdev, bpf


在 2022/3/3 8:48, Cong Wang 写道:
> On Wed, Mar 02, 2022 at 10:27:53AM +0800, Wang Yufen wrote:
>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>> index 9b9b02052fd3..ac9f491cc139 100644
>> --- a/net/ipv4/tcp_bpf.c
>> +++ b/net/ipv4/tcp_bpf.c
>> @@ -421,8 +421,10 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>>   		osize = msg_tx->sg.size;
>>   		err = sk_msg_alloc(sk, msg_tx, msg_tx->sg.size + copy, msg_tx->sg.end - 1);
>>   		if (err) {
>> -			if (err != -ENOSPC)
>> +			if (err != -ENOSPC) {
>> +				sk_msg_trim(sk, msg_tx, osize);
>>   				goto wait_for_memory;
> Is it a good idea to handle this logic inside sk_msg_alloc()?

Yes, I think you're right. Other call paths of sk_msg_alloc() have a
similar problem, such as tls_sw_sendmsg(); will do in v3.


Thanks.

> .


* Re: [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data
  2022-03-03  6:14   ` Cong Wang
@ 2022-03-04  7:08     ` wangyufen
  0 siblings, 0 replies; 11+ messages in thread
From: wangyufen @ 2022-03-04  7:08 UTC (permalink / raw)
  To: Cong Wang
  Cc: john.fastabend, daniel, jakub, lmb, davem, edumazet, yoshfuji,
	dsahern, kuba, ast, andrii, kafai, songliubraving, yhs, kpsingh,
	netdev, bpf


在 2022/3/3 14:14, Cong Wang 写道:
> On Wed, Mar 02, 2022 at 10:27:54AM +0800, Wang Yufen wrote:
>> In tcp_bpf_send_verdict(), if msg has more data after
>> tcp_bpf_sendmsg_redir():
>>
>> tcp_bpf_send_verdict()
>>   tosend = msg->sg.size  //msg->sg.size = 22220
>>   case __SK_REDIRECT:
>>    sk_msg_return()  //uncharged msg->sg.size(22220) sk->sk_forward_alloc
>>    tcp_bpf_sendmsg_redir() //after tcp_bpf_sendmsg_redir, msg->sg.size=11000
>>   goto more_data;
>>   tosend = msg->sg.size  //msg->sg.size = 11000
>>   case __SK_REDIRECT:
>>    sk_msg_return()  //uncharged msg->sg.size(11000) to sk->sk_forward_alloc
>>
>> The msg->sg.size(11000) has been uncharged twice, to fix we can charge the
>> remaining msg->sg.size before goto more data.
> It looks like bpf_exec_tx_verdict() has the same issue.
>
> .

In bpf_exec_tx_verdict(), for case __SK_REDIRECT, msg_redir is used and
msg->sg.size is deducted in advance, so this issue (extra uncharge) does
not exist there.

However, if msg_redir processing cannot be completed, that is, msg_redir
has more data and there is no subsequent processing, that may be a
separate problem.


Thanks.



end of thread, other threads:[~2022-03-04  7:13 UTC | newest]

Thread overview: 11+ messages
-- links below jump to the message on this page --
2022-03-02  2:27 [PATCH bpf-next v2 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
2022-03-02  2:27 ` [PATCH bpf-next v2 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
2022-03-03  0:41   ` Cong Wang
2022-03-04  6:45     ` wangyufen
2022-03-02  2:27 ` [PATCH bpf-next v2 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
2022-03-03  0:48   ` Cong Wang
2022-03-04  6:51     ` wangyufen
2022-03-02  2:27 ` [PATCH bpf-next v2 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
2022-03-03  6:14   ` Cong Wang
2022-03-04  7:08     ` wangyufen
2022-03-02  2:27 ` [PATCH bpf-next v2 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
