bpf.vger.kernel.org archive mirror
* [PATCH v2 bpf 0/2] potential sockmap memleak and proc stats fix
@ 2021-07-02  0:11 John Fastabend
  2021-07-02  0:11 ` [PATCH v2 bpf 1/2] bpf, sockmap: fix potential msg memory leak John Fastabend
  2021-07-02  0:11 ` [PATCH v2 bpf 2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats John Fastabend
  0 siblings, 2 replies; 7+ messages in thread
From: John Fastabend @ 2021-07-02  0:11 UTC (permalink / raw)
  To: ast, daniel, andriin; +Cc: bpf, netdev, john.fastabend

While investigating a memleak in sockmap I found these two issues. Patch
1 was found during code review; I wasn't able to get KASAN to trigger a
memleak here, but the fix should still be necessary. Patch 2 fixes proc
stats so that when we use sockstat for debugging we get correct values.

The fix for the observed memleak will come after these, but it requires
some more discussion and potentially a patch revert, so I'll try to get
this set going now.

John Fastabend (2):
  bpf, sockmap: fix potential memory leak on unlikely error case
  bpf, sockmap: sk_prot needs inuse_idx set for proc stats

 net/core/skmsg.c    | 4 +++-
 net/core/sock_map.c | 9 +++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

-- 
2.25.1



* [PATCH v2 bpf 1/2] bpf, sockmap: fix potential msg memory leak
  2021-07-02  0:11 [PATCH v2 bpf 0/2] potential sockmap memleak and proc stats fix John Fastabend
@ 2021-07-02  0:11 ` John Fastabend
  2021-07-02 19:54   ` Cong Wang
  2021-07-02  0:11 ` [PATCH v2 bpf 2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats John Fastabend
  1 sibling, 1 reply; 7+ messages in thread
From: John Fastabend @ 2021-07-02  0:11 UTC (permalink / raw)
  To: ast, daniel, andriin; +Cc: bpf, netdev, john.fastabend

If skb_linearize() is needed and fails, we could leak the msg on the
error path. To fix this, kfree() the msg before returning the error.
Found during code review.

Fixes: 4363023d2668e ("bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/skmsg.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 9b6160a191f8..22603289c2b2 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -505,8 +505,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
 	 * drop the skb. We need to linearize the skb so that the mapping
 	 * in skb_to_sgvec can not error.
 	 */
-	if (skb_linearize(skb))
+	if (skb_linearize(skb)) {
+		kfree(msg);
 		return -EAGAIN;
+	}
 	num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len);
 	if (unlikely(num_sge < 0)) {
 		kfree(msg);
-- 
2.25.1



* [PATCH v2 bpf 2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats
  2021-07-02  0:11 [PATCH v2 bpf 0/2] potential sockmap memleak and proc stats fix John Fastabend
  2021-07-02  0:11 ` [PATCH v2 bpf 1/2] bpf, sockmap: fix potential msg memory leak John Fastabend
@ 2021-07-02  0:11 ` John Fastabend
  2021-07-02 19:50   ` Cong Wang
  1 sibling, 1 reply; 7+ messages in thread
From: John Fastabend @ 2021-07-02  0:11 UTC (permalink / raw)
  To: ast, daniel, andriin; +Cc: bpf, netdev, john.fastabend

Proc socket stats use the sk_prot->inuse_idx value to record in-use sock
stats. We currently do not set this correctly from the sockmap side. As a
result, reading '/proc/net/sockstat' gives incorrect values. The socket
counter is incremented correctly, but because we don't set inuse_idx
correctly when we replace sk_prot, we may omit the decrement.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/sock_map.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 60decd6420ca..016ea5460f8f 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -222,6 +222,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	struct bpf_prog *msg_parser = NULL;
 	struct sk_psock *psock;
 	int ret;
+#ifdef CONFIG_PROC_FS
+	int idx;
+#endif
 
 	/* Only sockets we can redirect into/from in BPF need to hold
 	 * refs to parser/verdict progs and have their sk_data_ready
@@ -293,9 +296,15 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	if (msg_parser)
 		psock_set_prog(&psock->progs.msg_parser, msg_parser);
 
+#ifdef CONFIG_PROC_FS
+	idx = sk->sk_prot->inuse_idx;
+#endif
 	ret = sock_map_init_proto(sk, psock);
 	if (ret < 0)
 		goto out_drop;
+#ifdef CONFIG_PROC_FS
+	sk->sk_prot->inuse_idx = idx;
+#endif
 
 	write_lock_bh(&sk->sk_callback_lock);
 	if (stream_parser && stream_verdict && !psock->saved_data_ready) {
-- 
2.25.1



* Re: [PATCH v2 bpf 2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats
  2021-07-02  0:11 ` [PATCH v2 bpf 2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats John Fastabend
@ 2021-07-02 19:50   ` Cong Wang
  2021-07-05 16:28     ` John Fastabend
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2021-07-02 19:50 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

On Thu, Jul 1, 2021 at 5:12 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Proc socket stats use the sk_prot->inuse_idx value to record in-use sock
> stats. We currently do not set this correctly from the sockmap side. As a
> result, reading '/proc/net/sockstat' gives incorrect values. The socket
> counter is incremented correctly, but because we don't set inuse_idx
> correctly when we replace sk_prot, we may omit the decrement.
>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/core/sock_map.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 60decd6420ca..016ea5460f8f 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -222,6 +222,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
>         struct bpf_prog *msg_parser = NULL;
>         struct sk_psock *psock;
>         int ret;
> +#ifdef CONFIG_PROC_FS
> +       int idx;
> +#endif
>
>         /* Only sockets we can redirect into/from in BPF need to hold
>          * refs to parser/verdict progs and have their sk_data_ready
> @@ -293,9 +296,15 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
>         if (msg_parser)
>                 psock_set_prog(&psock->progs.msg_parser, msg_parser);
>
> +#ifdef CONFIG_PROC_FS
> +       idx = sk->sk_prot->inuse_idx;
> +#endif
>         ret = sock_map_init_proto(sk, psock);
>         if (ret < 0)
>                 goto out_drop;
> +#ifdef CONFIG_PROC_FS
> +       sk->sk_prot->inuse_idx = idx;
> +#endif

I think it is better to put these into sock_map_init_proto()
so that sock_map_link() does not need to worry about the sk_prot
details.

Thanks.
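
The suggestion above can be sketched in a small userspace model. This is
only an illustration, not the kernel code: struct proto_model,
struct sock_model, inuse_model[], inuse_add() and init_proto_model() are
hypothetical stand-ins invented for this sketch. It shows the save and
restore of inuse_idx living inside the helper that swaps the proto, so
that the increment and the decrement land on the same counter slot.

```c
#include <assert.h>

#define NR_PROTO_MODEL 4

/* Hypothetical stand-in for struct proto: only the field that matters here. */
struct proto_model {
	int inuse_idx;
};

/* Hypothetical stand-in for struct sock. */
struct sock_model {
	struct proto_model *prot;
};

/* Models the per-protocol in-use counters behind /proc/net/sockstat. */
static int inuse_model[NR_PROTO_MODEL];

/* Adjusts the counter slot selected by the socket's current proto. */
static void inuse_add(const struct sock_model *sk, int val)
{
	assert(sk->prot->inuse_idx >= 0 &&
	       sk->prot->inuse_idx < NR_PROTO_MODEL);
	inuse_model[sk->prot->inuse_idx] += val;
}

/*
 * Models the proto swap with the save/restore done inside the helper,
 * as the review suggests, so the caller never touches inuse_idx.
 */
static int init_proto_model(struct sock_model *sk, struct proto_model *bpf_prot)
{
	int idx = sk->prot->inuse_idx;	/* save before the swap */

	sk->prot = bpf_prot;
	sk->prot->inuse_idx = idx;	/* restore after the swap */
	return 0;
}

/* Returns slot 1 after one socket is created, linked and closed. */
static int run_example(void)
{
	struct proto_model tcp = { .inuse_idx = 1 };
	struct proto_model tcp_bpf = { .inuse_idx = 0 }; /* default in this model */
	struct sock_model sk = { .prot = &tcp };

	inuse_add(&sk, 1);			/* socket created */
	init_proto_model(&sk, &tcp_bpf);	/* sockmap replaces sk_prot */
	inuse_add(&sk, -1);			/* socket closed */
	return inuse_model[1];
}
```

Without the restore in init_proto_model(), the decrement in this model
would hit slot 0 instead of slot 1 and the count for slot 1 would stay
at 1, which is the kind of drift the patch description refers to.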


* Re: [PATCH v2 bpf 1/2] bpf, sockmap: fix potential msg memory leak
  2021-07-02  0:11 ` [PATCH v2 bpf 1/2] bpf, sockmap: fix potential msg memory leak John Fastabend
@ 2021-07-02 19:54   ` Cong Wang
  2021-07-05 16:27     ` John Fastabend
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2021-07-02 19:54 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

On Thu, Jul 1, 2021 at 5:12 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> If skb_linearize() is needed and fails, we could leak the msg on the
> error path. To fix this, kfree() the msg before returning the error.
> Found during code review.
>
> Fixes: 4363023d2668e ("bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/core/skmsg.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 9b6160a191f8..22603289c2b2 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -505,8 +505,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
>          * drop the skb. We need to linearize the skb so that the mapping
>          * in skb_to_sgvec can not error.
>          */
> -       if (skb_linearize(skb))
> +       if (skb_linearize(skb)) {
> +               kfree(msg);
>                 return -EAGAIN;
> +       }
>         num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len);
>         if (unlikely(num_sge < 0)) {
>                 kfree(msg);

I think it is better to let whoever allocates msg free it, IOW,
let sk_psock_skb_ingress_enqueue()'s callers handle its failure.

Thanks.
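
The ownership rule proposed above (whoever allocates msg frees it) can be
sketched in userspace as follows. This is a hedged illustration only:
struct msg_model, live_msgs, msg_alloc(), msg_free(), enqueue_model() and
ingress_model() are hypothetical names made up for this sketch and do not
mirror the real functions in net/core/skmsg.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_msg. */
struct msg_model {
	int len;
};

/* Counts msgs that are allocated but not yet freed, to make leaks visible. */
static int live_msgs;

static struct msg_model *msg_alloc(void)
{
	struct msg_model *msg = calloc(1, sizeof(*msg));

	if (msg)
		live_msgs++;
	return msg;
}

static void msg_free(struct msg_model *msg)
{
	free(msg);
	live_msgs--;
}

/*
 * Models the enqueue step. It never frees msg; on failure it only
 * reports the error. linearize_ok == 0 models skb_linearize() failing.
 */
static int enqueue_model(struct msg_model *msg, int linearize_ok, int len)
{
	assert(msg != NULL);
	if (!linearize_ok)
		return -EAGAIN;
	msg->len = len;
	return len;
}

/* Models a caller: it allocates msg, so it also frees msg on error. */
static int ingress_model(int linearize_ok, int len)
{
	struct msg_model *msg = msg_alloc();
	int err;

	if (!msg)
		return -ENOMEM;
	err = enqueue_model(msg, linearize_ok, len);
	if (err < 0) {
		msg_free(msg);	/* single cleanup point for every failure */
		return err;
	}
	/* In the kernel the msg would now be queued; the model just drops it. */
	msg_free(msg);
	return err;
}
```

With this split, a new error return added to the enqueue step cannot
leak msg, because the callee never owns it; the cost is that every
caller has to check the return value and free.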


* Re: [PATCH v2 bpf 1/2] bpf, sockmap: fix potential msg memory leak
  2021-07-02 19:54   ` Cong Wang
@ 2021-07-05 16:27     ` John Fastabend
  0 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2021-07-05 16:27 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

Cong Wang wrote:
> On Thu, Jul 1, 2021 at 5:12 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > If skb_linearize() is needed and fails, we could leak the msg on the
> > error path. To fix this, kfree() the msg before returning the error.
> > Found during code review.
> >
> > Fixes: 4363023d2668e ("bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list")
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> >  net/core/skmsg.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index 9b6160a191f8..22603289c2b2 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -505,8 +505,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
> >          * drop the skb. We need to linearize the skb so that the mapping
> >          * in skb_to_sgvec can not error.
> >          */
> > -       if (skb_linearize(skb))
> > +       if (skb_linearize(skb)) {
> > +               kfree(msg);
> >                 return -EAGAIN;
> > +       }
> >         num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len);
> >         if (unlikely(num_sge < 0)) {
> >                 kfree(msg);
> 
> I think it is better to let whoever allocates msg free it, IOW,
> let sk_psock_skb_ingress_enqueue()'s callers handle its failure.

Sure, although we already have the one kfree(msg) below. Anyway, I'll
move the frees back into the callers. Agree it is slightly nicer.

Thanks.


* Re: [PATCH v2 bpf 2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats
  2021-07-02 19:50   ` Cong Wang
@ 2021-07-05 16:28     ` John Fastabend
  0 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2021-07-05 16:28 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

Cong Wang wrote:
> On Thu, Jul 1, 2021 at 5:12 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Proc socket stats use the sk_prot->inuse_idx value to record in-use sock
> > stats. We currently do not set this correctly from the sockmap side. As a
> > result, reading '/proc/net/sockstat' gives incorrect values. The socket
> > counter is incremented correctly, but because we don't set inuse_idx
> > correctly when we replace sk_prot, we may omit the decrement.
> >
> > Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> >  net/core/sock_map.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index 60decd6420ca..016ea5460f8f 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -222,6 +222,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
> >         struct bpf_prog *msg_parser = NULL;
> >         struct sk_psock *psock;
> >         int ret;
> > +#ifdef CONFIG_PROC_FS
> > +       int idx;
> > +#endif
> >
> >         /* Only sockets we can redirect into/from in BPF need to hold
> >          * refs to parser/verdict progs and have their sk_data_ready
> > @@ -293,9 +296,15 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
> >         if (msg_parser)
> >                 psock_set_prog(&psock->progs.msg_parser, msg_parser);
> >
> > +#ifdef CONFIG_PROC_FS
> > +       idx = sk->sk_prot->inuse_idx;
> > +#endif
> >         ret = sock_map_init_proto(sk, psock);
> >         if (ret < 0)
> >                 goto out_drop;
> > +#ifdef CONFIG_PROC_FS
> > +       sk->sk_prot->inuse_idx = idx;
> > +#endif
> 
> I think it is better to put these into sock_map_init_proto()
> so that sock_map_link() does not need to worry about the sk_prot
> details.
> 
> Thanks.

Sure, that is fine.
