bpf.vger.kernel.org archive mirror
* [PATCH 0/2] sockmap, fix for some error paths with helpers
@ 2020-05-04 17:21 John Fastabend
  2020-05-04 17:21 ` [PATCH 1/2] bpf: sockmap, msg_pop_data can incorrecty set an sge length John Fastabend
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: John Fastabend @ 2020-05-04 17:21 UTC (permalink / raw)
  To: jakub, daniel; +Cc: netdev, bpf, john.fastabend, ast

In these two cases sk_msg layout was getting confused with some helper
sequences.

I found these while cleaning up test_sockmap to do a better job covering
the different scenarios. Those patches will go to bpf-next and include
tests that cover these two cases.

---

John Fastabend (2):
      bpf: sockmap, msg_pop_data can incorrecty set an sge length
      bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size


 include/linux/skmsg.h |    1 +
 net/core/filter.c     |    2 +-
 net/ipv4/tcp_bpf.c    |    1 -
 3 files changed, 2 insertions(+), 2 deletions(-)

--
Signature


* [PATCH 1/2] bpf: sockmap, msg_pop_data can incorrecty set an sge length
  2020-05-04 17:21 [PATCH 0/2] sockmap, fix for some error paths with helpers John Fastabend
@ 2020-05-04 17:21 ` John Fastabend
  2020-05-05 18:59   ` Martin KaFai Lau
  2020-05-04 17:21 ` [PATCH 2/2] bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size John Fastabend
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2020-05-04 17:21 UTC (permalink / raw)
  To: jakub, daniel; +Cc: netdev, bpf, john.fastabend, ast

When sk_msg_pop() is called and the pop region sits at the end of an
sge element, with no trailing data after it but with data in front of
it, as in the following case,


   |____________a_____________|__pop__|

We have out of order operations: sge->length is set to a before pop is
reduced by (sge->length - a), so the subtraction is always zero and
instead of zero'ing pop we effectively leave it untouched. This can
cause later logic to shift the buffers around believing it should pop
extra space. The result is that we 'pop' more data than we expected,
potentially breaking program logic.

It took us a while to hit this case because we typically pop headers,
which are rarely at the end of a scatterlist element, but we can't
rely on this.
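
The ordering problem described above can be reproduced outside the
kernel. The following is a minimal userspace sketch, not the kernel
code: struct toy_sge and both pop_tail_*() functions are hypothetical
names made up for illustration, and only the two statements inside the
branch mirror the lines touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist element; only length matters. */
struct toy_sge {
	uint32_t length;
};

/*
 * Old ordering: length is overwritten with a first, so the following
 * (length - a) is always 0 and pop is returned unchanged.
 */
uint32_t pop_tail_buggy(struct toy_sge *sge, uint32_t a, uint32_t pop)
{
	assert(a <= sge->length);
	if (pop >= sge->length - a) {
		sge->length = a;
		pop -= (sge->length - a);	/* subtracts 0 */
	}
	return pop;
}

/*
 * New ordering: consume the tail bytes from pop first, then trim the
 * element down to the a bytes that sit in front of the pop region.
 */
uint32_t pop_tail_fixed(struct toy_sge *sge, uint32_t a, uint32_t pop)
{
	assert(a <= sge->length);
	if (pop >= sge->length - a) {
		pop -= (sge->length - a);
		sge->length = a;
	}
	return pop;
}
```

With a 100 byte element, a = 80 and pop = 20, both variants trim the
element to 80 bytes, but only the fixed ordering returns 0 bytes left
to pop; the old ordering still reports 20.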

Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/filter.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 7d6ceaa..5cc9276 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2590,8 +2590,8 @@ BPF_CALL_4(bpf_msg_pop_data, struct sk_msg *, msg, u32, start,
 			}
 			pop = 0;
 		} else if (pop >= sge->length - a) {
-			sge->length = a;
 			pop -= (sge->length - a);
+			sge->length = a;
 		}
 	}
 



* [PATCH 2/2] bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size
  2020-05-04 17:21 [PATCH 0/2] sockmap, fix for some error paths with helpers John Fastabend
  2020-05-04 17:21 ` [PATCH 1/2] bpf: sockmap, msg_pop_data can incorrecty set an sge length John Fastabend
@ 2020-05-04 17:21 ` John Fastabend
  2020-05-05 19:05   ` Martin KaFai Lau
  2020-05-05  9:30 ` [PATCH 0/2] sockmap, fix for some error paths with helpers Jakub Sitnicki
  2020-05-05 22:28 ` Daniel Borkmann
  3 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2020-05-04 17:21 UTC (permalink / raw)
  To: jakub, daniel; +Cc: netdev, bpf, john.fastabend, ast

In bpf_tcp_ingress we used apply_bytes to subtract bytes from sg.size,
which is used to track total bytes in a message. But this is not
correct because apply_bytes is itself modified in the main loop doing
the mem_charge.

At the end of this, sg.size is incorrectly set and out of sync with
the actual sk values. We can then get a splat if we try to cork the
data later and again try to redirect the msg to ingress. To fix this,
instead of trying to track sg.size separately, do the easy thing and
update it as part of the sk_msg_xfer logic, so that when the msg is
moved sg.size is always correct.

To reproduce the splat below, users will need ingress + cork and must
hit an error path that then tries to 'free' the skmsg.
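
As a rough illustration of why the accounting belongs in the transfer
step, here is a small userspace model. It is only a sketch: struct
toy_msg, toy_msg_xfer() and toy_msg_sum() are hypothetical names, and
the model keeps just the size, offset and length fields that the
sk_msg_xfer hunk in this patch touches.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_ELEMS 4

/* Hypothetical, trimmed-down element and message types. */
struct toy_elem {
	uint32_t offset;
	uint32_t length;
};

struct toy_msg {
	uint32_t size;				/* total bytes tracked */
	struct toy_elem data[TOY_MAX_ELEMS];
};

/*
 * Move size bytes of element which from src to dst. Both totals are
 * adjusted here, so no caller has to patch src->size up afterwards.
 */
void toy_msg_xfer(struct toy_msg *dst, struct toy_msg *src,
		  int which, uint32_t size)
{
	assert(which >= 0 && which < TOY_MAX_ELEMS);
	assert(size <= src->data[which].length);

	dst->data[which] = src->data[which];
	dst->data[which].length = size;
	dst->size += size;
	src->size -= size;
	src->data[which].length -= size;
	src->data[which].offset += size;
}

/* Sum of element lengths; should always equal msg->size. */
uint32_t toy_msg_sum(const struct toy_msg *msg)
{
	uint32_t sum = 0;

	for (int i = 0; i < TOY_MAX_ELEMS; i++)
		sum += msg->data[i].length;
	return sum;
}
```

In this model the invariant toy_msg_sum(msg) == msg->size holds for
both messages after every transfer, regardless of how a caller's own
byte counter changes while it loops over the elements.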

[  173.699981] BUG: KASAN: null-ptr-deref in sk_msg_free_elem+0xdd/0x120
[  173.699987] Read of size 8 at addr 0000000000000008 by task test_sockmap/5317

[  173.700000] CPU: 2 PID: 5317 Comm: test_sockmap Tainted: G          I       5.7.0-rc1+ #43
[  173.700005] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[  173.700009] Call Trace:
[  173.700021]  dump_stack+0x8e/0xcb
[  173.700029]  ? sk_msg_free_elem+0xdd/0x120
[  173.700034]  ? sk_msg_free_elem+0xdd/0x120
[  173.700042]  __kasan_report+0x102/0x15f
[  173.700052]  ? sk_msg_free_elem+0xdd/0x120
[  173.700060]  kasan_report+0x32/0x50
[  173.700070]  sk_msg_free_elem+0xdd/0x120
[  173.700080]  __sk_msg_free+0x87/0x150
[  173.700094]  tcp_bpf_send_verdict+0x179/0x4f0
[  173.700109]  tcp_bpf_sendpage+0x3ce/0x5d0

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/skmsg.h |    1 +
 net/ipv4/tcp_bpf.c    |    1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 8a709f6..ad31c9f 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -187,6 +187,7 @@ static inline void sk_msg_xfer(struct sk_msg *dst, struct sk_msg *src,
 	dst->sg.data[which] = src->sg.data[which];
 	dst->sg.data[which].length  = size;
 	dst->sg.size		   += size;
+	src->sg.size		   -= size;
 	src->sg.data[which].length -= size;
 	src->sg.data[which].offset += size;
 }
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ff96466..629aaa9a 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -125,7 +125,6 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 
 	if (!ret) {
 		msg->sg.start = i;
-		msg->sg.size -= apply_bytes;
 		sk_psock_queue_msg(psock, tmp);
 		sk_psock_data_ready(sk, psock);
 	} else {



* Re: [PATCH 0/2] sockmap, fix for some error paths with helpers
  2020-05-04 17:21 [PATCH 0/2] sockmap, fix for some error paths with helpers John Fastabend
  2020-05-04 17:21 ` [PATCH 1/2] bpf: sockmap, msg_pop_data can incorrecty set an sge length John Fastabend
  2020-05-04 17:21 ` [PATCH 2/2] bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size John Fastabend
@ 2020-05-05  9:30 ` Jakub Sitnicki
  2020-05-05 22:28 ` Daniel Borkmann
  3 siblings, 0 replies; 7+ messages in thread
From: Jakub Sitnicki @ 2020-05-05  9:30 UTC (permalink / raw)
  To: John Fastabend; +Cc: daniel, netdev, bpf, ast

On Mon, May 04, 2020 at 07:21 PM CEST, John Fastabend wrote:
> In these two cases sk_msg layout was getting confused with some helper
> sequences.
>
> I found these while cleaning up test_sockmap to do a better job covering
> the different scenarios. Those patches will go to bpf-next and include
> tests that cover these two cases.
>
> ---
>
> John Fastabend (2):
>       bpf: sockmap, msg_pop_data can incorrecty set an sge length
>       bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size

Both of these LGTM. Looking forward to revamped test_sockmap.

For the series:

Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>


* Re: [PATCH 1/2] bpf: sockmap, msg_pop_data can incorrecty set an sge length
  2020-05-04 17:21 ` [PATCH 1/2] bpf: sockmap, msg_pop_data can incorrecty set an sge length John Fastabend
@ 2020-05-05 18:59   ` Martin KaFai Lau
  0 siblings, 0 replies; 7+ messages in thread
From: Martin KaFai Lau @ 2020-05-05 18:59 UTC (permalink / raw)
  To: John Fastabend; +Cc: jakub, daniel, netdev, bpf, ast

On Mon, May 04, 2020 at 10:21:23AM -0700, John Fastabend wrote:
> When sk_msg_pop() is called and the pop region sits at the end of an
> sge element, with no trailing data after it but with data in front of
> it, as in the following case,
> 
> 
>    |____________a_____________|__pop__|
> 
> We have out of order operations: sge->length is set to a before pop is
> reduced by (sge->length - a), so the subtraction is always zero and
> instead of zero'ing pop we effectively leave it untouched. This can
> cause later logic to shift the buffers around believing it should pop
> extra space. The result is that we 'pop' more data than we expected,
> potentially breaking program logic.
> 
> It took us a while to hit this case because we typically pop headers,
> which are rarely at the end of a scatterlist element, but we can't
> rely on this.
> 
> Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>


* Re: [PATCH 2/2] bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size
  2020-05-04 17:21 ` [PATCH 2/2] bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size John Fastabend
@ 2020-05-05 19:05   ` Martin KaFai Lau
  0 siblings, 0 replies; 7+ messages in thread
From: Martin KaFai Lau @ 2020-05-05 19:05 UTC (permalink / raw)
  To: John Fastabend; +Cc: jakub, daniel, netdev, bpf, ast

On Mon, May 04, 2020 at 10:21:44AM -0700, John Fastabend wrote:
> In bpf_tcp_ingress we used apply_bytes to subtract bytes from sg.size,
> which is used to track total bytes in a message. But this is not
> correct because apply_bytes is itself modified in the main loop doing
> the mem_charge.
> 
> At the end of this, sg.size is incorrectly set and out of sync with
> the actual sk values. We can then get a splat if we try to cork the
> data later and again try to redirect the msg to ingress. To fix this,
> instead of trying to track sg.size separately, do the easy thing and
> update it as part of the sk_msg_xfer logic, so that when the msg is
> moved sg.size is always correct.
> 
> To reproduce the splat below, users will need ingress + cork and must
> hit an error path that then tries to 'free' the skmsg.
> 
> [  173.699981] BUG: KASAN: null-ptr-deref in sk_msg_free_elem+0xdd/0x120
> [  173.699987] Read of size 8 at addr 0000000000000008 by task test_sockmap/5317
> 
> [  173.700000] CPU: 2 PID: 5317 Comm: test_sockmap Tainted: G          I       5.7.0-rc1+ #43
> [  173.700005] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
> [  173.700009] Call Trace:
> [  173.700021]  dump_stack+0x8e/0xcb
> [  173.700029]  ? sk_msg_free_elem+0xdd/0x120
> [  173.700034]  ? sk_msg_free_elem+0xdd/0x120
> [  173.700042]  __kasan_report+0x102/0x15f
> [  173.700052]  ? sk_msg_free_elem+0xdd/0x120
> [  173.700060]  kasan_report+0x32/0x50
> [  173.700070]  sk_msg_free_elem+0xdd/0x120
> [  173.700080]  __sk_msg_free+0x87/0x150
> [  173.700094]  tcp_bpf_send_verdict+0x179/0x4f0
> [  173.700109]  tcp_bpf_sendpage+0x3ce/0x5d0
> 
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>


* Re: [PATCH 0/2] sockmap, fix for some error paths with helpers
  2020-05-04 17:21 [PATCH 0/2] sockmap, fix for some error paths with helpers John Fastabend
                   ` (2 preceding siblings ...)
  2020-05-05  9:30 ` [PATCH 0/2] sockmap, fix for some error paths with helpers Jakub Sitnicki
@ 2020-05-05 22:28 ` Daniel Borkmann
  3 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2020-05-05 22:28 UTC (permalink / raw)
  To: John Fastabend, jakub; +Cc: netdev, bpf, ast

On 5/4/20 7:21 PM, John Fastabend wrote:
> In these two cases sk_msg layout was getting confused with some helper
> sequences.
> 
> I found these while cleaning up test_sockmap to do a better job covering
> the different scenarios. Those patches will go to bpf-next and include
> tests that cover these two cases.
> 
> ---
> 
> John Fastabend (2):
>        bpf: sockmap, msg_pop_data can incorrecty set an sge length
>        bpf: sockmap, bpf_tcp_ingress needs to subtract bytes from sg.size
> 
> 
>   include/linux/skmsg.h |    1 +
>   net/core/filter.c     |    2 +-
>   net/ipv4/tcp_bpf.c    |    1 -
>   3 files changed, 2 insertions(+), 2 deletions(-)
> 
> --
> Signature
> 

Applied to bpf, thanks!

