netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/3] add size field to sk_msg_md
@ 2018-12-16 23:47 John Fastabend
  2018-12-16 23:47 ` [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg John Fastabend
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: John Fastabend @ 2018-12-16 23:47 UTC (permalink / raw)
  To: daniel, ast; +Cc: netdev, john.fastabend

This adds a size field to the sk_msg_md data structure used by SK_MSG
programs. Without this in the zerocopy case and in the copy case
where multiple iovs are in use its difficult to know how much data
can be pulled in. The normal method of reading data and data_end
only give the current contiguous buffer. BPF programs can attempt to
pull in extra data but have to guess if it exists. This can result
in multiple "guesses" its much better if we know upfront the size
of the sk_msg.

John Fastabend (3):
  bpf: sockmap, metadata support for reporting length of message
  bpf: add tools lib/include support sk_msg_md size field
  bpf: sk_msg, add tests for size field

 include/linux/skmsg.h                       |  3 +++
 include/uapi/linux/bpf.h                    |  1 +
 net/core/filter.c                           |  6 ++++++
 tools/include/uapi/linux/bpf.h              |  1 +
 tools/testing/selftests/bpf/test_verifier.c | 16 +++++++++++++---
 5 files changed, 24 insertions(+), 3 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg
  2018-12-16 23:47 [PATCH bpf-next 0/3] add size field to sk_msg_md John Fastabend
@ 2018-12-16 23:47 ` John Fastabend
  2018-12-20  2:49   ` Alexei Starovoitov
  2018-12-16 23:47 ` [PATCH bpf-next 2/3] bpf: add tools lib/include support sk_msg_md size field John Fastabend
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: John Fastabend @ 2018-12-16 23:47 UTC (permalink / raw)
  To: daniel, ast; +Cc: netdev, john.fastabend

This adds metadata to sk_msg_md for BPF programs to read the sk_msg
size.

When the SK_MSG program is running under an application that is using
sendfile the data is not copied into sk_msg buffers by default. Rather
the BPF program uses sk_msg_pull_data to read the bytes in. This
avoids doing the costly memcopy instructions when they are not in
fact needed. However, if we don't know the size of the sk_msg we
have to guess if needed bytes are available by doing a pull request
which may fail. By including the size of the sk_msg BPF programs can
check the size before issuing sk_msg_pull_data requests.

Additionally, the same applies for sendmsg calls when the application
provides multiple iovs. Here the BPF program needs to pull in data
to update data pointers but its not clear where the data ends without
a size parameter. In many cases "guessing" is not easy to do
and results in multiple calls to pull and without bounded loops
everything gets fairly tricky.

Clean this up by including a u32 size field. Note, all writes into
sk_msg_md are rejected already from sk_msg_is_valid_access so nothing
additional is needed there.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/skmsg.h    | 3 +++
 include/uapi/linux/bpf.h | 1 +
 net/core/filter.c        | 6 ++++++
 3 files changed, 10 insertions(+)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 2a11e9d..eb8f6cb 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -36,6 +36,9 @@ struct sk_msg_sg {
 	struct scatterlist		data[MAX_MSG_FRAGS + 1];
 };
 
+/* UAPI in filter.c depends on struct sk_msg_sg being first element. If
+ * this is moved filter.c also must be updated.
+ */
 struct sk_msg {
 	struct sk_msg_sg		sg;
 	void				*data;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 597afdb..498badc 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2608,6 +2608,7 @@ struct sk_msg_md {
 	__u32 local_ip6[4];	/* Stored in network byte order */
 	__u32 remote_port;	/* Stored in network byte order */
 	__u32 local_port;	/* stored in host byte order */
+	__u32 size;		/* Total size of sk_msg */
 };
 
 struct sk_reuseport_md {
diff --git a/net/core/filter.c b/net/core/filter.c
index bd0df75..55fd237 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7513,6 +7513,12 @@ static u32 sk_msg_convert_ctx_access(enum bpf_access_type type,
 		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
 				      offsetof(struct sock_common, skc_num));
 		break;
+
+	case offsetof(struct sk_msg_md, size):
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_msg_sg, size),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct sk_msg_sg, size));
+		break;
 	}
 
 	return insn - insn_buf;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next 2/3] bpf: add tools lib/include support sk_msg_md size field
  2018-12-16 23:47 [PATCH bpf-next 0/3] add size field to sk_msg_md John Fastabend
  2018-12-16 23:47 ` [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg John Fastabend
@ 2018-12-16 23:47 ` John Fastabend
  2018-12-16 23:47 ` [PATCH bpf-next 3/3] bpf: sk_msg, add tests for " John Fastabend
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: John Fastabend @ 2018-12-16 23:47 UTC (permalink / raw)
  To: daniel, ast; +Cc: netdev, john.fastabend

Add the size field to sk_msg_md for tools.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/include/uapi/linux/bpf.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 597afdb..498badc 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2608,6 +2608,7 @@ struct sk_msg_md {
 	__u32 local_ip6[4];	/* Stored in network byte order */
 	__u32 remote_port;	/* Stored in network byte order */
 	__u32 local_port;	/* stored in host byte order */
+	__u32 size;		/* Total size of sk_msg */
 };
 
 struct sk_reuseport_md {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next 3/3] bpf: sk_msg, add tests for size field
  2018-12-16 23:47 [PATCH bpf-next 0/3] add size field to sk_msg_md John Fastabend
  2018-12-16 23:47 ` [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg John Fastabend
  2018-12-16 23:47 ` [PATCH bpf-next 2/3] bpf: add tools lib/include support sk_msg_md size field John Fastabend
@ 2018-12-16 23:47 ` John Fastabend
  2018-12-18 18:44 ` [PATCH bpf-next 0/3] add size field to sk_msg_md Martin Lau
  2018-12-18 23:32 ` Daniel Borkmann
  4 siblings, 0 replies; 8+ messages in thread
From: John Fastabend @ 2018-12-16 23:47 UTC (permalink / raw)
  To: daniel, ast; +Cc: netdev, john.fastabend

This adds tests to read the size field to test_verifier.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/testing/selftests/bpf/test_verifier.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 537a8f9..65ea2fa 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -1792,10 +1792,20 @@ static struct bpf_test tests[] = {
 		.prog_type = BPF_PROG_TYPE_SK_SKB,
 	},
 	{
-		"invalid 64B read of family in SK_MSG",
+		"valid access size in SK_MSG",
+		.insns = {
+			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+				    offsetof(struct sk_msg_md, size)),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.prog_type = BPF_PROG_TYPE_SK_MSG,
+	},
+	{
+		"invalid 64B read of size in SK_MSG",
 		.insns = {
 			BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1,
-				    offsetof(struct sk_msg_md, family)),
+				    offsetof(struct sk_msg_md, size)),
 			BPF_EXIT_INSN(),
 		},
 		.errstr = "invalid bpf_context access",
@@ -1806,7 +1816,7 @@ static struct bpf_test tests[] = {
 		"invalid read past end of SK_MSG",
 		.insns = {
 			BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
-				    offsetof(struct sk_msg_md, local_port) + 4),
+				    offsetof(struct sk_msg_md, size) + 4),
 			BPF_EXIT_INSN(),
 		},
 		.errstr = "R0 !read_ok",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next 0/3] add size field to sk_msg_md
  2018-12-16 23:47 [PATCH bpf-next 0/3] add size field to sk_msg_md John Fastabend
                   ` (2 preceding siblings ...)
  2018-12-16 23:47 ` [PATCH bpf-next 3/3] bpf: sk_msg, add tests for " John Fastabend
@ 2018-12-18 18:44 ` Martin Lau
  2018-12-18 23:32 ` Daniel Borkmann
  4 siblings, 0 replies; 8+ messages in thread
From: Martin Lau @ 2018-12-18 18:44 UTC (permalink / raw)
  To: John Fastabend; +Cc: daniel, ast, netdev

On Sun, Dec 16, 2018 at 03:47:03PM -0800, John Fastabend wrote:
> This adds a size field to the sk_msg_md data structure used by SK_MSG
> programs. Without this in the zerocopy case and in the copy case
> where multiple iovs are in use its difficult to know how much data
> can be pulled in. The normal method of reading data and data_end
> only give the current contiguous buffer. BPF programs can attempt to
> pull in extra data but have to guess if it exists. This can result
> in multiple "guesses" its much better if we know upfront the size
> of the sk_msg.
Acked-by: Martin KaFai Lau <kafai@fb.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next 0/3] add size field to sk_msg_md
  2018-12-16 23:47 [PATCH bpf-next 0/3] add size field to sk_msg_md John Fastabend
                   ` (3 preceding siblings ...)
  2018-12-18 18:44 ` [PATCH bpf-next 0/3] add size field to sk_msg_md Martin Lau
@ 2018-12-18 23:32 ` Daniel Borkmann
  4 siblings, 0 replies; 8+ messages in thread
From: Daniel Borkmann @ 2018-12-18 23:32 UTC (permalink / raw)
  To: John Fastabend, ast; +Cc: netdev

On 12/17/2018 12:47 AM, John Fastabend wrote:
> This adds a size field to the sk_msg_md data structure used by SK_MSG
> programs. Without this in the zerocopy case and in the copy case
> where multiple iovs are in use its difficult to know how much data
> can be pulled in. The normal method of reading data and data_end
> only give the current contiguous buffer. BPF programs can attempt to
> pull in extra data but have to guess if it exists. This can result
> in multiple "guesses" its much better if we know upfront the size
> of the sk_msg.
> 
> John Fastabend (3):
>   bpf: sockmap, metadata support for reporting length of message
>   bpf: add tools lib/include support sk_msg_md size field
>   bpf: sk_msg, add tests for size field
> 
>  include/linux/skmsg.h                       |  3 +++
>  include/uapi/linux/bpf.h                    |  1 +
>  net/core/filter.c                           |  6 ++++++
>  tools/include/uapi/linux/bpf.h              |  1 +
>  tools/testing/selftests/bpf/test_verifier.c | 16 +++++++++++++---
>  5 files changed, 24 insertions(+), 3 deletions(-)
> 

Applied, thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg
  2018-12-16 23:47 ` [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg John Fastabend
@ 2018-12-20  2:49   ` Alexei Starovoitov
  2018-12-20  6:30     ` John Fastabend
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2018-12-20  2:49 UTC (permalink / raw)
  To: John Fastabend; +Cc: daniel, ast, netdev

On Sun, Dec 16, 2018 at 03:47:04PM -0800, John Fastabend wrote:
> This adds metadata to sk_msg_md for BPF programs to read the sk_msg
> size.
> 
> When the SK_MSG program is running under an application that is using
> sendfile the data is not copied into sk_msg buffers by default. Rather
> the BPF program uses sk_msg_pull_data to read the bytes in. This
> avoids doing the costly memcopy instructions when they are not in
> fact needed. However, if we don't know the size of the sk_msg we
> have to guess if needed bytes are available by doing a pull request
> which may fail. By including the size of the sk_msg BPF programs can
> check the size before issuing sk_msg_pull_data requests.
> 
> Additionally, the same applies for sendmsg calls when the application
> provides multiple iovs. Here the BPF program needs to pull in data
> to update data pointers but its not clear where the data ends without
> a size parameter. In many cases "guessing" is not easy to do
> and results in multiple calls to pull and without bounded loops
> everything gets fairly tricky.
> 
> Clean this up by including a u32 size field. Note, all writes into
> sk_msg_md are rejected already from sk_msg_is_valid_access so nothing
> additional is needed there.
> 
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  include/linux/skmsg.h    | 3 +++
>  include/uapi/linux/bpf.h | 1 +
>  net/core/filter.c        | 6 ++++++
>  3 files changed, 10 insertions(+)
> 
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index 2a11e9d..eb8f6cb 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -36,6 +36,9 @@ struct sk_msg_sg {
>  	struct scatterlist		data[MAX_MSG_FRAGS + 1];
>  };
>  
> +/* UAPI in filter.c depends on struct sk_msg_sg being first element. If
> + * this is moved filter.c also must be updated.
> + */
>  struct sk_msg {
>  	struct sk_msg_sg		sg;
>  	void				*data;
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 597afdb..498badc 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2608,6 +2608,7 @@ struct sk_msg_md {
>  	__u32 local_ip6[4];	/* Stored in network byte order */
>  	__u32 remote_port;	/* Stored in network byte order */
>  	__u32 local_port;	/* stored in host byte order */
> +	__u32 size;		/* Total size of sk_msg */
>  };
>  
>  struct sk_reuseport_md {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index bd0df75..55fd237 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -7513,6 +7513,12 @@ static u32 sk_msg_convert_ctx_access(enum bpf_access_type type,
>  		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>  				      offsetof(struct sock_common, skc_num));
>  		break;
> +
> +	case offsetof(struct sk_msg_md, size):
> +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_msg_sg, size),
> +				      si->dst_reg, si->src_reg,
> +				      offsetof(struct sk_msg_sg, size));
> +		break;

John,

this change broke "test_verifier 129" test.
Now it's failing.
But upon further examination both sk_msg_is_valid_access()
and that test were incorrect.
Here is the bug:
        if (off < 0 || off >= sizeof(struct sk_msg_md))

sizeof() includes padding to 8 bytes.
So out of bounds access passes sk_msg_is_valid_access(),
but rewritten incorrectly by sk_msg_convert_ctx_access()
and the test is testing for wrong thing.
errstr = "R0 !read_ok" is not the message it should be looking for.

After this patch and following adjustment to test_verifier.c
that test is now failing while verifier doing the right thing.
Please submit two patches to fix this:
1 - fix sk_msg_convert_ctx_access
2 - fix test_verifier

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg
  2018-12-20  2:49   ` Alexei Starovoitov
@ 2018-12-20  6:30     ` John Fastabend
  0 siblings, 0 replies; 8+ messages in thread
From: John Fastabend @ 2018-12-20  6:30 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: daniel, ast, netdev

On 12/19/18 6:49 PM, Alexei Starovoitov wrote:
> On Sun, Dec 16, 2018 at 03:47:04PM -0800, John Fastabend wrote:
>> This adds metadata to sk_msg_md for BPF programs to read the sk_msg
>> size.
>>
>> When the SK_MSG program is running under an application that is using
>> sendfile the data is not copied into sk_msg buffers by default. Rather
>> the BPF program uses sk_msg_pull_data to read the bytes in. This
>> avoids doing the costly memcopy instructions when they are not in
>> fact needed. However, if we don't know the size of the sk_msg we
>> have to guess if needed bytes are available by doing a pull request
>> which may fail. By including the size of the sk_msg BPF programs can
>> check the size before issuing sk_msg_pull_data requests.
>>
>> Additionally, the same applies for sendmsg calls when the application
>> provides multiple iovs. Here the BPF program needs to pull in data
>> to update data pointers but its not clear where the data ends without
>> a size parameter. In many cases "guessing" is not easy to do
>> and results in multiple calls to pull and without bounded loops
>> everything gets fairly tricky.
>>
>> Clean this up by including a u32 size field. Note, all writes into
>> sk_msg_md are rejected already from sk_msg_is_valid_access so nothing
>> additional is needed there.
>>
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> ---
>>  include/linux/skmsg.h    | 3 +++
>>  include/uapi/linux/bpf.h | 1 +
>>  net/core/filter.c        | 6 ++++++
>>  3 files changed, 10 insertions(+)
>>
>> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
>> index 2a11e9d..eb8f6cb 100644
>> --- a/include/linux/skmsg.h
>> +++ b/include/linux/skmsg.h
>> @@ -36,6 +36,9 @@ struct sk_msg_sg {
>>  	struct scatterlist		data[MAX_MSG_FRAGS + 1];
>>  };
>>  
>> +/* UAPI in filter.c depends on struct sk_msg_sg being first element. If
>> + * this is moved filter.c also must be updated.
>> + */
>>  struct sk_msg {
>>  	struct sk_msg_sg		sg;
>>  	void				*data;
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 597afdb..498badc 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -2608,6 +2608,7 @@ struct sk_msg_md {
>>  	__u32 local_ip6[4];	/* Stored in network byte order */
>>  	__u32 remote_port;	/* Stored in network byte order */
>>  	__u32 local_port;	/* stored in host byte order */
>> +	__u32 size;		/* Total size of sk_msg */
>>  };
>>  
>>  struct sk_reuseport_md {
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index bd0df75..55fd237 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -7513,6 +7513,12 @@ static u32 sk_msg_convert_ctx_access(enum bpf_access_type type,
>>  		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>>  				      offsetof(struct sock_common, skc_num));
>>  		break;
>> +
>> +	case offsetof(struct sk_msg_md, size):
>> +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_msg_sg, size),
>> +				      si->dst_reg, si->src_reg,
>> +				      offsetof(struct sk_msg_sg, size));
>> +		break;
> 
> John,
> 
> this change broke "test_verifier 129" test.

Will need to see why its not failing for me. But thanks for
catching.

> Now it's failing.
> But upon further examination both sk_msg_is_valid_access()
> and that test were incorrect.
> Here is the bug:
>         if (off < 0 || off >= sizeof(struct sk_msg_md))
>

thanks. Will use bpf_ctx_range() check instead.


> sizeof() includes padding to 8 bytes.
> So out of bounds access passes sk_msg_is_valid_access(),
> but rewritten incorrectly by sk_msg_convert_ctx_access()
> and the test is testing for wrong thing.
> errstr = "R0 !read_ok" is not the message it should be looking for.

typo :(

> 
> After this patch and following adjustment to test_verifier.c
> that test is now failing while verifier doing the right thing.
> Please submit two patches to fix this:
> 1 - fix sk_msg_convert_ctx_access
> 2 - fix test_verifier
> 

Creating fix now thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-12-20  6:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-16 23:47 [PATCH bpf-next 0/3] add size field to sk_msg_md John Fastabend
2018-12-16 23:47 ` [PATCH bpf-next 1/3] bpf: sockmap, metadata support for reporting size of msg John Fastabend
2018-12-20  2:49   ` Alexei Starovoitov
2018-12-20  6:30     ` John Fastabend
2018-12-16 23:47 ` [PATCH bpf-next 2/3] bpf: add tools lib/include support sk_msg_md size field John Fastabend
2018-12-16 23:47 ` [PATCH bpf-next 3/3] bpf: sk_msg, add tests for " John Fastabend
2018-12-18 18:44 ` [PATCH bpf-next 0/3] add size field to sk_msg_md Martin Lau
2018-12-18 23:32 ` Daniel Borkmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).