* [PATCH net-next] bpf: add support for SO_PRIORITY in bpf_getsockopt
@ 2017-11-09 23:04 Vlad Dumitrescu
  2017-11-10  0:43 ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Vlad Dumitrescu @ 2017-11-09 23:04 UTC (permalink / raw)
  To: davem, ast, daniel, brakmo; +Cc: netdev, kraigatgoog

From: Vlad Dumitrescu <vladum@google.com>

Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.

Signed-off-by: Vlad Dumitrescu <vladum@google.com>
---
 net/core/filter.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 1afa17935954..61c791f9f628 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3292,8 +3292,20 @@ BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
 	if (!sk_fullsock(sk))
 		goto err_clear;
 
+	if (level == SOL_SOCKET) {
+		if (optlen != sizeof(int))
+			goto err_clear;
+
+		switch (optname) {
+		case SO_PRIORITY:
+			*((int *)optval) = sk->sk_priority;
+			break;
+		default:
+			goto err_clear;
+		}
 #ifdef CONFIG_INET
-	if (level == SOL_TCP && sk->sk_prot->getsockopt == tcp_getsockopt) {
+	} else if (level == SOL_TCP &&
+		   sk->sk_prot->getsockopt == tcp_getsockopt) {
 		if (optname == TCP_CONGESTION) {
 			struct inet_connection_sock *icsk = inet_csk(sk);
 
@@ -3304,11 +3316,11 @@ BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
 		} else {
 			goto err_clear;
 		}
+#endif
 	} else {
 		goto err_clear;
 	}
 	return 0;
-#endif
 err_clear:
 	memset(optval, 0, optlen);
 	return -EINVAL;
-- 
2.15.0.448.gf294e3d99a-goog

^ permalink raw reply related	[flat|nested] 12+ messages in thread
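
For illustration only (not part of the thread): a minimal sock_ops program
using the helper this patch extends might look roughly as follows, assuming
a samples/bpf-style bpf_helpers.h that declares SEC() and bpf_getsockopt();
the priority threshold and the policy hook are made up.

#include <uapi/linux/bpf.h>
#include "bpf_helpers.h"

#ifndef SOL_SOCKET
#define SOL_SOCKET	1	/* value from asm-generic/socket.h */
#endif
#ifndef SO_PRIORITY
#define SO_PRIORITY	12	/* value from asm-generic/socket.h */
#endif

SEC("sockops")
int read_prio(struct bpf_sock_ops *skops)
{
	int prio = 0;

	/* With the patch above this copies sk->sk_priority into prio;
	 * for unsupported levels/options the helper zeroes optval and
	 * returns -EINVAL. */
	if (bpf_getsockopt(skops, SOL_SOCKET, SO_PRIORITY,
			   &prio, sizeof(prio)) == 0 && prio >= 4) {
		/* made-up policy hook for high-priority flows */
	}

	return 1;
}

char _license[] SEC("license") = "GPL";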

* Re: [PATCH net-next] bpf: add support for SO_PRIORITY in bpf_getsockopt
  2017-11-09 23:04 [PATCH net-next] bpf: add support for SO_PRIORITY in bpf_getsockopt Vlad Dumitrescu
@ 2017-11-10  0:43 ` Alexei Starovoitov
  2017-11-10 17:51   ` Vlad Dumitrescu
  2017-11-10 19:17   ` [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops Vlad Dumitrescu
  0 siblings, 2 replies; 12+ messages in thread
From: Alexei Starovoitov @ 2017-11-10  0:43 UTC (permalink / raw)
  To: Vlad Dumitrescu, davem, daniel, brakmo; +Cc: netdev, kraigatgoog

On 11/10/17 8:04 AM, Vlad Dumitrescu wrote:
> From: Vlad Dumitrescu <vladum@google.com>
>
> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>
> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
> ---
>  net/core/filter.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1afa17935954..61c791f9f628 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3292,8 +3292,20 @@ BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>  	if (!sk_fullsock(sk))
>  		goto err_clear;
>
> +	if (level == SOL_SOCKET) {
> +		if (optlen != sizeof(int))
> +			goto err_clear;
> +
> +		switch (optname) {
> +		case SO_PRIORITY:
> +			*((int *)optval) = sk->sk_priority;

would be cleaner to add sk_priority to 'struct bpf_sock_ops' instead.
Faster runtime too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] bpf: add support for SO_PRIORITY in bpf_getsockopt
  2017-11-10  0:43 ` Alexei Starovoitov
@ 2017-11-10 17:51   ` Vlad Dumitrescu
  2017-11-10 19:17   ` [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops Vlad Dumitrescu
  1 sibling, 0 replies; 12+ messages in thread
From: Vlad Dumitrescu @ 2017-11-10 17:51 UTC (permalink / raw)
  To: Alexei Starovoitov, davem, daniel, brakmo; +Cc: netdev, kraigatgoog

On Thu, Nov 9, 2017 at 4:43 PM, Alexei Starovoitov <ast@fb.com> wrote:
> On 11/10/17 8:04 AM, Vlad Dumitrescu wrote:
>>
>> From: Vlad Dumitrescu <vladum@google.com>
>>
>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>>
>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>> ---
>>  net/core/filter.c | 16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 1afa17935954..61c791f9f628 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3292,8 +3292,20 @@ BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern
>> *, bpf_sock,
>>         if (!sk_fullsock(sk))
>>                 goto err_clear;
>>
>> +       if (level == SOL_SOCKET) {
>> +               if (optlen != sizeof(int))
>> +                       goto err_clear;
>> +
>> +               switch (optname) {
>> +               case SO_PRIORITY:
>> +                       *((int *)optval) = sk->sk_priority;
>
>
> would be cleaner to add sk_priority to 'struct bpf_sock_ops' instead.
> Faster runtime too.
>

I agree it will be faster, and I considered that as well. However, I
was aiming for consistency with the set function, which supports
SO_PRIORITY.

Maybe both (I have no preference)? I'll prepare the patch for
bpf_sock_ops in the meantime.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-10  0:43 ` Alexei Starovoitov
  2017-11-10 17:51   ` Vlad Dumitrescu
@ 2017-11-10 19:17   ` Vlad Dumitrescu
  2017-11-10 21:07     ` Daniel Borkmann
  1 sibling, 1 reply; 12+ messages in thread
From: Vlad Dumitrescu @ 2017-11-10 19:17 UTC (permalink / raw)
  To: davem, ast, daniel, brakmo; +Cc: netdev, kraigatgoog

From: Vlad Dumitrescu <vladum@google.com>

Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.

Signed-off-by: Vlad Dumitrescu <vladum@google.com>
---
 include/uapi/linux/bpf.h       |  1 +
 net/core/filter.c              | 11 +++++++++++
 tools/include/uapi/linux/bpf.h |  1 +
 3 files changed, 13 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e880ae6434ee..9757a2002513 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -947,6 +947,7 @@ struct bpf_sock_ops {
 	__u32 local_ip6[4];	/* Stored in network byte order */
 	__u32 remote_port;	/* Stored in network byte order */
 	__u32 local_port;	/* stored in host byte order */
+	__u32 priority;
 };
 
 /* List of known BPF sock_ops operators.
diff --git a/net/core/filter.c b/net/core/filter.c
index 61c791f9f628..a6329642d047 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
 				      offsetof(struct sock_common, skc_num));
 		break;
+
+	case offsetof(struct bpf_sock_ops, priority):
+		BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
+
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+						struct bpf_sock_ops_kern, sk),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct bpf_sock_ops_kern, sk));
+		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
+				      offsetof(struct sock, sk_priority));
+		break;
 	}
 	return insn - insn_buf;
 }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index e880ae6434ee..9757a2002513 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -947,6 +947,7 @@ struct bpf_sock_ops {
 	__u32 local_ip6[4];	/* Stored in network byte order */
 	__u32 remote_port;	/* Stored in network byte order */
 	__u32 local_port;	/* stored in host byte order */
+	__u32 priority;
 };
 
 /* List of known BPF sock_ops operators.
-- 
2.15.0.448.gf294e3d99a-goog

^ permalink raw reply related	[flat|nested] 12+ messages in thread
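
For comparison with the helper-based read above, a sketch of the
program-side access with this new field (illustrative only, same
samples/bpf-style assumptions and includes as the earlier sketch):

SEC("sockops")
int read_prio(struct bpf_sock_ops *skops)
{
	/* Plain context access: the verifier rewrites this load into the
	 * two instructions emitted by sock_ops_convert_ctx_access() in
	 * the patch above, so no helper call is needed. */
	__u32 prio = skops->priority;

	if (prio >= 4) {
		/* made-up policy hook for high-priority flows */
	}

	return 1;
}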

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-10 19:17   ` [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops Vlad Dumitrescu
@ 2017-11-10 21:07     ` Daniel Borkmann
  2017-11-11  4:06       ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2017-11-10 21:07 UTC (permalink / raw)
  To: Vlad Dumitrescu, davem, ast, brakmo; +Cc: netdev, kraigatgoog

On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
> From: Vlad Dumitrescu <vladum@google.com>
> 
> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
> 
> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
> ---
>   include/uapi/linux/bpf.h       |  1 +
>   net/core/filter.c              | 11 +++++++++++
>   tools/include/uapi/linux/bpf.h |  1 +
>   3 files changed, 13 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index e880ae6434ee..9757a2002513 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>   	__u32 local_ip6[4];	/* Stored in network byte order */
>   	__u32 remote_port;	/* Stored in network byte order */
>   	__u32 local_port;	/* stored in host byte order */
> +	__u32 priority;
>   };
>   
>   /* List of known BPF sock_ops operators.
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 61c791f9f628..a6329642d047 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
>   		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>   				      offsetof(struct sock_common, skc_num));
>   		break;
> +
> +	case offsetof(struct bpf_sock_ops, priority):
> +		BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
> +
> +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
> +						struct bpf_sock_ops_kern, sk),
> +				      si->dst_reg, si->src_reg,
> +				      offsetof(struct bpf_sock_ops_kern, sk));
> +		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
> +				      offsetof(struct sock, sk_priority));
> +		break;

Hm, I don't think this would work, I actually think your initial patch was ok.
bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk) right
before accessing options on either socket or TCP level, and bail out with error
otherwise; in such cases we'd read something else here and assume it's sk_priority.

>   	}
>   	return insn - insn_buf;
>   }
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index e880ae6434ee..9757a2002513 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>   	__u32 local_ip6[4];	/* Stored in network byte order */
>   	__u32 remote_port;	/* Stored in network byte order */
>   	__u32 local_port;	/* stored in host byte order */
> +	__u32 priority;
>   };
>   
>   /* List of known BPF sock_ops operators.
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-10 21:07     ` Daniel Borkmann
@ 2017-11-11  4:06       ` Alexei Starovoitov
  2017-11-11 20:46         ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2017-11-11  4:06 UTC (permalink / raw)
  To: Daniel Borkmann, Vlad Dumitrescu, davem, brakmo; +Cc: netdev, kraigatgoog

On 11/11/17 6:07 AM, Daniel Borkmann wrote:
> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
>> From: Vlad Dumitrescu <vladum@google.com>
>>
>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>>
>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>> ---
>>   include/uapi/linux/bpf.h       |  1 +
>>   net/core/filter.c              | 11 +++++++++++
>>   tools/include/uapi/linux/bpf.h |  1 +
>>   3 files changed, 13 insertions(+)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index e880ae6434ee..9757a2002513 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>>       __u32 local_ip6[4];    /* Stored in network byte order */
>>       __u32 remote_port;    /* Stored in network byte order */
>>       __u32 local_port;    /* stored in host byte order */
>> +    __u32 priority;
>>   };
>>     /* List of known BPF sock_ops operators.
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 61c791f9f628..a6329642d047 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
>> bpf_access_type type,
>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>>                         offsetof(struct sock_common, skc_num));
>>           break;
>> +
>> +    case offsetof(struct bpf_sock_ops, priority):
>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
>> +
>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
>> +                        struct bpf_sock_ops_kern, sk),
>> +                      si->dst_reg, si->src_reg,
>> +                      offsetof(struct bpf_sock_ops_kern, sk));
>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>> +                      offsetof(struct sock, sk_priority));
>> +        break;
>
> Hm, I don't think this would work, I actually think your initial patch
> was ok.
> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
> right
> before accessing options on either socket or TCP level, and bail out
> with error
> otherwise; in such cases we'd read something else here and assume it's
> sk_priority.

even if it's not fullsock, it will just read zero, no? what's a problem
with that?
In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
the program author will know that it's meaningless to read sk_priority,
so returning zero with minimal checks is fine.
While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
since the safety is not compromised.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-11  4:06       ` Alexei Starovoitov
@ 2017-11-11 20:46         ` Daniel Borkmann
  2017-11-11 22:38           ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2017-11-11 20:46 UTC (permalink / raw)
  To: Alexei Starovoitov, Vlad Dumitrescu, davem, brakmo; +Cc: netdev, kraigatgoog

On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
> On 11/11/17 6:07 AM, Daniel Borkmann wrote:
>> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
>>> From: Vlad Dumitrescu <vladum@google.com>
>>>
>>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>>>
>>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>>> ---
>>>   include/uapi/linux/bpf.h       |  1 +
>>>   net/core/filter.c              | 11 +++++++++++
>>>   tools/include/uapi/linux/bpf.h |  1 +
>>>   3 files changed, 13 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>> index e880ae6434ee..9757a2002513 100644
>>> --- a/include/uapi/linux/bpf.h
>>> +++ b/include/uapi/linux/bpf.h
>>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>>>       __u32 local_ip6[4];    /* Stored in network byte order */
>>>       __u32 remote_port;    /* Stored in network byte order */
>>>       __u32 local_port;    /* stored in host byte order */
>>> +    __u32 priority;
>>>   };
>>>     /* List of known BPF sock_ops operators.
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 61c791f9f628..a6329642d047 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
>>> bpf_access_type type,
>>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>>>                         offsetof(struct sock_common, skc_num));
>>>           break;
>>> +
>>> +    case offsetof(struct bpf_sock_ops, priority):
>>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
>>> +
>>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
>>> +                        struct bpf_sock_ops_kern, sk),
>>> +                      si->dst_reg, si->src_reg,
>>> +                      offsetof(struct bpf_sock_ops_kern, sk));
>>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>>> +                      offsetof(struct sock, sk_priority));
>>> +        break;
>>
>> Hm, I don't think this would work, I actually think your initial patch
>> was ok.
>> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
>> right
>> before accessing options on either socket or TCP level, and bail out
>> with error
>> otherwise; in such cases we'd read something else here and assume it's
>> sk_priority.
> 
> even if it's not fullsock, it will just read zero, no? what's a problem
> with that?
> In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
> the program author will know that it's meaningless to read sk_priority,
> so returning zero with minimal checks is fine.
> While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
> since the safety is not compromised.

Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
struct request_sock itself is only 232 byte long in total, and the struct
inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
that way, so it cannot be ignored and assumed zero.

If we don't care about returning an error when !fullsock, then you could
code the sk_fullsock(sk) check in BPF itself in the ctx conversion above,
and set the value to 0 manually when !fullsock. It might make it harder to
change sk_fullsock() itself in the future, but in any case the sk_fullsock()
helper should get a comment saying that whenever its contents change, the
BPF bits above need to be adjusted as well to remain an equivalent test.

^ permalink raw reply	[flat|nested] 12+ messages in thread
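
For reference, sk_fullsock() at the time was roughly the following
(include/net/sock.h); it rejects exactly the two mini-socket casts
mentioned above, struct request_sock (TCP_NEW_SYN_RECV) and struct
inet_timewait_sock (TCP_TIME_WAIT), which are too small to contain
sk_priority:

static inline bool sk_fullsock(const struct sock *sk)
{
	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
}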

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-11 20:46         ` Daniel Borkmann
@ 2017-11-11 22:38           ` Alexei Starovoitov
  2017-11-13 19:00             ` Vlad Dumitrescu
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2017-11-11 22:38 UTC (permalink / raw)
  To: Daniel Borkmann, Vlad Dumitrescu, davem, brakmo; +Cc: netdev, kraigatgoog

On 11/12/17 4:46 AM, Daniel Borkmann wrote:
> On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
>> On 11/11/17 6:07 AM, Daniel Borkmann wrote:
>>> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
>>>> From: Vlad Dumitrescu <vladum@google.com>
>>>>
>>>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>>>>
>>>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>>>> ---
>>>>   include/uapi/linux/bpf.h       |  1 +
>>>>   net/core/filter.c              | 11 +++++++++++
>>>>   tools/include/uapi/linux/bpf.h |  1 +
>>>>   3 files changed, 13 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>>> index e880ae6434ee..9757a2002513 100644
>>>> --- a/include/uapi/linux/bpf.h
>>>> +++ b/include/uapi/linux/bpf.h
>>>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>>>>       __u32 local_ip6[4];    /* Stored in network byte order */
>>>>       __u32 remote_port;    /* Stored in network byte order */
>>>>       __u32 local_port;    /* stored in host byte order */
>>>> +    __u32 priority;
>>>>   };
>>>>     /* List of known BPF sock_ops operators.
>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>> index 61c791f9f628..a6329642d047 100644
>>>> --- a/net/core/filter.c
>>>> +++ b/net/core/filter.c
>>>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
>>>> bpf_access_type type,
>>>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>>>>                         offsetof(struct sock_common, skc_num));
>>>>           break;
>>>> +
>>>> +    case offsetof(struct bpf_sock_ops, priority):
>>>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
>>>> +
>>>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
>>>> +                        struct bpf_sock_ops_kern, sk),
>>>> +                      si->dst_reg, si->src_reg,
>>>> +                      offsetof(struct bpf_sock_ops_kern, sk));
>>>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>>>> +                      offsetof(struct sock, sk_priority));
>>>> +        break;
>>>
>>> Hm, I don't think this would work, I actually think your initial patch
>>> was ok.
>>> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
>>> right
>>> before accessing options on either socket or TCP level, and bail out
>>> with error
>>> otherwise; in such cases we'd read something else here and assume it's
>>> sk_priority.
>>
>> even if it's not fullsock, it will just read zero, no? what's a problem
>> with that?
>> In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
>> the program author will know that it's meaningless to read sk_priority,
>> so returning zero with minimal checks is fine.
>> While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
>> since the safety is not compromised.
>
> Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
> struct request_sock itself is only 232 byte long in total, and the struct
> inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
> that way, so it cannot be ignored and assumed zero.

I thought we always pass fully allocated sock but technically not 
fullsock yet. My mistake. We do: tcp_timeout_init((struct sock *)req))
so yeah ctx rewrite approach won't work.
Let's go back to access via helper.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-11 22:38           ` Alexei Starovoitov
@ 2017-11-13 19:00             ` Vlad Dumitrescu
  2017-11-13 20:09               ` Lawrence Brakmo
  0 siblings, 1 reply; 12+ messages in thread
From: Vlad Dumitrescu @ 2017-11-13 19:00 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, brakmo, davem; +Cc: netdev, Craig Gallek

On Sat, Nov 11, 2017 at 2:38 PM, Alexei Starovoitov <ast@fb.com> wrote:
>
> On 11/12/17 4:46 AM, Daniel Borkmann wrote:
>>
>> On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
>>>
>>> On 11/11/17 6:07 AM, Daniel Borkmann wrote:
>>>>
>>>> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
>>>>>
>>>>> From: Vlad Dumitrescu <vladum@google.com>
>>>>>
>>>>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>>>>>
>>>>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>>>>> ---
>>>>>   include/uapi/linux/bpf.h       |  1 +
>>>>>   net/core/filter.c              | 11 +++++++++++
>>>>>   tools/include/uapi/linux/bpf.h |  1 +
>>>>>   3 files changed, 13 insertions(+)
>>>>>
>>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>>>> index e880ae6434ee..9757a2002513 100644
>>>>> --- a/include/uapi/linux/bpf.h
>>>>> +++ b/include/uapi/linux/bpf.h
>>>>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>>>>>       __u32 local_ip6[4];    /* Stored in network byte order */
>>>>>       __u32 remote_port;    /* Stored in network byte order */
>>>>>       __u32 local_port;    /* stored in host byte order */
>>>>> +    __u32 priority;
>>>>>   };
>>>>>     /* List of known BPF sock_ops operators.
>>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>>> index 61c791f9f628..a6329642d047 100644
>>>>> --- a/net/core/filter.c
>>>>> +++ b/net/core/filter.c
>>>>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
>>>>> bpf_access_type type,
>>>>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>>>>>                         offsetof(struct sock_common, skc_num));
>>>>>           break;
>>>>> +
>>>>> +    case offsetof(struct bpf_sock_ops, priority):
>>>>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
>>>>> +
>>>>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
>>>>> +                        struct bpf_sock_ops_kern, sk),
>>>>> +                      si->dst_reg, si->src_reg,
>>>>> +                      offsetof(struct bpf_sock_ops_kern, sk));
>>>>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>>>>> +                      offsetof(struct sock, sk_priority));
>>>>> +        break;
>>>>
>>>>
>>>> Hm, I don't think this would work, I actually think your initial patch
>>>> was ok.
>>>> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
>>>> right
>>>> before accessing options on either socket or TCP level, and bail out
>>>> with error
>>>> otherwise; in such cases we'd read something else here and assume it's
>>>> sk_priority.
>>>
>>>
>>> even if it's not fullsock, it will just read zero, no? what's a problem
>>> with that?
>>> In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
>>> the program author will know that it's meaningless to read sk_priority,
>>> so returning zero with minimal checks is fine.
>>> While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
>>> since the safety is not compromised.
>>
>>
>> Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
>> struct request_sock itself is only 232 byte long in total, and the struct
>> inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
>> that way, so it cannot be ignored and assumed zero.
>
>
> I thought we always pass fully allocated sock but technically not fullsock yet. My mistake. We do: tcp_timeout_init((struct sock *)req))
> so yeah ctx rewrite approach won't work.
> Let's go back to access via helper.
>

TIL. Thanks!

Is there anything else needed from me to get the helper approach accepted?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-13 19:00             ` Vlad Dumitrescu
@ 2017-11-13 20:09               ` Lawrence Brakmo
  2017-11-13 20:20                 ` Daniel Borkmann
  0 siblings, 1 reply; 12+ messages in thread
From: Lawrence Brakmo @ 2017-11-13 20:09 UTC (permalink / raw)
  To: Vlad Dumitrescu, Alexei Starovoitov, Daniel Borkmann, davem
  Cc: netdev, Craig Gallek

On 11/13/17, 11:01 AM, "Vlad Dumitrescu" <vlad@dumitrescu.ro> wrote:

    On Sat, Nov 11, 2017 at 2:38 PM, Alexei Starovoitov <ast@fb.com> wrote:
    >
    > On 11/12/17 4:46 AM, Daniel Borkmann wrote:
    >>
    >> On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
    >>>
    >>> On 11/11/17 6:07 AM, Daniel Borkmann wrote:
    >>>>
    >>>> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
    >>>>>
    >>>>> From: Vlad Dumitrescu <vladum@google.com>
    >>>>>
    >>>>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
    >>>>>
    >>>>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
    >>>>> ---
    >>>>>   include/uapi/linux/bpf.h       |  1 +
    >>>>>   net/core/filter.c              | 11 +++++++++++
    >>>>>   tools/include/uapi/linux/bpf.h |  1 +
    >>>>>   3 files changed, 13 insertions(+)
    >>>>>
    >>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
    >>>>> index e880ae6434ee..9757a2002513 100644
    >>>>> --- a/include/uapi/linux/bpf.h
    >>>>> +++ b/include/uapi/linux/bpf.h
    >>>>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
    >>>>>       __u32 local_ip6[4];    /* Stored in network byte order */
    >>>>>       __u32 remote_port;    /* Stored in network byte order */
    >>>>>       __u32 local_port;    /* stored in host byte order */
    >>>>> +    __u32 priority;
    >>>>>   };
    >>>>>     /* List of known BPF sock_ops operators.
    >>>>> diff --git a/net/core/filter.c b/net/core/filter.c
    >>>>> index 61c791f9f628..a6329642d047 100644
    >>>>> --- a/net/core/filter.c
    >>>>> +++ b/net/core/filter.c
    >>>>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
    >>>>> bpf_access_type type,
    >>>>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
    >>>>>                         offsetof(struct sock_common, skc_num));
    >>>>>           break;
    >>>>> +
    >>>>> +    case offsetof(struct bpf_sock_ops, priority):
    >>>>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
    >>>>> +
    >>>>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
    >>>>> +                        struct bpf_sock_ops_kern, sk),
    >>>>> +                      si->dst_reg, si->src_reg,
    >>>>> +                      offsetof(struct bpf_sock_ops_kern, sk));
    >>>>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
    >>>>> +                      offsetof(struct sock, sk_priority));
    >>>>> +        break;
    >>>>
    >>>>
    >>>> Hm, I don't think this would work, I actually think your initial patch
    >>>> was ok.
    >>>> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
    >>>> right
    >>>> before accessing options on either socket or TCP level, and bail out
    >>>> with error
    >>>> otherwise; in such cases we'd read something else here and assume it's
    >>>> sk_priority.
    >>>
    >>>
    >>> even if it's not fullsock, it will just read zero, no? what's a problem
    >>> with that?
    >>> In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
    >>> the program author will know that it's meaningless to read sk_priority,
    >>> so returning zero with minimal checks is fine.
    >>> While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
    >>> since the safety is not compromised.
    >>
    >>
    >> Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
    >> struct request_sock itself is only 232 byte long in total, and the struct
    >> inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
    >> that way, so it cannot be ignored and assumed zero.
    >
    >
    > I thought we always pass fully allocated sock but technically not fullsock yet. My mistake. We do: tcp_timeout_init((struct sock *)req))
    > so yeah ctx rewrite approach won't work.
    > Let's go back to access via helper.
    >
    
    TIL. Thanks!
    
    Is there anything else needed from me to get the helper approach accepted?

I plan to add access to TCP state variables (cwnd, rtt, etc.) and I have been thinking
about this issue. I think it is possible to access it directly as long as we use a value
like 0xffffffff to represent an invalid value (e.g. not fullsock). The ctx rewrite just
needs to add a conditional to determine what to return. I would probably add a
field into the internal kernel struct to indicate if it is fullsock or not (we should
know when we call tcp_call_bpf whether it is a fullsock or not based on context).

Let me do a sample patch that I can send for review and get feedback from
Alexei and Daniel.


^ permalink raw reply	[flat|nested] 12+ messages in thread
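
A rough sketch of the direction described above, assuming a hypothetical
u8 is_fullsock flag added to struct bpf_sock_ops_kern and filled in
wherever the ctx is populated (e.g. in tcp_call_bpf()); the field name and
the zero fallback are illustrative, not an actual patch:

	case offsetof(struct bpf_sock_ops, priority):
		/* dst_reg = sock_ops->is_fullsock (hypothetical flag) */
		*insn++ = BPF_LDX_MEM(BPF_B, si->dst_reg, si->src_reg,
				      offsetof(struct bpf_sock_ops_kern,
					       is_fullsock));
		/* if (!is_fullsock) keep the 0 in dst_reg, skip the loads */
		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 2);
		/* dst_reg = sock_ops->sk */
		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern,
						       sk),
				      si->dst_reg, si->src_reg,
				      offsetof(struct bpf_sock_ops_kern, sk));
		/* dst_reg = sk->sk_priority */
		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
				      offsetof(struct sock, sk_priority));
		break;

Returning a distinct invalid marker such as 0xffffffff instead of 0 would
only take an extra immediate move in the !fullsock branch.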

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-13 20:09               ` Lawrence Brakmo
@ 2017-11-13 20:20                 ` Daniel Borkmann
  2017-11-13 22:51                   ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2017-11-13 20:20 UTC (permalink / raw)
  To: Lawrence Brakmo, Vlad Dumitrescu, Alexei Starovoitov, davem
  Cc: netdev, Craig Gallek

On 11/13/2017 09:09 PM, Lawrence Brakmo wrote:
> On 11/13/17, 11:01 AM, "Vlad Dumitrescu" <vlad@dumitrescu.ro> wrote:
> 
>     On Sat, Nov 11, 2017 at 2:38 PM, Alexei Starovoitov <ast@fb.com> wrote:
>     >
>     > On 11/12/17 4:46 AM, Daniel Borkmann wrote:
>     >>
>     >> On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
>     >>>
>     >>> On 11/11/17 6:07 AM, Daniel Borkmann wrote:
>     >>>>
>     >>>> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
>     >>>>>
>     >>>>> From: Vlad Dumitrescu <vladum@google.com>
>     >>>>>
>     >>>>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>     >>>>>
>     >>>>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>     >>>>> ---
>     >>>>>   include/uapi/linux/bpf.h       |  1 +
>     >>>>>   net/core/filter.c              | 11 +++++++++++
>     >>>>>   tools/include/uapi/linux/bpf.h |  1 +
>     >>>>>   3 files changed, 13 insertions(+)
>     >>>>>
>     >>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>     >>>>> index e880ae6434ee..9757a2002513 100644
>     >>>>> --- a/include/uapi/linux/bpf.h
>     >>>>> +++ b/include/uapi/linux/bpf.h
>     >>>>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>     >>>>>       __u32 local_ip6[4];    /* Stored in network byte order */
>     >>>>>       __u32 remote_port;    /* Stored in network byte order */
>     >>>>>       __u32 local_port;    /* stored in host byte order */
>     >>>>> +    __u32 priority;
>     >>>>>   };
>     >>>>>     /* List of known BPF sock_ops operators.
>     >>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>     >>>>> index 61c791f9f628..a6329642d047 100644
>     >>>>> --- a/net/core/filter.c
>     >>>>> +++ b/net/core/filter.c
>     >>>>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
>     >>>>> bpf_access_type type,
>     >>>>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>     >>>>>                         offsetof(struct sock_common, skc_num));
>     >>>>>           break;
>     >>>>> +
>     >>>>> +    case offsetof(struct bpf_sock_ops, priority):
>     >>>>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
>     >>>>> +
>     >>>>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
>     >>>>> +                        struct bpf_sock_ops_kern, sk),
>     >>>>> +                      si->dst_reg, si->src_reg,
>     >>>>> +                      offsetof(struct bpf_sock_ops_kern, sk));
>     >>>>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>     >>>>> +                      offsetof(struct sock, sk_priority));
>     >>>>> +        break;
>     >>>>
>     >>>>
>     >>>> Hm, I don't think this would work, I actually think your initial patch
>     >>>> was ok.
>     >>>> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
>     >>>> right
>     >>>> before accessing options on either socket or TCP level, and bail out
>     >>>> with error
>     >>>> otherwise; in such cases we'd read something else here and assume it's
>     >>>> sk_priority.
>     >>>
>     >>>
>     >>> even if it's not fullsock, it will just read zero, no? what's a problem
>     >>> with that?
>     >>> In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
>     >>> the program author will know that it's meaningless to read sk_priority,
>     >>> so returning zero with minimal checks is fine.
>     >>> While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
>     >>> since the safety is not compromised.
>     >>
>     >>
>     >> Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
>     >> struct request_sock itself is only 232 byte long in total, and the struct
>     >> inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
>     >> that way, so it cannot be ignored and assumed zero.
>     >
>     >
>     > I thought we always pass fully allocated sock but technically not fullsock yet. My mistake. We do: tcp_timeout_init((struct sock *)req))
>     > so yeah ctx rewrite approach won't work.
>     > Let's go back to access via helper.
>     >
>     
>     TIL. Thanks!
>     
>     Is there anything else needed from me to get the helper approach accepted?
> 
> I plan to add access to TCP state variables (cwnd, rtt, etc.) and I have been thinking
> about this issue. I think it is possible to access it directly as long as we use a value
> like 0xffffffff to represent an invalid value (e.g. not fullsock). The ctx rewrite just
> needs to add a conditional to determine what to return. I would probably add a
> field into the internal kernel struct to indicate if it is fullsock or not (we should
> know when we call tcp_call_bpf whether it is a fullsock or not based on context).
> 
> Let me do a sample patch that I can send for review and get feedback from
> Alexei and Daniel.

Agree, if the mov op from the ctx rewrite to read(/write) a sk member, for
example, is just a BPF_W, then we know upper reg bits are zero anyway for the
success case, so we might be able to utilize this for writing a signed error
back to the user if !fullsk.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next] bpf: expose sk_priority through struct bpf_sock_ops
  2017-11-13 20:20                 ` Daniel Borkmann
@ 2017-11-13 22:51                   ` Alexei Starovoitov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexei Starovoitov @ 2017-11-13 22:51 UTC (permalink / raw)
  To: Daniel Borkmann, Lawrence Brakmo, Vlad Dumitrescu, davem
  Cc: netdev, Craig Gallek

On 11/14/17 4:20 AM, Daniel Borkmann wrote:
> On 11/13/2017 09:09 PM, Lawrence Brakmo wrote:
>> On 11/13/17, 11:01 AM, "Vlad Dumitrescu" <vlad@dumitrescu.ro> wrote:
>>
>>     On Sat, Nov 11, 2017 at 2:38 PM, Alexei Starovoitov <ast@fb.com> wrote:
>>     >
>>     > On 11/12/17 4:46 AM, Daniel Borkmann wrote:
>>     >>
>>     >> On 11/11/2017 05:06 AM, Alexei Starovoitov wrote:
>>     >>>
>>     >>> On 11/11/17 6:07 AM, Daniel Borkmann wrote:
>>     >>>>
>>     >>>> On 11/10/2017 08:17 PM, Vlad Dumitrescu wrote:
>>     >>>>>
>>     >>>>> From: Vlad Dumitrescu <vladum@google.com>
>>     >>>>>
>>     >>>>> Allows BPF_PROG_TYPE_SOCK_OPS programs to read sk_priority.
>>     >>>>>
>>     >>>>> Signed-off-by: Vlad Dumitrescu <vladum@google.com>
>>     >>>>> ---
>>     >>>>>   include/uapi/linux/bpf.h       |  1 +
>>     >>>>>   net/core/filter.c              | 11 +++++++++++
>>     >>>>>   tools/include/uapi/linux/bpf.h |  1 +
>>     >>>>>   3 files changed, 13 insertions(+)
>>     >>>>>
>>     >>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>     >>>>> index e880ae6434ee..9757a2002513 100644
>>     >>>>> --- a/include/uapi/linux/bpf.h
>>     >>>>> +++ b/include/uapi/linux/bpf.h
>>     >>>>> @@ -947,6 +947,7 @@ struct bpf_sock_ops {
>>     >>>>>       __u32 local_ip6[4];    /* Stored in network byte order */
>>     >>>>>       __u32 remote_port;    /* Stored in network byte order */
>>     >>>>>       __u32 local_port;    /* stored in host byte order */
>>     >>>>> +    __u32 priority;
>>     >>>>>   };
>>     >>>>>     /* List of known BPF sock_ops operators.
>>     >>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>     >>>>> index 61c791f9f628..a6329642d047 100644
>>     >>>>> --- a/net/core/filter.c
>>     >>>>> +++ b/net/core/filter.c
>>     >>>>> @@ -4449,6 +4449,17 @@ static u32 sock_ops_convert_ctx_access(enum
>>     >>>>> bpf_access_type type,
>>     >>>>>           *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
>>     >>>>>                         offsetof(struct sock_common, skc_num));
>>     >>>>>           break;
>>     >>>>> +
>>     >>>>> +    case offsetof(struct bpf_sock_ops, priority):
>>     >>>>> +        BUILD_BUG_ON(FIELD_SIZEOF(struct sock, sk_priority) != 4);
>>     >>>>> +
>>     >>>>> +        *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
>>     >>>>> +                        struct bpf_sock_ops_kern, sk),
>>     >>>>> +                      si->dst_reg, si->src_reg,
>>     >>>>> +                      offsetof(struct bpf_sock_ops_kern, sk));
>>     >>>>> +        *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
>>     >>>>> +                      offsetof(struct sock, sk_priority));
>>     >>>>> +        break;
>>     >>>>
>>     >>>>
>>     >>>> Hm, I don't think this would work, I actually think your initial patch
>>     >>>> was ok.
>>     >>>> bpf_setsockopt() as well as bpf_getsockopt() check for sk_fullsock(sk)
>>     >>>> right
>>     >>>> before accessing options on either socket or TCP level, and bail out
>>     >>>> with error
>>     >>>> otherwise; in such cases we'd read something else here and assume it's
>>     >>>> sk_priority.
>>     >>>
>>     >>>
>>     >>> even if it's not fullsock, it will just read zero, no? what's a problem
>>     >>> with that?
>>     >>> In non-fullsock hooks like BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
>>     >>> the program author will know that it's meaningless to read sk_priority,
>>     >>> so returning zero with minimal checks is fine.
>>     >>> While adding extra runtime if (sk_fullsock(sk)) is unnecessary,
>>     >>> since the safety is not compromised.
>>     >>
>>     >>
>>     >> Hm, on my kernel, struct sock has the 4 bytes sk_priority at offset 440,
>>     >> struct request_sock itself is only 232 byte long in total, and the struct
>>     >> inet_timewait_sock is 208 byte long, so you'd be accessing out of bounds
>>     >> that way, so it cannot be ignored and assumed zero.
>>     >
>>     >
>>     > I thought we always pass fully allocated sock but technically not fullsock yet. My mistake. We do: tcp_timeout_init((struct sock *)req))
>>     > so yeah ctx rewrite approach won't work.
>>     > Let's go back to access via helper.
>>     >
>>
>>     TIL. Thanks!
>>
>>     Is there anything else needed from me to get the helper approach accepted?
>>
>> I plan to add access to TCP state variables (cwnd, rtt, etc.) and I have been thinking
>> about this issue. I think it is possible to access it directly as long as we use a value
>> like 0xffffffff to represent an invalid value (e.g. not fullsock). The ctx rewrite just
>> needs to add a conditional to determine what to return. I would probably add a
>> field into the internal kernel struct to indicate if it is fullsock or not (we should
>> know when we call tcp_call_bpf whether it is a fullsock or not based on context).
>>
>> Let me do a sample patch that I can send for review and get feedback from
>> Alexei and Daniel.
>
> Agree, if the mov op from the ctx rewrite to read(/write) a sk member, for
> example, is just a BPF_W, then we know upper reg bits are zero anyway for the
> success case, so we might be able to utilize this for writing a signed error
> back to the user if !fullsk.

It can be a __u64 in bpf_sock_ops too: the real read is 32 bits or less,
so there are guaranteed to be no conflicts if we return (s64)-ENOENT or
(s64)-EINVAL in the !fullsock case.
I like the idea of copying the boolean value of sk_fullsock() into a
hidden part of bpf_sock_ops_kern, since it is accessed by tcp_call_bpf()
anyway.

^ permalink raw reply	[flat|nested] 12+ messages in thread
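
Sketched out, that combination could look like the following in the ctx
conversion (illustrative only: the is_fullsock flag, the __u64 uapi field
and the errno choice are assumptions, not an agreed-upon patch):

	/* uapi: __u64 priority; reads back (s64)-ENOENT when !fullsock */
	case offsetof(struct bpf_sock_ops, priority):
		*insn++ = BPF_LDX_MEM(BPF_B, si->dst_reg, si->src_reg,
				      offsetof(struct bpf_sock_ops_kern,
					       is_fullsock));
		*insn++ = BPF_JMP_IMM(BPF_JNE, si->dst_reg, 0, 2);
		*insn++ = BPF_MOV64_IMM(si->dst_reg, -ENOENT);
		*insn++ = BPF_JMP_A(2);		/* skip the two loads below */
		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern,
						       sk),
				      si->dst_reg, si->src_reg,
				      offsetof(struct bpf_sock_ops_kern, sk));
		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
				      offsetof(struct sock, sk_priority));
		break;

Since the success path is a 32-bit load, the upper register bits are zero,
so a program can test if ((__s64)skops->priority < 0) to detect the
non-fullsock case.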
