From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <netdev-owner@vger.kernel.org>
Received: from www62.your-server.de ([213.133.104.62]:43974 "EHLO
        www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751414AbeCOUco (ORCPT
        <rfc822;netdev@vger.kernel.org>); Thu, 15 Mar 2018 16:32:44 -0400
Subject: Re: [bpf-next PATCH v2 06/18] bpf: sockmap, add bpf_msg_apply_bytes()
 helper
To: John Fastabend <john.fastabend@gmail.com>, davem@davemloft.net,
        ast@kernel.org, davejwatson@fb.com
Cc: netdev@vger.kernel.org
References: <20180312192034.8039.70022.stgit@john-Precision-Tower-5810>
 <20180312192334.8039.98220.stgit@john-Precision-Tower-5810>
From: Daniel Borkmann <daniel@iogearbox.net>
Message-ID: <0ae5f4fd-2916-01e2-dba1-bd2505a1677c@iogearbox.net>
Date: Thu, 15 Mar 2018 21:32:35 +0100
MIME-Version: 1.0
In-Reply-To: <20180312192334.8039.98220.stgit@john-Precision-Tower-5810>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 03/12/2018 08:23 PM, John Fastabend wrote:
> A single sendmsg or sendfile system call can contain multiple logical
> messages that a BPF program may want to read and apply a verdict. But,
> without an apply_bytes helper any verdict on the data applies to all
> bytes in the sendmsg/sendfile. Alternatively, a BPF program may only
> care to read the first N bytes of a msg. If the payload is large say
> MB or even GB setting up and calling the BPF program repeatedly for
> all bytes, even though the verdict is already known, creates
> unnecessary overhead.
> 
> To allow BPF programs to control how many bytes a given verdict
> applies to we implement a bpf_msg_apply_bytes() helper. When called
> from within a BPF program this sets a counter, internal to the
> BPF infrastructure, that applies the last verdict to the next N
> bytes. If the N is smaller than the current data being processed
> from a sendmsg/sendfile call, the first N bytes will be sent and
> the BPF program will be re-run with start_data pointing to the N+1
> byte. If N is larger than the current data being processed the
> BPF verdict will be applied to multiple sendmsg/sendfile calls
> until N bytes are consumed.
> 
> Note1 if a socket closes with apply_bytes counter non-zero this
> is not a problem because data is not being buffered for N bytes
> and is sent as its received.
> 
> Note2 if this is operating in the sendpage context the data
> pointers may be zeroed after this call if the apply walks beyond
> a msg_pull_data() call specified data range. (helper implemented
> shortly in this series).
> 
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  include/uapi/linux/bpf.h |    3 ++-
>  net/core/filter.c        |   16 ++++++++++++++++
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index b8275f0..e50c61f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -769,7 +769,8 @@ enum bpf_attach_type {
>  	FN(getsockopt),			\
>  	FN(override_return),		\
>  	FN(sock_ops_cb_flags_set),	\
> -	FN(msg_redirect_map),
> +	FN(msg_redirect_map),		\
> +	FN(msg_apply_bytes),
>  
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 314c311..df2a8f4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1928,6 +1928,20 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>  	.arg4_type      = ARG_ANYTHING,
>  };
>  
> +BPF_CALL_2(bpf_msg_apply_bytes, struct sk_msg_buff *, msg, u64, bytes)
> +{
> +	msg->apply_bytes = bytes;

Here in bpf_msg_apply_bytes() but also in bpf_msg_cork_bytes() the signature
is u64, but in struct sk_msg_buff and struct smap_psock it's type int, so
user provided u64 will make these negative. Is there a reason to have this
allow a negative value and not being of type u32 everywhere?

> +	return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_msg_apply_bytes_proto = {
> +	.func           = bpf_msg_apply_bytes,
> +	.gpl_only       = false,
> +	.ret_type       = RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_CTX,
> +	.arg2_type      = ARG_ANYTHING,
> +};
> +
>  BPF_CALL_1(bpf_get_cgroup_classid, const struct sk_buff *, skb)
>  {
>  	return task_get_classid(skb);
> @@ -3634,6 +3648,8 @@ static const struct bpf_func_proto *sk_msg_func_proto(enum bpf_func_id func_id)
>  	switch (func_id) {
>  	case BPF_FUNC_msg_redirect_map:
>  		return &bpf_msg_redirect_map_proto;
> +	case BPF_FUNC_msg_apply_bytes:
> +		return &bpf_msg_apply_bytes_proto;
>  	default:
>  		return bpf_base_func_proto(func_id);
>  	}
>