From: Eric Dumazet <eric.dumazet@gmail.com>
To: Martin Lau <kafai@fb.com>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: Lawrence Brakmo <brakmo@fb.com>, netdev <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 2/9] bpf: Add bpf helper bpf_tcp_enter_cwr
Date: Sun, 24 Feb 2019 10:00:16 -0800
Message-ID: <254ae6a8-13d8-101f-45a3-18a1cbe6dea6@gmail.com>
In-Reply-To: <20190224030845.imwjbkoaxipuzb75@kafai-mbp.dhcp.thefacebook.com>

On 02/23/2019 07:08 PM, Martin Lau wrote:
> On Sat, Feb 23, 2019 at 05:32:14PM -0800, Eric Dumazet wrote:
>>
>>
>> On 02/22/2019 05:06 PM, brakmo wrote:
>>> From: Martin KaFai Lau <kafai@fb.com>
>>>
>>> This patch adds a new bpf helper BPF_FUNC_tcp_enter_cwr
>>> "int bpf_tcp_enter_cwr(struct bpf_tcp_sock *tp)".
>>> It is added to BPF_PROG_TYPE_CGROUP_SKB which can be attached
>>> to the egress path where the bpf prog is called by
>>> ip_finish_output() or ip6_finish_output(). The verifier
>>> ensures that the parameter is a tcp_sock.
>>>
>>> This helper makes a tcp_sock enter CWR state. It can be used
>>> by a bpf_prog to manage egress network bandwidth limit per
>>> cgroupv2. A later patch will have a sample program to
>>> show how it can be used to limit bandwidth usage per cgroupv2.
>>>
>>> To ensure it is only called from BPF_CGROUP_INET_EGRESS, the
>>> attr->expected_attach_type must be specified as BPF_CGROUP_INET_EGRESS
>>> during load time if the prog uses this new helper.
>>> The newly added prog->enforce_expected_attach_type bit will also be set
>>> if this new helper is used. This bit exists for backward compatibility,
>>> because prog->expected_attach_type has so far been ignored for
>>> BPF_PROG_TYPE_CGROUP_SKB. During attach time,
>>> prog->expected_attach_type is only enforced if the
>>> prog->enforce_expected_attach_type bit is set.
>>> i.e. prog->expected_attach_type is only enforced if this new helper
>>> is used by the prog.
>>>
>>
>> BTW, it seems to me that BPF_CGROUP_INET_EGRESS can be used while the socket lock is not held.
> Thanks for pointing it out.
>
> I see. I just noticed the comment at ip6_xmit():
> /*
>  * xmit an sk_buff (used by TCP, SCTP and DCCP)
>  * Note : socket lock is not held for SYNACK packets, but might be modified
>  * by calls to skb_set_owner_w() and ipv6_local_error(),
>  * which are using proper atomic operations or spinlocks.
>  */
> Are there other cases besides SYNACK?

Well, I was referring to the various virtual devices that re-enter the IP stack.

Since we can have a qdisc on any netdev, there is no way we can guarantee that the socket is
locked by the current thread.

Random example:

ipvlan_process_v4_outbound()
	...
	err = ip_local_out(net, skb->sk, skb);
	...