From: Matthieu Baerts <matthieu.baerts@tessares.net>
To: Stanislav Fomichev <sdf@google.com>
Cc: Paul Moore <paul@paul-moore.com>,
Geliang Tang <geliang.tang@suse.com>,
Alexei Starovoitov <ast@kernel.org>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
mptcp@lists.linux.dev, apparmor@lists.ubuntu.com,
linux-security-module@vger.kernel.org, selinux@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [RFC bpf-next v5] bpf: Force to MPTCP
Date: Mon, 31 Jul 2023 15:56:40 +0200
Message-ID: <b1f63c3f-5a26-4c69-af8f-216eaac19b69@tessares.net>
In-Reply-To: <ZMPyCt2uozns776Q@google.com>

Hi Stanislav,

On 28/07/2023 18:51, Stanislav Fomichev wrote:
> On 07/28, Matthieu Baerts wrote:
>> Hi Stanislav,
>>
>> On 27/07/2023 20:01, Stanislav Fomichev wrote:
>>> On 07/27, Matthieu Baerts wrote:
>>>> Hi Paul, Stanislav,
>>>>
>>>> On 18/07/2023 18:14, Paul Moore wrote:
>>>>> On Tue, Jul 18, 2023 at 11:21 AM Geliang Tang <geliang.tang@suse.com> wrote:
>>>>>>
>>>>>> As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
>>>>>>
>>>>>> "Your app can create sockets with IPPROTO_MPTCP as the proto:
>>>>>> ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
>>>>>> forced to create and use MPTCP sockets instead of TCP ones via the
>>>>>> mptcpize command bundled with the mptcpd daemon."
>>>>>>
>>>>>> But the mptcpize (LD_PRELOAD technique) command has some limitations
>>>>>> [2]:
>>>>>>
>>>>>> - it doesn't work if the application is not using libc (e.g. GoLang
>>>>>> apps)
>>>>>> - in some envs, it might not be easy to set env vars / change the way
>>>>>> apps are launched, e.g. on Android
>>>>>> - mptcpize needs to be launched with all apps that want MPTCP: we could
>>>>>> have more control from BPF to enable MPTCP only for some apps or all the
>>>>>> ones of a netns or a cgroup, etc.
>>>>>> - it is not in BPF, we cannot talk about it at netdev conf.
>>>>>>
>>>>>> So this patchset attempts to use BPF to implement functions similar to
>>>>>> mptcpize.
>>>>>>
>>>>>> The main idea is to add a hook in sys_socket() to change the protocol id
>>>>>> from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/multipath-tcp/mptcp_net-next/wiki
>>>>>> [2]
>>>>>> https://github.com/multipath-tcp/mptcp_net-next/issues/79
>>>>>>
>>>>>> v5:
>>>>>> - add bpf_mptcpify helper.
>>>>>>
>>>>>> v4:
>>>>>> - use lsm_cgroup/socket_create
>>>>>>
>>>>>> v3:
>>>>>> - patch 8: char cmd[128]; -> char cmd[256];
>>>>>>
>>>>>> v2:
>>>>>> - Fix build selftests errors reported by CI
>>>>>>
>>>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
>>>>>> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
>>>>>> ---
>>>>>> include/linux/bpf.h | 1 +
>>>>>> include/linux/lsm_hook_defs.h | 2 +-
>>>>>> include/linux/security.h | 6 +-
>>>>>> include/uapi/linux/bpf.h | 7 +
>>>>>> kernel/bpf/bpf_lsm.c | 2 +
>>>>>> net/mptcp/bpf.c | 20 +++
>>>>>> net/socket.c | 4 +-
>>>>>> security/apparmor/lsm.c | 8 +-
>>>>>> security/security.c | 2 +-
>>>>>> security/selinux/hooks.c | 6 +-
>>>>>> tools/include/uapi/linux/bpf.h | 7 +
>>>>>> .../testing/selftests/bpf/prog_tests/mptcp.c | 128 ++++++++++++++++--
>>>>>> tools/testing/selftests/bpf/progs/mptcpify.c | 17 +++
>>>>>> 13 files changed, 187 insertions(+), 23 deletions(-)
>>>>>> create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
>>>>>
>>>>> ...
>>>>>
>>>>>> diff --git a/security/security.c b/security/security.c
>>>>>> index b720424ca37d..bbebcddce420 100644
>>>>>> --- a/security/security.c
>>>>>> +++ b/security/security.c
>>>>>> @@ -4078,7 +4078,7 @@ EXPORT_SYMBOL(security_unix_may_send);
>>>>>> *
>>>>>> * Return: Returns 0 if permission is granted.
>>>>>> */
>>>>>> -int security_socket_create(int family, int type, int protocol, int kern)
>>>>>> +int security_socket_create(int *family, int *type, int *protocol, int kern)
>>>>>> {
>>>>>> return call_int_hook(socket_create, 0, family, type, protocol, kern);
>>>>>> }
>>>>>
>>>>> Using the LSM to change the protocol family is not something we want
>>>>> to allow. I'm sorry, but you will need to take a different approach.
>>>>
>>>> @Paul: Thank you for your feedback. It makes sense and I understand.
>>>>
>>>> @Stanislav: Despite the fact the implementation was smaller and reusing
>>>> more code, it looks like we cannot go in the direction you suggested. Do
>>>> you think what Geliang suggested before in his v3 [1] can be accepted?
>>>>
>>>> (Note that the v3 is the same as the v1, only some fixes in the selftests.)
>>>
>>> We have too many hooks in networking, so something that doesn't add
>>> a new one is preferable :-(
>>
>> Thank you for your reply and the explanation, I understand.
>>
>>> Moreover, we already have a 'socket init' hook, but it runs a bit late.
>>
>> Indeed. And we cannot move it before the creation of the socket.
>>
>>> Is existing cgroup/sock completely unworkable? Is it possible to
>>> expose some new bpf_upgrade_socket_to(IPPROTO_MPTCP) kfunc which would
>>> call some new net_proto_family->upgrade_to(IPPROTO_MPTCP) to do the surgery?
>>> Or is it too hacky?
>>
>> I cannot judge if it is too hacky or not but if you think it would be
>> OK, please tell us :)
>
> Maybe try and see how it goes? Doing the surgery to convert from tcp
> to mptcp is probably hard, but it seems that we should be able to
> do something like:
>
> int upgrade_to(sock, sk) {
> if (sk is not a tcp one) return -EINVAL;
>
> sk_common_release(sk);
> return inet6_create(net, sock, IPPROTO_MPTCP, false);
> }
>
> ?
>
> The only thing I'm not sure about is whether you can call inet6_create
> on a socket that has been sk_common_release'd...

Oh sorry, I now understand your suggestion better, and why it is hacky.
Good workaround: we can keep it in mind if there are no other solutions
that avoid these create-release-create operations.

>>> Another option Alexei suggested is to add some fentry-like thing:
>>>
>>> noinline int update_socket_protocol(int protocol)
>>> {
>>> return protocol;
>>> }
>>> /* TODO: ^^^ add the above to mod_ret set */
>>>
>>> int __sys_socket(int family, int type, int protocol)
>>> {
>>> ...
>>>
>>> protocol = update_socket_protocol(protocol);
>>>
>>> ...
>>> }
>>>
>>> But it's also too problem specific it seems? And it's not cgroup-aware.
>>
>> It looks like it is what Geliang did in his v6. If it is the only
>> acceptable solution, I guess we can do without cgroup support. We can
>> continue the discussions in his v6 if that's easier.
>
> Ack, that works too, let's see how other people feel about it. I'm
> assuming in the bpf program we can always do bpf_get_current_cgroup_id()
> to filter by cgroup.

Good point, that works too and looks sufficient!

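To make the idea concrete, here is a minimal userspace sketch of the
decision such a program could take. It is only a model, not a BPF program:
the function name mptcpify_protocol() and the target_cgroup parameter are
made up for illustration, and passing the family and type next to the
protocol is an assumption (the update_socket_protocol() sketch quoted above
only takes the protocol). In BPF, current_cgroup would be the value returned
by bpf_get_current_cgroup_id().

```c
#include <stdint.h>
#include <netinet/in.h>   /* IPPROTO_TCP, and IPPROTO_MPTCP on recent libcs */
#include <sys/socket.h>   /* AF_INET, AF_INET6, SOCK_STREAM */

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* value from the Linux UAPI headers */
#endif

/*
 * Hypothetical userspace model of the decision only; not a BPF program.
 * 'current_cgroup' stands for what bpf_get_current_cgroup_id() would
 * return, 'target_cgroup' for an ID provided by whoever loads the program
 * (assumption: e.g. via a map or a global variable).
 */
static int mptcpify_protocol(int family, int type, int protocol,
			     uint64_t current_cgroup, uint64_t target_cgroup)
{
	/* Leave tasks outside the selected cgroup untouched. */
	if (current_cgroup != target_cgroup)
		return protocol;

	/* Only plain TCP stream sockets are candidates for MPTCP. */
	if ((family == AF_INET || family == AF_INET6) &&
	    type == SOCK_STREAM &&
	    (protocol == 0 || protocol == IPPROTO_TCP))
		return IPPROTO_MPTCP;

	/* Everything else keeps the protocol requested by the app. */
	return protocol;
}
```

With this shape, a TCP socket requested from the selected cgroup gets
IPPROTO_MPTCP while UDP sockets and other cgroups are left as they are,
which seems to cover what we wanted from the cgroup hook.
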
Cheers,
Matt
--
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net