All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-18  9:57 Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2019-07-18  9:57 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 4678 bytes --]

Hi Ossama,

On 16/07/2019 19:45, Othman, Ossama wrote:
> Hi Matthieu,
> 
> On Tue, Jul 16, 2019 at 8:51 AM Matthieu Baerts
> <matthieu.baerts(a)tessares.net> wrote:
>> On 15/07/2019 22:18, Othman, Ossama wrote:
>>> My point was that it also introduces a way for compromised privileged
>>> processes to potentially (and possibly inadvertently) disable MPTCP at
>>> run-time without involving system administrators.  Admittedly, there
>>> doesn't appear to be much reason for such compromised processes to do
>>> that.   At the very least, I think it might be useful to verify that
>>> the process in question has the CAP_NET_ADMIN capability when
>>> modifying this variable.
>>
>> With the current version, I think that this compromised privileged
>> process would need the root/full permissions to modify this new sysctl.
>> It would need more than one capability I think, no?
>> If it is "root", it can then also block MPTCP with NF/IPTables rules,
>> remove files, burn the device, become the president of the US, etc.
>> Maybe a process with CAP_NET_ADMIN can already block MPTCP sockets.
>>
>> Maybe I am wrong but I thought that using capabilities was a way to give
>> extra permissions to a process to do a bit more (e.g. to create a RAW
>> socket) without requiring the full/root rights. To change most sysctl
>> (except the ones that check for some specific capabilities), you need
>> the full rights, no?
> 
> I think it's more accurate to say that the effective user ID needs to
> be root in those cases, regardless of which capabilities are granted
> or removed.  Since capabilities are not checked in the kernel for
> those sysctl variables, a root process without full capabilities could
> conceivably alter them, at least as I understand it.  Please correct
> me if I am wrong.

Thank you for this note!
From what I see in the code, the check for the permissions is done in 2
steps from sysctl_perm()[1]:
- first it gets the mode, that's done in net_ctl_permissions()[2]
because we are in the net root table. So:
  - either the caller has CAP_NET_ADMIN capability and it will get the
same access as root: if the root user can read and write, the user will
be able to do the same.
  - or we take the mode specified in the table, e.g. 644 here, where
only root (id=0) can write.
- then it tests which mode is valid for the current user, see test_perm()[3]

[1]
https://elixir.bootlin.com/linux/latest/source/fs/proc/proc_sysctl.c#L446
[2] https://elixir.bootlin.com/linux/latest/source/net/sysctl_net.c#L42
[3]
https://elixir.bootlin.com/linux/latest/source/fs/proc/proc_sysctl.c#L435

So from what I see, you can modify the value of this new sysctl if you
have CAP_NET_ADMIN capability or you are the root user.

So it seems OK, no? Or do you want that only root with net admin
capability can do that? Not one or the other like most of the "net" sysctl?
Or did I miss something? :)

> Yes, capabilities can be used to grant specific privileges to a
> non-root process.  However, capabilities can also be used to restrict
> what a root process can do.  This is evident in "pscap" command
> output:
> 
> $ pscap
> ppid  pid   name        command           capabilities
> 1     220   root        systemd-journal   chown, dac_override,
> dac_read_search, fowner, setgid, setuid, sys_ptrace, sys_admin,
> audit_control, mac_override, syslog, audit_read
> 1     261   root        systemd-udevd     full
> 1     513   root        haveged           sys_admin
> 1     521   mptcp       mptcpd            net_admin
> 1     532   root        accounts-daemon   full
> ...
> 
> Only some of the root processes listed above have "full" capabilties.

That's interesting!
From what I see, systemd-journal might still be able to modify the "net"
sysctls then, no?
Maybe that's not seen as a security issue?

>> And if you have the full rights, you have the
>> CAP_NET_ADMIN capability, no?
> 
> Yes.  You can't do anything about processes with full capabilities but
> you can restrict sysctl variable write access to root processes that
> do not have the desired capabilities.

OK I see.
So now the question is to know if we can trust processes with
CAP_NET_ADMIN capability, right? :)

>> Is it possible to have "root" processes without CAP_NET_ADMIN
>> capability? Or maybe somehow with SELinux?
> 
> Sure.  See above.  :)

Thank you for the review!

Cheers,
Matt

> 
> HTH,
> -Ossama
> 

-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-31 16:04 Peter Krystad
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Krystad @ 2019-07-31 16:04 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 10065 bytes --]


Hi again Matthieu!

On Thu, 2019-07-18 at 08:37 +0200, Matthieu Baerts wrote:
> Hi Peter,
> 
> Thank you for your reply!
> 
> On 18/07/2019 00:42, Peter Krystad wrote:
> > On Wed, 2019-07-17 at 23:08 +0200, Matthieu Baerts wrote:
> > > Hi Peter,
> > > 
> > > On 17/07/2019 01:12, Peter Krystad wrote:
> > > > Thanks for this patch Matthieu, I have a few tardy comments.
> > > 
> > > Thank you for your review!
> > > 
> > > > As a general comment another approach would be to use this enabled flag to
> > > > control whether MP_CAPABLE options are sent when establishing/accepting an
> > > > MPTCP connection. You can see my "/* @@ if MTPCP enabled */" comments where
> > > > subflow->request_mptcp is being set in protocol.c. This would be the similiar
> > > > behaviour to the mptcp.org version, you just get a (inefficient) TCP socket
> > > > when requesting MPTCP. You get the benefit of not surprising applications
> > > > coded to use MPTCP at the drawback of some security exposure, as the new MPTCP
> > > > fallback layer would still be used. But this is a pretty thin layer.
> > > 
> > > That's an interesting point of view, I didn't have the same one when
> > > creating this patch. For me and compared to the mptcp.org
> > > implementation, an application that explicitly asks for IPPROTO_MPTCP
> > > wants to have MPTCP. It should be ready to face the possibility that
> > > MPTCP is not available or not supported by the kernel. Note that we
> > > should certainly add a way for an app to know if a fallback to TCP has
> > > been done but that's another topic.
> > > 
> > > My point is that if MPTCP is disabled, it seems interesting to me to
> > > directly tell that to the app and then not transparency fallback to TCP.
> > > 
> > > But if we think that it might be interesting to transparently fallback
> > > to TCP, we can do that. We can also use several values (1, 2, ...)
> > > linked to different behaviours.
> > 
> > The code we have so far transparently falls back to TCP if the other end
> > does not support/rejects MPTCP connection, so we should definitely discuss
> > which approach we want to support. We will be removing code if we go as
> > you suggest.
> 
> I think it is still important to transparently fall back to TCP if the
> other end or any other device on the network does not support or does
> not allow MPTCP. Because we don't necessarily want to stop and
> re-establish a new TCP connection once a fall back to TCP is done.

OK I think I may have mis-understood, we can stay as is.

> 
> But here, nothing has been sent to the wire, we can tell the app before
> that MPTCP will not be used. And we can even tell it that it is due to a
> "configuration issue".
> 
> If we do a transparent fall back to TCP, that could be nice to notify
> the app. POLLPRI? sockopt?

That could be a nice feature. Patches accepted. :)

> 
> > > > See my comments to this specific patch below.
> > > > 
> > > > On Thu, 2019-07-11 at 21:00 +0200, Matthieu Baerts wrote:
> > > > > New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled
> > > > > for the current net namespace.
> > > > > 
> > > > > For security reasons, it is interesting to have a global switch for
> > > > > MPTCP. To start, MPTCP will be disabled by default and only privileged
> > > > > users will be able to modify this. The reason is that because MPTCP is
> > > > > new, it will not be tested and reviewed by many and security issues can
> > > > > then take time to be discovered and fixed.
> > > > > 
> > > > > The value of this new sysctl can be different per namespace. We can then
> > > > > restrict the usage of MPTCP to the selected NS. In case of serious
> > > > > issues with MPTCP, administrators can now easily turn MPTCP off.
> > > > > 
> > > > > MPTCP' kselftest has been modified to validate that the correct error is
> > > > > reported when creating a socket while MPTCP support is disabled.
> > > > > 
> > > > > Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
> > > > > ---
> > > > > 
> > > > > Notes:
> > > > >     As just discussed at the meeting we just had, here is a RFC patch to
> > > > >     add a new sysctl for MPTCP.
> > > > >     
> > > > >     Because it is not linked to the protocol itself, I simply created a new
> > > > >     file, ctrl.c like in mptcp.org.
> > > > >     
> > > > >     A few questions:
> > > > >     - Is it OK to reserve space per ns via the "pernet_operations"
> > > > >       structure? Because MPTCP would not be compiled as a module, we could
> > > > >       directly store stuff in the net structure as other parts of the code
> > > > >       do but maybe better to keep MPTCP code on the side as done here.
> > > > >     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
> > > > >       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
> > > > >       same names if possible or better to differentiate? In this version,
> > > > >       'mptcp_' is not prepended.
> > > > 
> > > > Differentiate, as you have done.
> > > 
> > > Good thank you!
> > > 
> > > > >     - This sysctl will only block new sockets to be created. Is it enough?
> > > > >     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
> > > > >       maybe too generic? ENOTSUPP is not translated by perror().
> > > > >     - Should we start the documentation now for the sysctl?
> > > > >     - A simple test has been added, because it is not linked to the rest, I
> > > > >       put separeted as first test.
> > > > >     - Of course do not hesitate to comment. Even on the idea of having a
> > > > >       sysctl for this purpose.
> > > > > 
> > > > >  net/mptcp/Makefile                            |   2 +-
> > > > >  net/mptcp/ctrl.c                              | 109 ++++++++++++++++++
> > > > >  net/mptcp/protocol.c                          |  12 +-
> > > > >  net/mptcp/protocol.h                          |   4 +
> > > > >  .../selftests/net/mptcp/mptcp_connect.sh      |  24 ++++
> > > > >  5 files changed, 148 insertions(+), 3 deletions(-)
> > > > >  create mode 100644 net/mptcp/ctrl.c
> > > > > 
> > > > > diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile
> > > > > index 7fe7aa64eda0..289fdf4339c1 100644
> > > > > --- a/net/mptcp/Makefile
> > > > > +++ b/net/mptcp/Makefile
> > > > > @@ -1,4 +1,4 @@
> > > > >  # SPDX-License-Identifier: GPL-2.0
> > > > >  obj-$(CONFIG_MPTCP) += mptcp.o
> > > > >  
> > > > > -mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o
> > > > > +mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o ctrl.o
> > > > > diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> > > > 
> > > > The prefix 'ctl' is used in this file a bunch, so maybe ctl.c is a better
> > > > name, or even 'sysctl.c'. That it is named this is mptcp.org version isn't
> > > > much of a reason IMHO. :)
> > > 
> > > My goal was to use this new file for everything not directly linked to
> > > the protocol but to change different behaviours: sysctl, sockopts,
> > > selection of the PMs or the packet schedulers, etc.
> > > 
> > > What do you think? But we can also create different files if you think
> > > that would be better.

PMs or packet schedulers (if there is a choice at all) would be selected by
sysctl value, and MPTCP sockopt handling would go in protocol.c, so I would
still suggest the different name.

> > > 
> > > > > new file mode 100644
> > > > > index 000000000000..4c9a6a2cfeb3
> > > > > --- /dev/null
> > > > > +++ b/net/mptcp/ctrl.c
> > > 
> > > (...)
> > > 
> > > > > +
> > > > > +static struct pernet_operations mptcp_pernet_ops = {
> > > > > +	.init = mptcp_net_init,
> > > > > +	.exit = mptcp_net_exit,
> > > > > +	.id = &mptcp_pernet_id,
> > > > > +	.size = sizeof(struct mptcp_pernet),
> > > > > +};
> > > > > +
> > > > > +void __init mptcp_init(void)
> > > > 
> > > > I would prefer that the mptcp_init() in protocol.c remain the top-level init
> > > > routine and that it call a sub-level routine in this file the same as other
> > > > files in "module". When TCP is calling mptcp_init() it is registering
> > > > protocols, so the protocol routine should be at the top I think.
> > > 
> > > Sure I can do that.
> > > 
> > > I created this new file to be the central point to control the different
> > > parts in MPTCP (sysctl, sockopt, PM selection, etc.) including the
> > > protocol. But if we think it is better to init the protocol part, I am
> > > fine with that.
> > 
> > I'm not too strong on this either way, what do others think?

After considering this more I'm fine with the init organization you have in
this patchset.

Peter.

> > 
> > > > > +{
> > > > > +	mptcp_proto_init();
> > > > > +
> > > > > +	if (register_pernet_subsys(&mptcp_pernet_ops) < 0)
> > > > > +		panic("Failed to register MPTCP pernet subsystem.\n");
> > > > > +}
> > > > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > > > > index 774ed25d3b6d..18399cb63f35 100644
> > > > > --- a/net/mptcp/protocol.c
> > > > > +++ b/net/mptcp/protocol.c
> > > > > @@ -559,6 +559,14 @@ static int mptcp_init_sock(struct sock *sk)
> > > > >  	return 0;
> > > > >  }
> > > > >  
> > > > > +static int mptcp_init_sock_cb(struct sock *sk)
> > > > > +{
> > > > > +	if (!mptcp_is_enabled(sock_net(sk)))
> > > > > +		return -ENOPROTOOPT;
> > > > 
> > > > Can we just add this check to mptcp_init_sock() and not add the extra routine?
> > > 
> > > I created a new routine because mptcp_init_sock() is also called from
> > > mptcp_accept(). The (initial) goal of this sysctl is not to allow
> > > creation of new socket and not alter existing ones. We can also change
> > > the definition if needed.
> > 
> > OK. I would still prefer the name of the callback in 'struct proto' to be
> > mptcp_init_sock, and call a "__mptcp_init_sock" from mptcp_accept.
> 
> Sure I can do that!
> 
> Cheers,
> Matt


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-18  6:37 Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2019-07-18  6:37 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 8941 bytes --]

Hi Peter,

Thank you for your reply!

On 18/07/2019 00:42, Peter Krystad wrote:
> On Wed, 2019-07-17 at 23:08 +0200, Matthieu Baerts wrote:
>> Hi Peter,
>>
>> On 17/07/2019 01:12, Peter Krystad wrote:
>>> Thanks for this patch Matthieu, I have a few tardy comments.
>>
>> Thank you for your review!
>>
>>> As a general comment another approach would be to use this enabled flag to
>>> control whether MP_CAPABLE options are sent when establishing/accepting an
>>> MPTCP connection. You can see my "/* @@ if MTPCP enabled */" comments where
>>> subflow->request_mptcp is being set in protocol.c. This would be the similiar
>>> behaviour to the mptcp.org version, you just get a (inefficient) TCP socket
>>> when requesting MPTCP. You get the benefit of not surprising applications
>>> coded to use MPTCP at the drawback of some security exposure, as the new MPTCP
>>> fallback layer would still be used. But this is a pretty thin layer.
>>
>> That's an interesting point of view, I didn't have the same one when
>> creating this patch. For me and compared to the mptcp.org
>> implementation, an application that explicitly asks for IPPROTO_MPTCP
>> wants to have MPTCP. It should be ready to face the possibility that
>> MPTCP is not available or not supported by the kernel. Note that we
>> should certainly add a way for an app to know if a fallback to TCP has
>> been done but that's another topic.
>>
>> My point is that if MPTCP is disabled, it seems interesting to me to
>> directly tell that to the app and then not transparency fallback to TCP.
>>
>> But if we think that it might be interesting to transparently fallback
>> to TCP, we can do that. We can also use several values (1, 2, ...)
>> linked to different behaviours.
> 
> The code we have so far transparently falls back to TCP if the other end
> does not support/rejects MPTCP connection, so we should definitely discuss
> which approach we want to support. We will be removing code if we go as
> you suggest.

I think it is still important to transparently fall back to TCP if the
other end or any other device on the network does not support or does
not allow MPTCP. Because we don't necessarily want to stop and
re-establish a new TCP connection once a fall back to TCP is done.

But here, nothing has been sent to the wire, we can tell the app before
that MPTCP will not be used. And we can even tell it that it is due to a
"configuration issue".

If we do a transparent fall back to TCP, that could be nice to notify
the app. POLLPRI? sockopt?

>>
>>> See my comments to this specific patch below.
>>>
>>> On Thu, 2019-07-11 at 21:00 +0200, Matthieu Baerts wrote:
>>>> New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled
>>>> for the current net namespace.
>>>>
>>>> For security reasons, it is interesting to have a global switch for
>>>> MPTCP. To start, MPTCP will be disabled by default and only privileged
>>>> users will be able to modify this. The reason is that because MPTCP is
>>>> new, it will not be tested and reviewed by many and security issues can
>>>> then take time to be discovered and fixed.
>>>>
>>>> The value of this new sysctl can be different per namespace. We can then
>>>> restrict the usage of MPTCP to the selected NS. In case of serious
>>>> issues with MPTCP, administrators can now easily turn MPTCP off.
>>>>
>>>> MPTCP' kselftest has been modified to validate that the correct error is
>>>> reported when creating a socket while MPTCP support is disabled.
>>>>
>>>> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
>>>> ---
>>>>
>>>> Notes:
>>>>     As just discussed at the meeting we just had, here is a RFC patch to
>>>>     add a new sysctl for MPTCP.
>>>>     
>>>>     Because it is not linked to the protocol itself, I simply created a new
>>>>     file, ctrl.c like in mptcp.org.
>>>>     
>>>>     A few questions:
>>>>     - Is it OK to reserve space per ns via the "pernet_operations"
>>>>       structure? Because MPTCP would not be compiled as a module, we could
>>>>       directly store stuff in the net structure as other parts of the code
>>>>       do but maybe better to keep MPTCP code on the side as done here.
>>>>     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
>>>>       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
>>>>       same names if possible or better to differentiate? In this version,
>>>>       'mptcp_' is not prepended.
>>>
>>> Differentiate, as you have done.
>>
>> Good thank you!
>>
>>>>     - This sysctl will only block new sockets to be created. Is it enough?
>>>>     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
>>>>       maybe too generic? ENOTSUPP is not translated by perror().
>>>>     - Should we start the documentation now for the sysctl?
>>>>     - A simple test has been added, because it is not linked to the rest, I
>>>>       put separeted as first test.
>>>>     - Of course do not hesitate to comment. Even on the idea of having a
>>>>       sysctl for this purpose.
>>>>
>>>>  net/mptcp/Makefile                            |   2 +-
>>>>  net/mptcp/ctrl.c                              | 109 ++++++++++++++++++
>>>>  net/mptcp/protocol.c                          |  12 +-
>>>>  net/mptcp/protocol.h                          |   4 +
>>>>  .../selftests/net/mptcp/mptcp_connect.sh      |  24 ++++
>>>>  5 files changed, 148 insertions(+), 3 deletions(-)
>>>>  create mode 100644 net/mptcp/ctrl.c
>>>>
>>>> diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile
>>>> index 7fe7aa64eda0..289fdf4339c1 100644
>>>> --- a/net/mptcp/Makefile
>>>> +++ b/net/mptcp/Makefile
>>>> @@ -1,4 +1,4 @@
>>>>  # SPDX-License-Identifier: GPL-2.0
>>>>  obj-$(CONFIG_MPTCP) += mptcp.o
>>>>  
>>>> -mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o
>>>> +mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o ctrl.o
>>>> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
>>>
>>> The prefix 'ctl' is used in this file a bunch, so maybe ctl.c is a better
>>> name, or even 'sysctl.c'. That it is named this is mptcp.org version isn't
>>> much of a reason IMHO. :)
>>
>> My goal was to use this new file for everything not directly linked to
>> the protocol but to change different behaviours: sysctl, sockopts,
>> selection of the PMs or the packet schedulers, etc.
>>
>> What do you think? But we can also create different files if you think
>> that would be better.
>>
>>>> new file mode 100644
>>>> index 000000000000..4c9a6a2cfeb3
>>>> --- /dev/null
>>>> +++ b/net/mptcp/ctrl.c
>>
>> (...)
>>
>>>> +
>>>> +static struct pernet_operations mptcp_pernet_ops = {
>>>> +	.init = mptcp_net_init,
>>>> +	.exit = mptcp_net_exit,
>>>> +	.id = &mptcp_pernet_id,
>>>> +	.size = sizeof(struct mptcp_pernet),
>>>> +};
>>>> +
>>>> +void __init mptcp_init(void)
>>>
>>> I would prefer that the mptcp_init() in protocol.c remain the top-level init
>>> routine and that it call a sub-level routine in this file the same as other
>>> files in "module". When TCP is calling mptcp_init() it is registering
>>> protocols, so the protocol routine should be at the top I think.
>>
>> Sure I can do that.
>>
>> I created this new file to be the central point to control the different
>> parts in MPTCP (sysctl, sockopt, PM selection, etc.) including the
>> protocol. But if we think it is better to init the protocol part, I am
>> fine with that.
> 
> I'm not too strong on this either way, what do others think?
> 
>>
>>>> +{
>>>> +	mptcp_proto_init();
>>>> +
>>>> +	if (register_pernet_subsys(&mptcp_pernet_ops) < 0)
>>>> +		panic("Failed to register MPTCP pernet subsystem.\n");
>>>> +}
>>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>>> index 774ed25d3b6d..18399cb63f35 100644
>>>> --- a/net/mptcp/protocol.c
>>>> +++ b/net/mptcp/protocol.c
>>>> @@ -559,6 +559,14 @@ static int mptcp_init_sock(struct sock *sk)
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +static int mptcp_init_sock_cb(struct sock *sk)
>>>> +{
>>>> +	if (!mptcp_is_enabled(sock_net(sk)))
>>>> +		return -ENOPROTOOPT;
>>>
>>> Can we just add this check to mptcp_init_sock() and not add the extra routine?
>>
>> I created a new routine because mptcp_init_sock() is also called from
>> mptcp_accept(). The (initial) goal of this sysctl is not to allow
>> creation of new socket and not alter existing ones. We can also change
>> the definition if needed.
> 
> OK. I would still prefer the name of the callback in 'struct proto' to be
> mptcp_init_sock, and call a "__mptcp_init_sock" from mptcp_accept.

Sure I can do that!

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-17 22:42 Peter Krystad
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Krystad @ 2019-07-17 22:42 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 8186 bytes --]

On Wed, 2019-07-17 at 23:08 +0200, Matthieu Baerts wrote:
> Hi Peter,
> 
> On 17/07/2019 01:12, Peter Krystad wrote:
> > Thanks for this patch Matthieu, I have a few tardy comments.
> 
> Thank you for your review!
> 
> > As a general comment another approach would be to use this enabled flag to
> > control whether MP_CAPABLE options are sent when establishing/accepting an
> > MPTCP connection. You can see my "/* @@ if MTPCP enabled */" comments where
> > subflow->request_mptcp is being set in protocol.c. This would be the similiar
> > behaviour to the mptcp.org version, you just get a (inefficient) TCP socket
> > when requesting MPTCP. You get the benefit of not surprising applications
> > coded to use MPTCP at the drawback of some security exposure, as the new MPTCP
> > fallback layer would still be used. But this is a pretty thin layer.
> 
> That's an interesting point of view, I didn't have the same one when
> creating this patch. For me and compared to the mptcp.org
> implementation, an application that explicitly asks for IPPROTO_MPTCP
> wants to have MPTCP. It should be ready to face the possibility that
> MPTCP is not available or not supported by the kernel. Note that we
> should certainly add a way for an app to know if a fallback to TCP has
> been done but that's another topic.
> 
> My point is that if MPTCP is disabled, it seems interesting to me to
> directly tell that to the app and then not transparency fallback to TCP.
> 
> But if we think that it might be interesting to transparently fallback
> to TCP, we can do that. We can also use several values (1, 2, ...)
> linked to different behaviours.

The code we have so far transparently falls back to TCP if the other end
does not support/rejects MPTCP connection, so we should definitely discuss
which approach we want to support. We will be removing code if we go as
you suggest.

> 
> > See my comments to this specific patch below.
> > 
> > On Thu, 2019-07-11 at 21:00 +0200, Matthieu Baerts wrote:
> > > New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled
> > > for the current net namespace.
> > > 
> > > For security reasons, it is interesting to have a global switch for
> > > MPTCP. To start, MPTCP will be disabled by default and only privileged
> > > users will be able to modify this. The reason is that because MPTCP is
> > > new, it will not be tested and reviewed by many and security issues can
> > > then take time to be discovered and fixed.
> > > 
> > > The value of this new sysctl can be different per namespace. We can then
> > > restrict the usage of MPTCP to the selected NS. In case of serious
> > > issues with MPTCP, administrators can now easily turn MPTCP off.
> > > 
> > > MPTCP' kselftest has been modified to validate that the correct error is
> > > reported when creating a socket while MPTCP support is disabled.
> > > 
> > > Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
> > > ---
> > > 
> > > Notes:
> > >     As just discussed at the meeting we just had, here is a RFC patch to
> > >     add a new sysctl for MPTCP.
> > >     
> > >     Because it is not linked to the protocol itself, I simply created a new
> > >     file, ctrl.c like in mptcp.org.
> > >     
> > >     A few questions:
> > >     - Is it OK to reserve space per ns via the "pernet_operations"
> > >       structure? Because MPTCP would not be compiled as a module, we could
> > >       directly store stuff in the net structure as other parts of the code
> > >       do but maybe better to keep MPTCP code on the side as done here.
> > >     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
> > >       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
> > >       same names if possible or better to differentiate? In this version,
> > >       'mptcp_' is not prepended.
> > 
> > Differentiate, as you have done.
> 
> Good thank you!
> 
> > >     - This sysctl will only block new sockets to be created. Is it enough?
> > >     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
> > >       maybe too generic? ENOTSUPP is not translated by perror().
> > >     - Should we start the documentation now for the sysctl?
> > >     - A simple test has been added, because it is not linked to the rest, I
> > >       put separeted as first test.
> > >     - Of course do not hesitate to comment. Even on the idea of having a
> > >       sysctl for this purpose.
> > > 
> > >  net/mptcp/Makefile                            |   2 +-
> > >  net/mptcp/ctrl.c                              | 109 ++++++++++++++++++
> > >  net/mptcp/protocol.c                          |  12 +-
> > >  net/mptcp/protocol.h                          |   4 +
> > >  .../selftests/net/mptcp/mptcp_connect.sh      |  24 ++++
> > >  5 files changed, 148 insertions(+), 3 deletions(-)
> > >  create mode 100644 net/mptcp/ctrl.c
> > > 
> > > diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile
> > > index 7fe7aa64eda0..289fdf4339c1 100644
> > > --- a/net/mptcp/Makefile
> > > +++ b/net/mptcp/Makefile
> > > @@ -1,4 +1,4 @@
> > >  # SPDX-License-Identifier: GPL-2.0
> > >  obj-$(CONFIG_MPTCP) += mptcp.o
> > >  
> > > -mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o
> > > +mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o ctrl.o
> > > diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> > 
> > The prefix 'ctl' is used in this file a bunch, so maybe ctl.c is a better
> > name, or even 'sysctl.c'. That it is named this is mptcp.org version isn't
> > much of a reason IMHO. :)
> 
> My goal was to use this new file for everything not directly linked to
> the protocol but to change different behaviours: sysctl, sockopts,
> selection of the PMs or the packet schedulers, etc.
> 
> What do you think? But we can also create different files if you think
> that would be better.
> 
> > > new file mode 100644
> > > index 000000000000..4c9a6a2cfeb3
> > > --- /dev/null
> > > +++ b/net/mptcp/ctrl.c
> 
> (...)
> 
> > > +
> > > +static struct pernet_operations mptcp_pernet_ops = {
> > > +	.init = mptcp_net_init,
> > > +	.exit = mptcp_net_exit,
> > > +	.id = &mptcp_pernet_id,
> > > +	.size = sizeof(struct mptcp_pernet),
> > > +};
> > > +
> > > +void __init mptcp_init(void)
> > 
> > I would prefer that the mptcp_init() in protocol.c remain the top-level init
> > routine and that it call a sub-level routine in this file the same as other
> > files in "module". When TCP is calling mptcp_init() it is registering
> > protocols, so the protocol routine should be at the top I think.
> 
> Sure I can do that.
> 
> I created this new file to be the central point to control the different
> parts in MPTCP (sysctl, sockopt, PM selection, etc.) including the
> protocol. But if we think it is better to init the protocol part, I am
> fine with that.

I'm not too strong on this either way, what do others think?

> 
> > > +{
> > > +	mptcp_proto_init();
> > > +
> > > +	if (register_pernet_subsys(&mptcp_pernet_ops) < 0)
> > > +		panic("Failed to register MPTCP pernet subsystem.\n");
> > > +}
> > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > > index 774ed25d3b6d..18399cb63f35 100644
> > > --- a/net/mptcp/protocol.c
> > > +++ b/net/mptcp/protocol.c
> > > @@ -559,6 +559,14 @@ static int mptcp_init_sock(struct sock *sk)
> > >  	return 0;
> > >  }
> > >  
> > > +static int mptcp_init_sock_cb(struct sock *sk)
> > > +{
> > > +	if (!mptcp_is_enabled(sock_net(sk)))
> > > +		return -ENOPROTOOPT;
> > 
> > Can we just add this check to mptcp_init_sock() and not add the extra routine?
> 
> I created a new routine because mptcp_init_sock() is also called from
> mptcp_accept(). The (initial) goal of this sysctl is not to allow
> creation of new socket and not alter existing ones. We can also change
> the definition if needed.

OK. I would still prefer the name of the callback in 'struct proto' to be
mptcp_init_sock, and call a "__mptcp_init_sock" from mptcp_accept.

Thanks,

Peter.


> 
> Cheers,
> Matt


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-17 21:08 Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2019-07-17 21:08 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 7393 bytes --]

Hi Peter,

On 17/07/2019 01:12, Peter Krystad wrote:
> 
> Thanks for this patch Matthieu, I have a few tardy comments.

Thank you for your review!

> As a general comment another approach would be to use this enabled flag to
> control whether MP_CAPABLE options are sent when establishing/accepting an
> MPTCP connection. You can see my "/* @@ if MTPCP enabled */" comments where
> subflow->request_mptcp is being set in protocol.c. This would be the similiar
> behaviour to the mptcp.org version, you just get a (inefficient) TCP socket
> when requesting MPTCP. You get the benefit of not surprising applications
> coded to use MPTCP at the drawback of some security exposure, as the new MPTCP
> fallback layer would still be used. But this is a pretty thin layer.

That's an interesting point of view, I didn't have the same one when
creating this patch. For me and compared to the mptcp.org
implementation, an application that explicitly asks for IPPROTO_MPTCP
wants to have MPTCP. It should be ready to face the possibility that
MPTCP is not available or not supported by the kernel. Note that we
should certainly add a way for an app to know if a fallback to TCP has
been done but that's another topic.

My point is that if MPTCP is disabled, it seems interesting to me to
directly tell that to the app and then not transparency fallback to TCP.

But if we think that it might be interesting to transparently fallback
to TCP, we can do that. We can also use several values (1, 2, ...)
linked to different behaviours.

> See my comments to this specific patch below.
> 
> On Thu, 2019-07-11 at 21:00 +0200, Matthieu Baerts wrote:
>> New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled
>> for the current net namespace.
>>
>> For security reasons, it is interesting to have a global switch for
>> MPTCP. To start, MPTCP will be disabled by default and only privileged
>> users will be able to modify this. The reason is that because MPTCP is
>> new, it will not be tested and reviewed by many and security issues can
>> then take time to be discovered and fixed.
>>
>> The value of this new sysctl can be different per namespace. We can then
>> restrict the usage of MPTCP to the selected NS. In case of serious
>> issues with MPTCP, administrators can now easily turn MPTCP off.
>>
>> MPTCP' kselftest has been modified to validate that the correct error is
>> reported when creating a socket while MPTCP support is disabled.
>>
>> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
>> ---
>>
>> Notes:
>>     As just discussed at the meeting we just had, here is a RFC patch to
>>     add a new sysctl for MPTCP.
>>     
>>     Because it is not linked to the protocol itself, I simply created a new
>>     file, ctrl.c like in mptcp.org.
>>     
>>     A few questions:
>>     - Is it OK to reserve space per ns via the "pernet_operations"
>>       structure? Because MPTCP would not be compiled as a module, we could
>>       directly store stuff in the net structure as other parts of the code
>>       do but maybe better to keep MPTCP code on the side as done here.
>>     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
>>       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
>>       same names if possible or better to differentiate? In this version,
>>       'mptcp_' is not prepended.
> 
> Differentiate, as you have done.

Good thank you!

>>     - This sysctl will only block new sockets to be created. Is it enough?
>>     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
>>       maybe too generic? ENOTSUPP is not translated by perror().
>>     - Should we start the documentation now for the sysctl?
>>     - A simple test has been added, because it is not linked to the rest, I
>>       put separeted as first test.
>>     - Of course do not hesitate to comment. Even on the idea of having a
>>       sysctl for this purpose.
>>
>>  net/mptcp/Makefile                            |   2 +-
>>  net/mptcp/ctrl.c                              | 109 ++++++++++++++++++
>>  net/mptcp/protocol.c                          |  12 +-
>>  net/mptcp/protocol.h                          |   4 +
>>  .../selftests/net/mptcp/mptcp_connect.sh      |  24 ++++
>>  5 files changed, 148 insertions(+), 3 deletions(-)
>>  create mode 100644 net/mptcp/ctrl.c
>>
>> diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile
>> index 7fe7aa64eda0..289fdf4339c1 100644
>> --- a/net/mptcp/Makefile
>> +++ b/net/mptcp/Makefile
>> @@ -1,4 +1,4 @@
>>  # SPDX-License-Identifier: GPL-2.0
>>  obj-$(CONFIG_MPTCP) += mptcp.o
>>  
>> -mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o
>> +mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o ctrl.o
>> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
> 
> The prefix 'ctl' is used in this file a bunch, so maybe ctl.c is a better
> name, or even 'sysctl.c'. That it is named this is mptcp.org version isn't
> much of a reason IMHO. :)

My goal was to use this new file for everything not directly linked to
the protocol but to change different behaviours: sysctl, sockopts,
selection of the PMs or the packet schedulers, etc.

What do you think? But we can also create different files if you think
that would be better.

>> new file mode 100644
>> index 000000000000..4c9a6a2cfeb3
>> --- /dev/null
>> +++ b/net/mptcp/ctrl.c

(...)

>> +
>> +static struct pernet_operations mptcp_pernet_ops = {
>> +	.init = mptcp_net_init,
>> +	.exit = mptcp_net_exit,
>> +	.id = &mptcp_pernet_id,
>> +	.size = sizeof(struct mptcp_pernet),
>> +};
>> +
>> +void __init mptcp_init(void)
> 
> I would prefer that the mptcp_init() in protocol.c remain the top-level init
> routine and that it call a sub-level routine in this file the same as other
> files in "module". When TCP is calling mptcp_init() it is registering
> protocols, so the protocol routine should be at the top I think.

Sure I can do that.

I created this new file to be the central point to control the different
parts in MPTCP (sysctl, sockopt, PM selection, etc.) including the
protocol. But if we think it is better to init the protocol part, I am
fine with that.

> 
>> +{
>> +	mptcp_proto_init();
>> +
>> +	if (register_pernet_subsys(&mptcp_pernet_ops) < 0)
>> +		panic("Failed to register MPTCP pernet subsystem.\n");
>> +}
>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>> index 774ed25d3b6d..18399cb63f35 100644
>> --- a/net/mptcp/protocol.c
>> +++ b/net/mptcp/protocol.c
>> @@ -559,6 +559,14 @@ static int mptcp_init_sock(struct sock *sk)
>>  	return 0;
>>  }
>>  
>> +static int mptcp_init_sock_cb(struct sock *sk)
>> +{
>> +	if (!mptcp_is_enabled(sock_net(sk)))
>> +		return -ENOPROTOOPT;
> 
> Can we just add this check to mptcp_init_sock() and not add the extra routine?

I created a new routine because mptcp_init_sock() is also called from
mptcp_accept(). The (initial) goal of this sysctl is not to allow
creation of new socket and not alter existing ones. We can also change
the definition if needed.

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-16 23:12 Peter Krystad
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Krystad @ 2019-07-16 23:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 10577 bytes --]


Thanks for this patch Matthieu, I have a few tardy comments.

As a general comment another approach would be to use this enabled flag to
control whether MP_CAPABLE options are sent when establishing/accepting an
MPTCP connection. You can see my "/* @@ if MTPCP enabled */" comments where
subflow->request_mptcp is being set in protocol.c. This would be the similiar
behaviour to the mptcp.org version, you just get a (inefficient) TCP socket
when requesting MPTCP. You get the benefit of not surprising applications
coded to use MPTCP at the drawback of some security exposure, as the new MPTCP
fallback layer would still be used. But this is a pretty thin layer.

See my comments to this specific patch below.

On Thu, 2019-07-11 at 21:00 +0200, Matthieu Baerts wrote:
> New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled
> for the current net namespace.
> 
> For security reasons, it is interesting to have a global switch for
> MPTCP. To start, MPTCP will be disabled by default and only privileged
> users will be able to modify this. The reason is that because MPTCP is
> new, it will not be tested and reviewed by many and security issues can
> then take time to be discovered and fixed.
> 
> The value of this new sysctl can be different per namespace. We can then
> restrict the usage of MPTCP to the selected NS. In case of serious
> issues with MPTCP, administrators can now easily turn MPTCP off.
> 
> MPTCP' kselftest has been modified to validate that the correct error is
> reported when creating a socket while MPTCP support is disabled.
> 
> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
> ---
> 
> Notes:
>     As just discussed at the meeting we just had, here is a RFC patch to
>     add a new sysctl for MPTCP.
>     
>     Because it is not linked to the protocol itself, I simply created a new
>     file, ctrl.c like in mptcp.org.
>     
>     A few questions:
>     - Is it OK to reserve space per ns via the "pernet_operations"
>       structure? Because MPTCP would not be compiled as a module, we could
>       directly store stuff in the net structure as other parts of the code
>       do but maybe better to keep MPTCP code on the side as done here.
>     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
>       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
>       same names if possible or better to differentiate? In this version,
>       'mptcp_' is not prepended.

Differentiate, as you have done.

>     - This sysctl will only block new sockets to be created. Is it enough?
>     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
>       maybe too generic? ENOTSUPP is not translated by perror().
>     - Should we start the documentation now for the sysctl?
>     - A simple test has been added, because it is not linked to the rest, I
>       put separeted as first test.
>     - Of course do not hesitate to comment. Even on the idea of having a
>       sysctl for this purpose.
> 
>  net/mptcp/Makefile                            |   2 +-
>  net/mptcp/ctrl.c                              | 109 ++++++++++++++++++
>  net/mptcp/protocol.c                          |  12 +-
>  net/mptcp/protocol.h                          |   4 +
>  .../selftests/net/mptcp/mptcp_connect.sh      |  24 ++++
>  5 files changed, 148 insertions(+), 3 deletions(-)
>  create mode 100644 net/mptcp/ctrl.c
> 
> diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile
> index 7fe7aa64eda0..289fdf4339c1 100644
> --- a/net/mptcp/Makefile
> +++ b/net/mptcp/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_MPTCP) += mptcp.o
>  
> -mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o
> +mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o ctrl.o
> diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c

The prefix 'ctl' is used in this file a bunch, so maybe ctl.c is a better
name, or even 'sysctl.c'. That it is named this is mptcp.org version isn't
much of a reason IMHO. :)

> new file mode 100644
> index 000000000000..4c9a6a2cfeb3
> --- /dev/null
> +++ b/net/mptcp/ctrl.c
> @@ -0,0 +1,109 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Multipath TCP
> + *
> + * Copyright (c) 2019, Tessares SA.
> + */
> +
> +#include <linux/sysctl.h>
> +
> +#include <net/net_namespace.h>
> +#include <net/netns/generic.h>
> +
> +#include "protocol.h"
> +
> +#define MPTCP_SYSCTL_PATH "net/mptcp"
> +
> +static int mptcp_pernet_id;
> +struct mptcp_pernet {
> +	struct ctl_table_header *ctl_table_hdr;
> +
> +	int mptcp_enabled;
> +};
> +
> +static struct mptcp_pernet *mptcp_get_pernet(struct net *net)
> +{
> +	return net_generic(net, mptcp_pernet_id);
> +}
> +
> +int mptcp_is_enabled(struct net *net)
> +{
> +	return mptcp_get_pernet(net)->mptcp_enabled;
> +}
> +
> +static struct ctl_table mptcp_sysctl_table[] = {
> +	{
> +		.procname = "enabled",
> +		.maxlen = sizeof(int),
> +		.mode = 0644,
> +		.proc_handler = proc_dointvec,
> +	},
> +	{}
> +};
> +
> +static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
> +{
> +	struct ctl_table_header *hdr;
> +	struct ctl_table *table;
> +
> +	table = mptcp_sysctl_table;
> +	if (!net_eq(net, &init_net)) {
> +		table = kmemdup(table, sizeof(mptcp_sysctl_table), GFP_KERNEL);
> +		if (!table)
> +			goto err_alloc;
> +	}
> +
> +	table[0].data = &pernet->mptcp_enabled;
> +
> +	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
> +	if (!hdr)
> +		goto err_reg;
> +
> +	pernet->ctl_table_hdr = hdr;
> +
> +	return 0;
> +
> +err_reg:
> +	if (!net_eq(net, &init_net))
> +		kfree(table);
> +err_alloc:
> +	return -ENOMEM;
> +}
> +
> +static void mptcp_pernet_del_table(struct mptcp_pernet *pernet)
> +{
> +	struct ctl_table *table = pernet->ctl_table_hdr->ctl_table_arg;
> +
> +	unregister_net_sysctl_table(pernet->ctl_table_hdr);
> +
> +	kfree(table);
> +}
> +
> +static int __net_init mptcp_net_init(struct net *net)
> +{
> +	struct mptcp_pernet *pernet = mptcp_get_pernet(net);
> +
> +	return mptcp_pernet_new_table(net, pernet);
> +}
> +
> +/* Note: the callback will only be called per extra netns */
> +static void __net_exit mptcp_net_exit(struct net *net)
> +{
> +	struct mptcp_pernet *pernet = mptcp_get_pernet(net);
> +
> +	mptcp_pernet_del_table(pernet);
> +}
> +
> +static struct pernet_operations mptcp_pernet_ops = {
> +	.init = mptcp_net_init,
> +	.exit = mptcp_net_exit,
> +	.id = &mptcp_pernet_id,
> +	.size = sizeof(struct mptcp_pernet),
> +};
> +
> +void __init mptcp_init(void)

I would prefer that the mptcp_init() in protocol.c remain the top-level init
routine and that it call a sub-level routine in this file the same as other
files in "module". When TCP is calling mptcp_init() it is registering
protocols, so the protocol routine should be at the top I think.

> +{
> +	mptcp_proto_init();
> +
> +	if (register_pernet_subsys(&mptcp_pernet_ops) < 0)
> +		panic("Failed to register MPTCP pernet subsystem.\n");
> +}
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 774ed25d3b6d..18399cb63f35 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -559,6 +559,14 @@ static int mptcp_init_sock(struct sock *sk)
>  	return 0;
>  }
>  
> +static int mptcp_init_sock_cb(struct sock *sk)
> +{
> +	if (!mptcp_is_enabled(sock_net(sk)))
> +		return -ENOPROTOOPT;

Can we just add this check to mptcp_init_sock() and not add the extra routine?

Peter.

> +
> +	return mptcp_init_sock(sk);
> +}
> +
>  static void mptcp_close(struct sock *sk, long timeout)
>  {
>  	struct mptcp_sock *msk = mptcp_sk(sk);
> @@ -801,7 +809,7 @@ bool mptcp_sk_is_subflow(const struct sock *sk)
>  static struct proto mptcp_prot = {
>  	.name		= "MPTCP",
>  	.owner		= THIS_MODULE,
> -	.init		= mptcp_init_sock,
> +	.init		= mptcp_init_sock_cb,
>  	.close		= mptcp_close,
>  	.accept		= mptcp_accept,
>  	.setsockopt	= mptcp_setsockopt,
> @@ -993,7 +1001,7 @@ static struct inet_protosw mptcp_protosw = {
>  	.flags		= INET_PROTOSW_ICSK,
>  };
>  
> -void __init mptcp_init(void)
> +void mptcp_proto_init(void)
>  {
>  	mptcp_prot.h.hashinfo = tcp_prot.h.hashinfo;
>  	mptcp_stream_ops = inet_stream_ops;
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 7f15f6aab93d..715ce80d0ae1 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -183,11 +183,15 @@ mptcp_subflow_tcp_socket(const struct subflow_context *subflow)
>  	return subflow->tcp_sock;
>  }
>  
> +int mptcp_is_enabled(struct net *net);
> +
>  void subflow_init(void);
>  int subflow_create_socket(struct sock *sk, struct socket **new_sock);
>  
>  extern const struct inet_connection_sock_af_ops ipv4_specific;
>  
> +void mptcp_proto_init(void);
> +
>  void mptcp_get_options(const struct sk_buff *skb,
>  		       struct tcp_options_received *opt_rx);
>  
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
> index 4418163af001..d029bdc5946d 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
> @@ -45,6 +45,7 @@ trap cleanup EXIT
>  for i in 1 2 3 4;do
>  	ip netns add ns$i || exit $ksft_skip
>  	ip -net ns$i link set lo up
> +	ip netns exec ns$i sysctl -q net.mptcp.enabled=1
>  done
>  
>  #  ns1              ns2                    ns3                     ns4
> @@ -106,6 +107,27 @@ check_transfer()
>  	return 0
>  }
>  
> +check_mptcp_disabled()
> +{
> +	disabled_ns="ns_disabled"
> +	ip netns add ${disabled_ns} || exit $ksft_skip
> +	# by default: sysctl net.mptcp.enabled=0
> +
> +	local err=0
> +	LANG=C ip netns exec ${disabled_ns} ./mptcp_connect -t $timeout -p 10000 -s MPTCP 127.0.0.1 < "$cin" 2>&1 | \
> +		grep -q "^socket: Protocol not available$" && err=1
> +	ip netns delete ${disabled_ns}
> +
> +	if [ ${err} -eq 0 ]; then
> +		echo -e "MPTCP is not disabled by default as expected\t[ FAIL ]"
> +		ret=1
> +		return 1
> +	fi
> +
> +	echo -e "MPTCP is disabled by default as expected\t[ OK ]"
> +	return 0
> +}
> +
>  do_ping()
>  {
>  	listener_ns="$1"
> @@ -241,6 +263,8 @@ run_tests()
>  make_file "$cin" "client"
>  make_file "$sin" "server"
>  
> +check_mptcp_disabled
> +
>  for sender in 1 2 3 4;do
>  	do_ping ns1 ns$sender 10.0.1.1
>  


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-16 17:45 Othman, Ossama
  0 siblings, 0 replies; 12+ messages in thread
From: Othman, Ossama @ 2019-07-16 17:45 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2893 bytes --]

Hi Matthieu,

On Tue, Jul 16, 2019 at 8:51 AM Matthieu Baerts
<matthieu.baerts(a)tessares.net> wrote:
> On 15/07/2019 22:18, Othman, Ossama wrote:
> > My point was that it also introduces a way for compromised privileged
> > processes to potentially (and possibly inadvertently) disable MPTCP at
> > run-time without involving system administrators.  Admittedly, there
> > doesn't appear to be much reason for such compromised processes to do
> > that.   At the very least, I think it might be useful to verify that
> > the process in question has the CAP_NET_ADMIN capability when
> > modifying this variable.
>
> With the current version, I think that this compromised privileged
> process would need the root/full permissions to modify this new sysctl.
> It would need more than one capability I think, no?
> If it is "root", it can then also block MPTCP with NF/IPTables rules,
> remove files, burn the device, become the president of the US, etc.
> Maybe a process with CAP_NET_ADMIN can already block MPTCP sockets.
>
> Maybe I am wrong but I thought that using capabilities was a way to give
> extra permissions to a process to do a bit more (e.g. to create a RAW
> socket) without requiring the full/root rights. To change most sysctl
> (except the ones that check for some specific capabilities), you need
> the full rights, no?

I think it's more accurate to say that the effective user ID needs to
be root in those cases, regardless of which capabilities are granted
or removed.  Since capabilities are not checked in the kernel for
those sysctl variables, a root process without full capabilities could
conceivably alter them, at least as I understand it.  Please correct
me if I am wrong.

Yes, capabilities can be used to grant specific privileges to a
non-root process.  However, capabilities can also be used to restrict
what a root process can do.  This is evident in "pscap" command
output:

$ pscap
ppid  pid   name        command           capabilities
1     220   root        systemd-journal   chown, dac_override,
dac_read_search, fowner, setgid, setuid, sys_ptrace, sys_admin,
audit_control, mac_override, syslog, audit_read
1     261   root        systemd-udevd     full
1     513   root        haveged           sys_admin
1     521   mptcp       mptcpd            net_admin
1     532   root        accounts-daemon   full
...

Only some of the root processes listed above have "full" capabilties.

> And if you have the full rights, you have the
> CAP_NET_ADMIN capability, no?

Yes.  You can't do anything about processes with full capabilities but
you can restrict sysctl variable write access to root processes that
do not have the desired capabilities.

> Is it possible to have "root" processes without CAP_NET_ADMIN
> capability? Or maybe somehow with SELinux?

Sure.  See above.  :)

HTH,
-Ossama

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-16 15:51 Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2019-07-16 15:51 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 4230 bytes --]

Hi Ossama,

On 15/07/2019 22:18, Othman, Ossama wrote:
> Hi Matthieu,
> 
> On Mon, Jul 15, 2019 at 11:44 AM Matthieu Baerts
> <matthieu.baerts(a)tessares.net> wrote:
>> On 15/07/2019 20:17, Othman, Ossama wrote:
>>> Hi Matthieu,
>>>
>>> On Thu, Jul 11, 2019 at 12:01 PM Matthieu Baerts
>>> <matthieu.baerts(a)tessares.net> wrote:
>>>> For security reasons, it is interesting to have a global switch for
>>>> MPTCP. To start, MPTCP will be disabled by default and only privileged
>>>> users will be able to modify this. The reason is that because MPTCP is
>>>> new, it will not be tested and reviewed by many and security issues can
>>>> then take time to be discovered and fixed.
>>>
>>> Do we need to be concerned about this new sysctl variable being used
>>> to maliciously enable or disable MPTCP?
>>
>> I am sorry, I don't understand your question?
>>
>> The goal would be to add a new sysctl variable as a global "safeguard".
>> The application will continue to explicitly ask for MPTCP using
>> IPPROTO_MPTCP.
> 
> I understand the safeguard purpose of the net.mptcp.enabled sysctl
> variable.  It would indeed be useful as a means for a system
> administrator to disable MPTCP without a reboot in the event a
> security issue in MPTCP is identified.
> 
> My point was that it also introduces a way for compromised privileged
> processes to potentially (and possibly inadvertently) disable MPTCP at
> run-time without involving system administrators.  Admittedly, there
> doesn't appear to be much reason for such compromised processes to do
> that.   At the very least, I think it might be useful to verify that
> the process in question has the CAP_NET_ADMIN capability when
> modifying this variable.

With the current version, I think that this compromised privileged
process would need the root/full permissions to modify this new sysctl.
It would need more than one capability I think, no?
If it is "root", it can then also block MPTCP with NF/IPTables rules,
remove files, burn the device, become the president of the US, etc.
Maybe a process with CAP_NET_ADMIN can already block MPTCP sockets.

Maybe I am wrong but I thought that using capabilities was a way to give
extra permissions to a process to do a bit more (e.g. to create a RAW
socket) without requiring the full/root rights. To change most sysctl
(except the ones that check for some specific capabilities), you need
the full rights, no? And if you have the full rights, you have the
CAP_NET_ADMIN capability, no?

> Perhaps I'm being overly paranoid.  :)
> 
>>> According to the socket(2) man page, EINVAL might be the better errno:
>>>
>>>        EINVAL Unknown protocol, or protocol family not available.
>>
>> I didn't read the socket manpage, I should have done that :)
>>
>> I would prefer not to return EINVAL as it is already returned if the
>> protocol is unknown, e.g. if the kernel does not support MPTCP.
>> From an application point of view, I think it could be interesting to
>> know if the kernel supports MPTCP or if MPTCP is "not allowed". I would
>> like to say that this new error is reported not because "the protocol
>> family is not available" but somehow, "not allowed" (EPERM?) to use it
>> if you see what I mean :)
> 
> Yes, I see what you mean now. :)
> 
>> But maybe this is not needed? Or recommended to anyway return EINVAL.
> 
> Based on your intent, EPERM seems like a better fit, but either errno
> works for me.

Great! I can change to EPERM if it is not an issue!

>>> Should we prevent root processes that don't have the CAP_NET_ADMIN
>>> capability from setting net.mptcp.enabled, e.g. something like the
>>> following proc_dointvec() wrapper:
>>
>> I don't know what are the best practices for that. But I am happy to
>> change if needed.
> 
> I guess it really depends on how paranoid we want to be.

Is it possible to have "root" processes without CAP_NET_ADMIN
capability? Or maybe somehow with SELinux?

Cheers,
Matt

> 
> -Ossama
> 

-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-15 20:18 Othman, Ossama
  0 siblings, 0 replies; 12+ messages in thread
From: Othman, Ossama @ 2019-07-15 20:18 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2931 bytes --]

Hi Matthieu,

On Mon, Jul 15, 2019 at 11:44 AM Matthieu Baerts
<matthieu.baerts(a)tessares.net> wrote:
> On 15/07/2019 20:17, Othman, Ossama wrote:
> > Hi Matthieu,
> >
> > On Thu, Jul 11, 2019 at 12:01 PM Matthieu Baerts
> > <matthieu.baerts(a)tessares.net> wrote:
> >> For security reasons, it is interesting to have a global switch for
> >> MPTCP. To start, MPTCP will be disabled by default and only privileged
> >> users will be able to modify this. The reason is that because MPTCP is
> >> new, it will not be tested and reviewed by many and security issues can
> >> then take time to be discovered and fixed.
> >
> > Do we need to be concerned about this new sysctl variable being used
> > to maliciously enable or disable MPTCP?
>
> I am sorry, I don't understand your question?
>
> The goal would be to add a new sysctl variable as a global "safeguard".
> The application will continue to explicitly ask for MPTCP using
> IPPROTO_MPTCP.

I understand the safeguard purpose of the net.mptcp.enabled sysctl
variable.  It would indeed be useful as a means for a system
administrator to disable MPTCP without a reboot in the event a
security issue in MPTCP is identified.

My point was that it also introduces a way for compromised privileged
processes to potentially (and possibly inadvertently) disable MPTCP at
run-time without involving system administrators.  Admittedly, there
doesn't appear to be much reason for such compromised processes to do
that.   At the very least, I think it might be useful to verify that
the process in question has the CAP_NET_ADMIN capability when
modifying this variable.

Perhaps I'm being overly paranoid.  :)

> > According to the socket(2) man page, EINVAL might be the better errno:
> >
> >        EINVAL Unknown protocol, or protocol family not available.
>
> I didn't read the socket manpage, I should have done that :)
>
> I would prefer not to return EINVAL as it is already returned if the
> protocol is unknown, e.g. if the kernel does not support MPTCP.
> From an application point of view, I think it could be interesting to
> know if the kernel supports MPTCP or if MPTCP is "not allowed". I would
> like to say that this new error is reported not because "the protocol
> family is not available" but somehow, "not allowed" (EPERM?) to use it
> if you see what I mean :)

Yes, I see what you mean now. :)

> But maybe this is not needed? Or recommended to anyway return EINVAL.

Based on your intent, EPERM seems like a better fit, but either errno
works for me.

> > Should we prevent root processes that don't have the CAP_NET_ADMIN
> > capability from setting net.mptcp.enabled, e.g. something like the
> > following proc_dointvec() wrapper:
>
> I don't know what are the best practices for that. But I am happy to
> change if needed.

I guess it really depends on how paranoid we want to be.

-Ossama

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-15 18:44 Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2019-07-15 18:44 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 3820 bytes --]

Hi Ossama,

Thank you for your review!

On 15/07/2019 20:17, Othman, Ossama wrote:
> Hi Matthieu,
> 
> On Thu, Jul 11, 2019 at 12:01 PM Matthieu Baerts
> <matthieu.baerts(a)tessares.net> wrote:
>> For security reasons, it is interesting to have a global switch for
>> MPTCP. To start, MPTCP will be disabled by default and only privileged
>> users will be able to modify this. The reason is that because MPTCP is
>> new, it will not be tested and reviewed by many and security issues can
>> then take time to be discovered and fixed.
> 
> Do we need to be concerned about this new sysctl variable being used
> to maliciously enable or disable MPTCP?

I am sorry, I don't understand your question?

The goal would be to add a new sysctl variable as a global "safeguard".
The application will continue to explicitly ask for MPTCP using
IPPROTO_MPTCP.

> 
>>     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
>>       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
>>       same names if possible or better to differentiate? In this version,
>>       'mptcp_' is not prepended.
> 
> The "mptcpd_" prefix seems redundant since the variable is already
> part of the "net.mptcp" namespace.  If we're going to differentiate
> sysctl interfaces between the two kernels, now is a good time to do so
> given where we are in the upstream kernel development.

I agree with you, thank you for your feedback.

>>     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
>>       maybe too generic? ENOTSUPP is not translated by perror().
> 
> ENOPROTOOPT appears to be more relevant to socket options, e.g. the
> setsockopt(2) man page states the following:
> 
>        ENOPROTOOPT
>                 The option is unknown at the level indicated.
> 
> According to the socket(2) man page, EINVAL might be the better errno:
> 
>        EINVAL Unknown protocol, or protocol family not available.

I didn't read the socket manpage, I should have done that :)

I would prefer not to return EINVAL as it is already returned if the
protocol is unknown, e.g. if the kernel does not support MPTCP.
From an application point of view, I think it could be interesting to
know if the kernel supports MPTCP or if MPTCP is "not allowed". I would
like to say that this new error is reported not because "the protocol
family is not available" but somehow, "not allowed" (EPERM?) to use it
if you see what I mean :)
But maybe this is not needed? Or recommended to anyway return EINVAL.

>> +static struct ctl_table mptcp_sysctl_table[] = {
>> +       {
>> +               .procname = "enabled",
>> +               .maxlen = sizeof(int),
>> +               .mode = 0644,
>> +               .proc_handler = proc_dointvec,
>> +       },
>> +       {}
>> +};
> 
> Should we prevent root processes that don't have the CAP_NET_ADMIN
> capability from setting net.mptcp.enabled, e.g. something like the
> following proc_dointvec() wrapper:

I don't know what are the best practices for that. But I am happy to
change if needed.

> static int mptcp_proc_dointvec(struct ctl_table *table, int write,
>                                 void __user *buffer, size_t *lenp, loff_t *ppos)
> {
>         if (write && !capable(CAP_NET_ADMIN))
>                 return -EPERM;
> 
>         return proc_dointvec(table, write, buffer, lenp, ppos);
> }
> 
> I don't know if this works well for net namespaces, however.

We would need to use ns_capable(), we can pass extra data in the table
entry.

Cheers,
Matt

> 
> Thanks,
> -Ossama
> 

-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-15 18:17 Othman, Ossama
  0 siblings, 0 replies; 12+ messages in thread
From: Othman, Ossama @ 2019-07-15 18:17 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2411 bytes --]

Hi Matthieu,

On Thu, Jul 11, 2019 at 12:01 PM Matthieu Baerts
<matthieu.baerts(a)tessares.net> wrote:
> For security reasons, it is interesting to have a global switch for
> MPTCP. To start, MPTCP will be disabled by default and only privileged
> users will be able to modify this. The reason is that because MPTCP is
> new, it will not be tested and reviewed by many and security issues can
> then take time to be discovered and fixed.

Do we need to be concerned about this new sysctl variable being used
to maliciously enable or disable MPTCP?

>     - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
>       'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
>       same names if possible or better to differentiate? In this version,
>       'mptcp_' is not prepended.

The "mptcpd_" prefix seems redundant since the variable is already
part of the "net.mptcp" namespace.  If we're going to differentiate
sysctl interfaces between the two kernels, now is a good time to do so
given where we are in the upstream kernel development.

>     - ENOPROTOOPT is returned, maybe something else to return? EPERM is
>       maybe too generic? ENOTSUPP is not translated by perror().

ENOPROTOOPT appears to be more relevant to socket options, e.g. the
setsockopt(2) man page states the following:

       ENOPROTOOPT
                The option is unknown at the level indicated.

According to the socket(2) man page, EINVAL might be the better errno:

       EINVAL Unknown protocol, or protocol family not available.

> +static struct ctl_table mptcp_sysctl_table[] = {
> +       {
> +               .procname = "enabled",
> +               .maxlen = sizeof(int),
> +               .mode = 0644,
> +               .proc_handler = proc_dointvec,
> +       },
> +       {}
> +};

Should we prevent root processes that don't have the CAP_NET_ADMIN
capability from setting net.mptcp.enabled, e.g. something like the
following proc_dointvec() wrapper:

static int mptcp_proc_dointvec(struct ctl_table *table, int write,
                                void __user *buffer, size_t *lenp, loff_t *ppos)
{
        if (write && !capable(CAP_NET_ADMIN))
                return -EPERM;

        return proc_dointvec(table, write, buffer, lenp, ppos);
}

I don't know if this works well for net namespaces, however.

Thanks,
-Ossama

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS
@ 2019-07-11 19:00 Matthieu Baerts
  0 siblings, 0 replies; 12+ messages in thread
From: Matthieu Baerts @ 2019-07-11 19:00 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 8628 bytes --]

New MPTCP sockets will return -ENOPROTOOPT if MPTCP support is disabled
for the current net namespace.

For security reasons, it is interesting to have a global switch for
MPTCP. To start, MPTCP will be disabled by default and only privileged
users will be able to modify this. The reason is that because MPTCP is
new, it will not be tested and reviewed by many and security issues can
then take time to be discovered and fixed.

The value of this new sysctl can be different per namespace. We can then
restrict the usage of MPTCP to the selected NS. In case of serious
issues with MPTCP, administrators can now easily turn MPTCP off.

MPTCP' kselftest has been modified to validate that the correct error is
reported when creating a socket while MPTCP support is disabled.

Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---

Notes:
    As just discussed at the meeting we just had, here is a RFC patch to
    add a new sysctl for MPTCP.
    
    Because it is not linked to the protocol itself, I simply created a new
    file, ctrl.c like in mptcp.org.
    
    A few questions:
    - Is it OK to reserve space per ns via the "pernet_operations"
      structure? Because MPTCP would not be compiled as a module, we could
      directly store stuff in the net structure as other parts of the code
      do but maybe better to keep MPTCP code on the side as done here.
    - In mptcp.org, all sysctls are prepended with 'mptcp_', e.g.
      'net.mptcp.mptcp_enabled'. Do we need this? Is it better to keep the
      same names if possible or better to differentiate? In this version,
      'mptcp_' is not prepended.
    - This sysctl will only block new sockets to be created. Is it enough?
    - ENOPROTOOPT is returned, maybe something else to return? EPERM is
      maybe too generic? ENOTSUPP is not translated by perror().
    - Should we start the documentation now for the sysctl?
    - A simple test has been added, because it is not linked to the rest, I
      put separeted as first test.
    - Of course do not hesitate to comment. Even on the idea of having a
      sysctl for this purpose.

 net/mptcp/Makefile                            |   2 +-
 net/mptcp/ctrl.c                              | 109 ++++++++++++++++++
 net/mptcp/protocol.c                          |  12 +-
 net/mptcp/protocol.h                          |   4 +
 .../selftests/net/mptcp/mptcp_connect.sh      |  24 ++++
 5 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 net/mptcp/ctrl.c

diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile
index 7fe7aa64eda0..289fdf4339c1 100644
--- a/net/mptcp/Makefile
+++ b/net/mptcp/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_MPTCP) += mptcp.o
 
-mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o
+mptcp-y := protocol.o subflow.o options.o token.o crypto.o pm.o ctrl.o
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
new file mode 100644
index 000000000000..4c9a6a2cfeb3
--- /dev/null
+++ b/net/mptcp/ctrl.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Multipath TCP
+ *
+ * Copyright (c) 2019, Tessares SA.
+ */
+
+#include <linux/sysctl.h>
+
+#include <net/net_namespace.h>
+#include <net/netns/generic.h>
+
+#include "protocol.h"
+
+#define MPTCP_SYSCTL_PATH "net/mptcp"
+
+static int mptcp_pernet_id;
+struct mptcp_pernet {
+	struct ctl_table_header *ctl_table_hdr;
+
+	int mptcp_enabled;
+};
+
+static struct mptcp_pernet *mptcp_get_pernet(struct net *net)
+{
+	return net_generic(net, mptcp_pernet_id);
+}
+
+int mptcp_is_enabled(struct net *net)
+{
+	return mptcp_get_pernet(net)->mptcp_enabled;
+}
+
+static struct ctl_table mptcp_sysctl_table[] = {
+	{
+		.procname = "enabled",
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec,
+	},
+	{}
+};
+
+static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
+{
+	struct ctl_table_header *hdr;
+	struct ctl_table *table;
+
+	table = mptcp_sysctl_table;
+	if (!net_eq(net, &init_net)) {
+		table = kmemdup(table, sizeof(mptcp_sysctl_table), GFP_KERNEL);
+		if (!table)
+			goto err_alloc;
+	}
+
+	table[0].data = &pernet->mptcp_enabled;
+
+	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
+	if (!hdr)
+		goto err_reg;
+
+	pernet->ctl_table_hdr = hdr;
+
+	return 0;
+
+err_reg:
+	if (!net_eq(net, &init_net))
+		kfree(table);
+err_alloc:
+	return -ENOMEM;
+}
+
+static void mptcp_pernet_del_table(struct mptcp_pernet *pernet)
+{
+	struct ctl_table *table = pernet->ctl_table_hdr->ctl_table_arg;
+
+	unregister_net_sysctl_table(pernet->ctl_table_hdr);
+
+	kfree(table);
+}
+
+static int __net_init mptcp_net_init(struct net *net)
+{
+	struct mptcp_pernet *pernet = mptcp_get_pernet(net);
+
+	return mptcp_pernet_new_table(net, pernet);
+}
+
+/* Note: the callback will only be called per extra netns */
+static void __net_exit mptcp_net_exit(struct net *net)
+{
+	struct mptcp_pernet *pernet = mptcp_get_pernet(net);
+
+	mptcp_pernet_del_table(pernet);
+}
+
+static struct pernet_operations mptcp_pernet_ops = {
+	.init = mptcp_net_init,
+	.exit = mptcp_net_exit,
+	.id = &mptcp_pernet_id,
+	.size = sizeof(struct mptcp_pernet),
+};
+
+void __init mptcp_init(void)
+{
+	mptcp_proto_init();
+
+	if (register_pernet_subsys(&mptcp_pernet_ops) < 0)
+		panic("Failed to register MPTCP pernet subsystem.\n");
+}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 774ed25d3b6d..18399cb63f35 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -559,6 +559,14 @@ static int mptcp_init_sock(struct sock *sk)
 	return 0;
 }
 
+static int mptcp_init_sock_cb(struct sock *sk)
+{
+	if (!mptcp_is_enabled(sock_net(sk)))
+		return -ENOPROTOOPT;
+
+	return mptcp_init_sock(sk);
+}
+
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
@@ -801,7 +809,7 @@ bool mptcp_sk_is_subflow(const struct sock *sk)
 static struct proto mptcp_prot = {
 	.name		= "MPTCP",
 	.owner		= THIS_MODULE,
-	.init		= mptcp_init_sock,
+	.init		= mptcp_init_sock_cb,
 	.close		= mptcp_close,
 	.accept		= mptcp_accept,
 	.setsockopt	= mptcp_setsockopt,
@@ -993,7 +1001,7 @@ static struct inet_protosw mptcp_protosw = {
 	.flags		= INET_PROTOSW_ICSK,
 };
 
-void __init mptcp_init(void)
+void mptcp_proto_init(void)
 {
 	mptcp_prot.h.hashinfo = tcp_prot.h.hashinfo;
 	mptcp_stream_ops = inet_stream_ops;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 7f15f6aab93d..715ce80d0ae1 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -183,11 +183,15 @@ mptcp_subflow_tcp_socket(const struct subflow_context *subflow)
 	return subflow->tcp_sock;
 }
 
+int mptcp_is_enabled(struct net *net);
+
 void subflow_init(void);
 int subflow_create_socket(struct sock *sk, struct socket **new_sock);
 
 extern const struct inet_connection_sock_af_ops ipv4_specific;
 
+void mptcp_proto_init(void);
+
 void mptcp_get_options(const struct sk_buff *skb,
 		       struct tcp_options_received *opt_rx);
 
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 4418163af001..d029bdc5946d 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -45,6 +45,7 @@ trap cleanup EXIT
 for i in 1 2 3 4;do
 	ip netns add ns$i || exit $ksft_skip
 	ip -net ns$i link set lo up
+	ip netns exec ns$i sysctl -q net.mptcp.enabled=1
 done
 
 #  ns1              ns2                    ns3                     ns4
@@ -106,6 +107,27 @@ check_transfer()
 	return 0
 }
 
+check_mptcp_disabled()
+{
+	disabled_ns="ns_disabled"
+	ip netns add ${disabled_ns} || exit $ksft_skip
+	# by default: sysctl net.mptcp.enabled=0
+
+	local err=0
+	LANG=C ip netns exec ${disabled_ns} ./mptcp_connect -t $timeout -p 10000 -s MPTCP 127.0.0.1 < "$cin" 2>&1 | \
+		grep -q "^socket: Protocol not available$" && err=1
+	ip netns delete ${disabled_ns}
+
+	if [ ${err} -eq 0 ]; then
+		echo -e "MPTCP is not disabled by default as expected\t[ FAIL ]"
+		ret=1
+		return 1
+	fi
+
+	echo -e "MPTCP is disabled by default as expected\t[ OK ]"
+	return 0
+}
+
 do_ping()
 {
 	listener_ns="$1"
@@ -241,6 +263,8 @@ run_tests()
 make_file "$cin" "client"
 make_file "$sin" "server"
 
+check_mptcp_disabled
+
 for sender in 1 2 3 4;do
 	do_ping ns1 ns$sender 10.0.1.1
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-07-31 16:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-18  9:57 [MPTCP] [RFC PATCH] mptcp: new sysctl to control the activation per NS Matthieu Baerts
  -- strict thread matches above, loose matches on Subject: below --
2019-07-31 16:04 Peter Krystad
2019-07-18  6:37 Matthieu Baerts
2019-07-17 22:42 Peter Krystad
2019-07-17 21:08 Matthieu Baerts
2019-07-16 23:12 Peter Krystad
2019-07-16 17:45 Othman, Ossama
2019-07-16 15:51 Matthieu Baerts
2019-07-15 20:18 Othman, Ossama
2019-07-15 18:44 Matthieu Baerts
2019-07-15 18:17 Othman, Ossama
2019-07-11 19:00 Matthieu Baerts

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.